mirror of https://github.com/kuhyx/testsAndMisc.git synced 2026-07-04 18:03:07 +02:00


Krzysztof kuhy Rudnicki ee27d10fef Reduce per-file-ignores by fixing lint violations across codebase Fix ruff violations in ~15 source files and ~60+ test files to minimize per-file-ignores in pyproject.toml. Remaining ignores are justified with comments explaining why each suppression is necessary. Source fixes: FBT003 (keyword args), S310 (URL validation), SLF001 (private access), T201 (print→logging), C901 (complexity), E501 (line length), E402 (import order). Test fixes: SIM117 (combined with), FBT (boolean args), PERF203 (try in loop), S310/S607 (URLs/executables), E402/E501 (imports/lines), S108 (tmp paths), PLR0913 (too many args), ARG (unused args), ANN (type annotations), RUF059 (unused unpacked vars), PT019 (fixture naming). Remaining per-file-ignores (with justifications): - Tests: ARG, D, PLC0415, PLR2004, S101, SLF001 - music_gen sources: PLC0415 (heavy ML lazy imports) - moviepy_showcase: PLC0415 (circular dependency) - generate_images: PLR0913 (matplotlib helpers need many params) - praca_magisterska_video: E501, E402 (long paths, mpl.use)		2026-03-25 18:58:05 +01:00
tests	Reduce per-file-ignores by fixing lint violations across codebase	2026-03-25 18:58:05 +01:00
__init__.py	Add local AI music generator using Meta's MusicGen	2025-12-04 20:43:44 +01:00
_music_generation.py	fix: resolve all pre-commit hook failures after file splits	2026-03-18 22:20:05 +01:00
_music_speech.py	fix: resolve all pre-commit hook failures after file splits	2026-03-18 22:20:05 +01:00
music_generator.py	Reduce per-file-ignores by fixing lint violations across codebase	2026-03-25 18:58:05 +01:00
README.md	music_gen: add segmented generation, Bark vocals, and song mixing	2025-12-04 21:26:52 +01:00
run.sh	feat: added run sh and makefile scripts	2026-02-22 22:00:50 +01:00
setup.sh	Add local AI music generator using Meta's MusicGen	2025-12-04 20:43:44 +01:00

README.md

MusicGen - Local AI Music & Speech Generator

Generate music and speech/vocals from text prompts using Meta's MusicGen and Suno's Bark.

Features

Music Generation: Create instrumental music from text descriptions (MusicGen)
Long Audio Support: Generate music of any length via automatic segmentation with crossfading
Speech/Vocals: Generate speech and singing with Bark (optional)
CUDA Optimized: Auto-detects GPU and selects best model for your VRAM
No API Keys: Runs 100% locally on your hardware
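
The segmentation-with-crossfading idea behind Long Audio Support can be sketched roughly as below. This is an illustration only, not the code in _music_generation.py: the name crossfade_concat, the linear fade curve, and the use of plain Python lists (real audio would more likely be NumPy arrays or tensors) are all assumptions.

```python
def crossfade_concat(segments, overlap):
    """Join audio segments, blending up to `overlap` samples at each seam.

    Hypothetical helper: a linear crossfade over plain lists of floats.
    The project may use a different fade curve or overlap length.
    """
    if overlap < 0:
        raise ValueError("overlap must be non-negative")
    out = []
    for seg in segments:
        seg = list(seg)
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(seg))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming segment
            out[start + i] = out[start + i] * (1.0 - w) + seg[i] * w
        # Append the part of the segment that was not blended.
        out.extend(seg[n:])
    return out


if __name__ == "__main__":
    joined = crossfade_concat([[0.0] * 4, [1.0] * 4], overlap=2)
    print(len(joined), joined)
```

Each seam shortens the total length by the overlap, so a generator built this way would need to produce slightly more audio per segment than the target duration divided by the segment count.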

Quick Start

# 1. Run the setup script (creates venv, installs dependencies)
cd python_pkg/music_gen
./setup.sh

# 2. Activate the virtual environment
source venv/bin/activate

# 3. Generate music!
python music_generator.py "upbeat electronic dance music with synths"

Usage

Music Generation (MusicGen)

# Basic usage
python music_generator.py "jazz piano with soft drums"

# Set duration in seconds (any length supported via segmentation)
python music_generator.py --duration 60 "epic orchestral soundtrack"

# Generate a full 3-minute track
python music_generator.py --duration 180 "ambient electronic music"

# Use smaller/faster model
python music_generator.py --model small "rock guitar riff"

# Use larger/better quality model (needs 12GB+ VRAM)
python music_generator.py --model large "ambient electronic"

Speech/Vocals Generation (Bark)

# First install Bark (not included in base setup)
pip install git+https://github.com/suno-ai/bark.git

# Generate speech
python music_generator.py --speech "Hello, how are you today?"

# Use different voice
python music_generator.py --speech --voice v2/en_speaker_3 "Welcome!"

# Generate singing
python music_generator.py --speech "♪ La la la, I love to sing ♪"

# With laughter and expression
python music_generator.py --speech "That's so funny! [laughter] I can't believe it."

Bark special tokens:

[laughter], [laughs], [sighs], [gasps] - expressions
[music], [clears throat] - sounds
♪ - singing
... or — - hesitations

Available voices: v2/en_speaker_0 through v2/en_speaker_9
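
The voice range above can be enumerated and checked before a value is passed to --voice. A small sketch; BARK_VOICES and validate_voice are hypothetical names, not part of music_generator.py or Bark.

```python
# Voice presets listed in this README: v2/en_speaker_0 .. v2/en_speaker_9.
BARK_VOICES = [f"v2/en_speaker_{i}" for i in range(10)]


def validate_voice(name):
    """Return `name` if it is one of the listed presets, else raise ValueError."""
    if name not in BARK_VOICES:
        raise ValueError(f"unknown voice {name!r}; expected one of {BARK_VOICES}")
    return name


if __name__ == "__main__":
    print(validate_voice("v2/en_speaker_3"))
```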

Interactive Mode

python music_generator.py --interactive

In interactive mode:

Type prompts to generate music
:d 15 - Set duration to 15 seconds
:h - Show example prompts
:q - Quit
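
The commands above could be dispatched along these lines. This is a sketch under assumptions: parse_interactive_line and its action names are hypothetical, and the actual handling of malformed input in music_generator.py may differ.

```python
def parse_interactive_line(line):
    """Map one line of interactive input to an (action, payload) pair.

    Hypothetical parser for the commands listed in this README:
    ":d N" sets the duration, ":h" shows examples, ":q" quits,
    and anything else is treated as a music prompt.
    """
    text = line.strip()
    if not text:
        return ("noop", None)
    if not text.startswith(":"):
        return ("generate", text)
    parts = text.split()
    if parts == [":q"]:
        return ("quit", None)
    if parts == [":h"]:
        return ("help", None)
    if len(parts) == 2 and parts[0] == ":d" and parts[1].isdigit() and int(parts[1]) > 0:
        return ("set_duration", int(parts[1]))
    return ("error", f"unrecognized command: {text}")


if __name__ == "__main__":
    for example in [":d 15", "jazz piano with soft drums", ":h", ":q"]:
        print(example, "->", parse_interactive_line(example))
```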

Model Sizes (Auto-Selected by VRAM)

Model	Size	VRAM	Quality	Speed
small	~500MB	3GB+	Good	Fast
medium	~3.3GB	8GB+	Better	Medium
large	~6.5GB	12GB+	Best	Slow
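
Auto-selection from the table can be pictured as a threshold lookup. The sketch below assumes the VRAM column is used as a hard cutoff and that systems below 3GB fall back to small; pick_model is a hypothetical name, and the real selection logic may differ. On an NVIDIA system the available VRAM could be read with torch.cuda.get_device_properties(0).total_memory, which is left out here to keep the example dependency-free.

```python
# (model name, minimum VRAM in GB), largest first, taken from the table above.
MODEL_MIN_VRAM_GB = [("large", 12), ("medium", 8), ("small", 3)]


def pick_model(vram_gb):
    """Return the largest model whose VRAM threshold fits in `vram_gb`."""
    for name, min_gb in MODEL_MIN_VRAM_GB:
        if vram_gb >= min_gb:
            return name
    # Assumption: below the smallest threshold, still try the smallest model.
    return "small"


if __name__ == "__main__":
    for gb in (4, 8, 24):
        print(gb, "GB ->", pick_model(gb))
```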

Requirements

Python 3.10+
NVIDIA GPU with CUDA (on NVIDIA systems a CUDA-enabled PyTorch build is required)
Apple Silicon supported via MPS
8GB+ VRAM recommended for best results

Output

Generated audio files are saved to ./output/ as WAV files with timestamps.
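
A timestamped WAV file can be written roughly as follows. The filename pattern, the music prefix, the mono 16-bit format, and the 32000 Hz sample rate are assumptions made for illustration, and the standard-library wave module is used here only to keep the sketch self-contained; the project itself appears to rely on scipy for writing audio.

```python
import struct
import wave
from datetime import datetime
from pathlib import Path


def timestamped_path(output_dir="output", prefix="music", now=None):
    """Build a path such as output/music_20260102_030405.wav (assumed pattern)."""
    now = now or datetime.now()
    return Path(output_dir) / f"{prefix}_{now:%Y%m%d_%H%M%S}.wav"


def write_wav(path, samples, sample_rate=32000):
    """Write floats in [-1.0, 1.0] as mono 16-bit PCM and return the path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    frames = b"".join(
        # Clamp to the valid range, then scale to signed 16-bit.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(frames)
    return path


if __name__ == "__main__":
    print(write_wav(timestamped_path(), [0.0, 0.5, -0.5]))
```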

Example Prompts

"upbeat electronic dance music with heavy bass"
"calm acoustic guitar melody with soft percussion"
"epic orchestral soundtrack with dramatic strings"
"lo-fi hip hop beats for studying"
"80s synthwave with retro vibes"
"jazz piano trio with upright bass"
"ambient electronic music for relaxation"
"rock guitar riff with drums"
"classical piano sonata in minor key"

Troubleshooting

Out of Memory

Try --model small for lower VRAM usage
Reduce duration with --duration 10
Close other GPU applications

Slow Generation

Make sure the GPU is detected (check the output at startup)
Use --model small for faster generation
Reduce duration

No Sound / Corrupted File

Make sure scipy is installed: pip install scipy
Try a different audio player (VLC recommended)

CUDA Not Available

If you see "NVIDIA GPU detected but CUDA is not available":

pip install torch --index-url https://download.pytorch.org/whl/cu121
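
After reinstalling, a quick check like the one below shows which device PyTorch can see. It is a standalone sketch, not the detection code used by music_generator.py; describe_device is a hypothetical name, and the check degrades gracefully if torch is not installed.

```python
def describe_device():
    """Return a short description of the best device PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version the wheel was built against.
        return f"cuda: {torch.cuda.get_device_name(0)} (CUDA {torch.version.cuda})"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps (Apple Silicon)"
    return "cpu only"


if __name__ == "__main__":
    print(describe_device())
```

If this still prints "cpu only" on a machine with an NVIDIA GPU, the installed torch wheel is probably a CPU-only build or the driver is too old for the chosen CUDA version.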