# clean_audio.sh — automatic speech cleaning (FFmpeg)
This script batch-cleans noisy speech recordings with ffmpeg using simple, reliable filters tuned for ASR (e.g., faster-whisper). By default it REQUIRES RNNoise (arnndn) and will try to auto-discover or download a model. You can opt in to fallback filters with `--allow-fallback`.
## Install
- Required: ffmpeg. Install it from your distro's packages, e.g. `sudo pacman -S ffmpeg` (Arch) or `sudo apt install ffmpeg` (Debian/Ubuntu).
- Recommended: ffmpeg with `arnndn` filter and an RNNoise model file (e.g., from Mozilla RNNoise community models). The script will auto-detect common model locations or download one via `Bash/get_rnnoise_model.sh`. You can pass a specific model with `-m /path/to/model.nn`.
Make executable:
```bash
chmod +x Bash/clean_audio.sh
```
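To check up front whether the installed ffmpeg includes the `arnndn` filter, you can query its filter list. The sketch below is not part of `clean_audio.sh`: `has_ffmpeg_filter` is a hypothetical helper name, and it assumes the usual `ffmpeg -filters` layout in which the filter name is the second column.

```bash
# Hypothetical helper (not part of clean_audio.sh).
# Returns 0 if the named filter appears in `ffmpeg -filters`, non-zero otherwise.
# Assumes the filter name is the second whitespace-separated column.
has_ffmpeg_filter() {
  local name="$1"
  ffmpeg -hide_banner -filters 2>/dev/null |
    awk -v f="$name" '$2 == f { found = 1 } END { exit !found }'
}

# Example:
#   has_ffmpeg_filter arnndn && echo "arnndn available" || echo "arnndn missing"
```

If the check fails, the script's `--allow-fallback` option or a different ffmpeg build are the two routes this README describes.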
## Quick start
- Single file, default ASR preset (16k mono, denoise, highpass, limiter):
```bash
Bash/clean_audio.sh path/to/file.wav
```
This produces `path/to/file_clean.wav`.
- Whole folder, 4 parallel jobs, output to `cleaned/`:
```bash
Bash/clean_audio.sh path/to/folder -O cleaned -j 4
```
- Use an RNNoise model explicitly (if your ffmpeg has arnndn):
```bash
Bash/clean_audio.sh input.wav -m models/rnnoise_model.nn
```
If you omit `-m`, the script will look in common locations; if not found, it will attempt a download via `Bash/get_rnnoise_model.sh`.
Advanced options and compatibility:
- The cleaner requires RNNoise by default. To allow non-ML fallback filters (afftdn), add `--allow-fallback`.
- The script uses advanced filter settings when available (e.g., afftdn with `md`). If your ffmpeg build lacks these options, it will error with guidance. Add `--no-advanced` (or `--compat`) to avoid such params.
- Podcast preset (adds dynamics and loudness leveling):
```bash
Bash/clean_audio.sh input.wav --preset podcast
```
## Options
```text
Usage: clean_audio.sh <input-file|input-dir> [options]

Options:
  -O, --out-dir DIR   Output directory (default: alongside input file).
  -e, --ext EXT       Output extension/container: wav|flac (default: wav).
  -m, --model PATH    RNNoise model file for arnndn; falls back to afftdn if unavailable.
      --no-ml         Do not use arnndn even if model is provided; use afftdn.
      --preset NAME   asr (default) | podcast | aggressive
  -j, --jobs N        Parallel jobs for directory mode (default: 1).
  -f, --force         Overwrite outputs if they exist.
  -q, --quiet         Reduce ffmpeg logging noise.
      --lowpass FREQ  Optional low-pass cutoff (e.g., 8000). Disabled by default.
      --suffix SUF    Suffix for output basename (default: _clean).
```
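The `--out-dir`, `--suffix`, and `--ext` options together determine where each cleaned file lands. As a minimal sketch of that naming rule, assuming the script simply swaps the last extension and appends the suffix (the helper name `output_path` is hypothetical and not taken from the script):

```bash
# Hypothetical helper illustrating the documented naming rule.
# Usage: output_path INPUT [OUT_DIR] [SUFFIX] [EXT]
# Empty or missing arguments fall back to the documented defaults.
output_path() {
  local input="$1" out_dir="${2:-}" suffix="${3:-_clean}" ext="${4:-wav}"
  local dir base
  dir="$(dirname -- "$input")"
  base="$(basename -- "$input")"
  base="${base%.*}"   # strip the last extension, if any
  printf '%s/%s%s.%s\n' "${out_dir:-$dir}" "$base" "$suffix" "$ext"
}

output_path path/to/file.wav            # prints path/to/file_clean.wav
output_path input.wav cleaned "" flac   # prints cleaned/input_clean.flac
```

The real script may treat edge cases differently, such as inputs without an extension or name collisions in directory mode.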
## Designed for ASR (faster-whisper)
The default output format is mono, 16 kHz, 16-bit PCM WAV, which suits most Whisper/faster-whisper pipelines. You can feed the cleaned files directly into your transcription step.
If you prefer FLAC to save space without quality loss:
```bash
Bash/clean_audio.sh input.wav -e flac -O cleaned
```
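To confirm that a cleaned file really is 16 kHz mono before handing it to a transcription step, `ffprobe` (shipped with ffmpeg) can report the stream parameters. This is a sketch only: `is_asr_ready` is a hypothetical helper name, and it inspects just the first audio stream.

```bash
# Hypothetical helper (not part of clean_audio.sh).
# Returns 0 if the first audio stream is 16 kHz mono, 1 if not,
# and 2 if ffprobe itself fails (e.g., missing or unreadable file).
is_asr_ready() {
  local info
  info="$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=sample_rate,channels \
    -of default=noprint_wrappers=1 "$1")" || return 2
  printf '%s\n' "$info" | grep -qx 'sample_rate=16000' &&
    printf '%s\n' "$info" | grep -qx 'channels=1'
}

# Example:
#   is_asr_ready cleaned/input_clean.flac && echo "ready for ASR"
```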
## Presets
- asr (default): light, ASR-friendly cleanup; prevents clipping.
- podcast: adds gentle dynamics and approximate loudness normalization (single-pass `loudnorm`).
- aggressive: heavier gate/dynamics; can suppress more background noise, but may slightly hurt ASR accuracy, so use it sparingly.
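For orientation, here is roughly what an ASR-style chain could look like in plain ffmpeg. The script's actual filters, their order, and the limiter setting are not documented in this README, so everything below is an assumption apart from the 80 Hz high-pass default that appears as `HIGHPASS=80` under Troubleshooting. `asr_filter_chain` is a hypothetical helper name.

```bash
# Hypothetical helper: build an ffmpeg -af chain resembling the asr preset
# (high-pass, RNNoise denoise, limiter). This is NOT the script's actual chain.
# Usage: asr_filter_chain MODEL_PATH [HIGHPASS_HZ]
asr_filter_chain() {
  local model="$1" highpass="${2:-80}"
  # Model paths containing ':' or ',' would need filtergraph escaping.
  printf 'highpass=f=%s,arnndn=m=%s,alimiter=limit=0.95\n' "$highpass" "$model"
}

# Example invocation (requires ffmpeg with arnndn and a model file):
#   ffmpeg -i input.wav -af "$(asr_filter_chain models/rnnoise_model.nn)" \
#     -ac 1 -ar 16000 -c:a pcm_s16le input_clean.wav
```

The `-ac 1 -ar 16000 -c:a pcm_s16le` output options correspond to the mono, 16 kHz, 16-bit PCM default described above.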
## Tips
- If you hear artifacts from RNNoise, try running without it (`--no-ml`, which uses `afftdn`), or add a low-pass filter (e.g., `--lowpass 8000`).
- For extremely boomy bar recordings, raise the high-pass cutoff by editing `HIGHPASS` in the script, or add `--lowpass`.
- If your ffmpeg lacks `arnndn`, you can install a newer build or keep the fallback (afftdn works fine for many cases).
- If your ffmpeg is missing features, you can use the helper:
```bash
chmod +x Bash/install_ffmpeg_with_arnndn.sh
Bash/install_ffmpeg_with_arnndn.sh
```
It will suggest distro options or build FFmpeg from source with `--enable-librnnoise`.
RNNoise model downloader helper:
```bash
chmod +x Bash/get_rnnoise_model.sh
Bash/get_rnnoise_model.sh --yes
```
This saves a model into `Bash/models/` which the cleaner will auto-discover.
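Auto-discovery can be pictured as a search over a list of candidate directories. The sketch below assumes models use the `.nn` extension seen in the examples above; the helper name `find_rnnoise_model` and any directory other than `Bash/models` are hypothetical, and the script's real search list may differ.

```bash
# Hypothetical helper (not the script's actual discovery logic).
# Prints the first *.nn file found in the given directories, in order.
# Returns 1 if none of the directories contains a model.
find_rnnoise_model() {
  local dir file
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    for file in "$dir"/*.nn; do
      if [ -f "$file" ]; then
        printf '%s\n' "$file"
        return 0
      fi
    done
  done
  return 1
}

# Example ("$HOME/.local/share/rnnoise" is an illustrative location only):
#   model="$(find_rnnoise_model Bash/models "$HOME/.local/share/rnnoise")" &&
#     Bash/clean_audio.sh input.wav -m "$model"
```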
## Troubleshooting
- “arnndn not available”: Your ffmpeg wasn't built with it. Pass `--allow-fallback` to use `afftdn` instead, or install a build that includes `arnndn`.
- Output sounds thin: lower the high-pass cutoff (edit `HIGHPASS=80` in the script to `60`) or remove the low-pass.
- Level too low/high: choose the `podcast` preset for auto leveling, or add your own `loudnorm` in post.
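If you go the do-it-yourself route for leveling, a single-pass `loudnorm` run over an already cleaned file is one option. The targets below (-16 LUFS integrated, -1.5 dBTP true peak, loudness range 11) are illustrative assumptions, not values taken from the script, and `loudnorm_filter` is a hypothetical helper name.

```bash
# Hypothetical helper: build a single-pass loudnorm filter string.
# Usage: loudnorm_filter [INTEGRATED_LUFS] [TRUE_PEAK_DBTP] [LRA]
loudnorm_filter() {
  local lufs="${1:--16}" true_peak="${2:--1.5}" lra="${3:-11}"
  printf 'loudnorm=I=%s:TP=%s:LRA=%s\n' "$lufs" "$true_peak" "$lra"
}

# Example invocation (requires ffmpeg). Setting -ar explicitly keeps the
# output at 16 kHz, since loudnorm resamples internally:
#   ffmpeg -i input_clean.wav -af "$(loudnorm_filter -16)" \
#     -ar 16000 -ac 1 input_leveled.wav
```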
## License
This helper script is provided under the repository's LICENSE.