This script batch‑cleans noisy speech recordings with ffmpeg using simple, reliable filters tuned for ASR (e.g., faster‑whisper). By default it requires RNNoise (`arnndn`) and will try to auto‑discover or download a model. You can opt in to fallback filters with `--allow-fallback`.
## Install
- Required: ffmpeg. Install it with your package manager, e.g. `sudo pacman -S ffmpeg` (Arch) or `sudo apt install ffmpeg` (Debian/Ubuntu).
- Recommended: ffmpeg with `arnndn` filter and an RNNoise model file (e.g., from Mozilla RNNoise community models). The script will auto-detect common model locations or download one via `Bash/get_rnnoise_model.sh`. You can pass a specific model with `-m /path/to/model.nn`.
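To check up front whether your ffmpeg build includes `arnndn` (or the `afftdn` fallback), one option is to scan the output of `ffmpeg -filters`. This is a minimal sketch and not part of `clean_audio.sh`; the helper name `has_ffmpeg_filter` is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of clean_audio.sh): succeeds if the local
# ffmpeg build lists the given filter name in `ffmpeg -filters`.
has_ffmpeg_filter() {
  # Filter names sit in the second whitespace-separated column of the listing,
  # so words in the description column are never matched.
  ffmpeg -hide_banner -filters 2>/dev/null |
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

for f in arnndn afftdn; do
  if has_ffmpeg_filter "$f"; then
    echo "$f: available"
  else
    echo "$f: missing"
  fi
done
```

If ffmpeg is not installed at all, both filters are reported as missing.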
Make executable:
```bash
chmod +x Bash/clean_audio.sh
```
## Quick start
- Single file, default ASR preset (16 kHz mono, denoise, high‑pass, limiter):
- The cleaner requires RNNoise by default. To allow non-ML fallback filters (afftdn), add `--allow-fallback`.
- The script uses advanced filter settings when available (e.g., afftdn with `md`). If your ffmpeg build lacks these options, it will error with guidance. Add `--no-advanced` (or `--compat`) to avoid such params.
- Podcast preset (adds dynamics and loudness leveling):
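As a rough illustration of what the presets above might translate to, the sketch below assembles an ffmpeg audio filter chain for the `asr` and `podcast` presets. The chains, the parameter values, and the function name `build_filter_chain` are assumptions chosen for illustration; the actual chains used by `clean_audio.sh` are not shown here and may differ.

```bash
#!/usr/bin/env bash
# Illustrative only: not the script's real internals.
# build_filter_chain PRESET [MODEL]
#   PRESET: asr | podcast
#   MODEL:  path to an RNNoise model; empty means use the afftdn fallback.
build_filter_chain() {
  local preset="$1" model="${2:-}" denoise dynamics
  if [ -n "$model" ]; then
    denoise="arnndn=m=${model}"   # ML denoiser (RNNoise)
  else
    denoise="afftdn=nr=12"        # FFT denoiser; 12 dB reduction is an assumed value
  fi
  case "$preset" in
    asr)     dynamics="" ;;
    podcast) dynamics="acompressor,loudnorm=I=-16," ;;  # dynamics + loudness leveling
    *) echo "unknown preset: $preset" >&2; return 1 ;;
  esac
  # High-pass removes rumble first, then denoise, then a limiter catches peaks.
  printf 'highpass=f=80,%s,%salimiter=limit=0.95\n' "$denoise" "$dynamics"
}

build_filter_chain asr /path/to/model.nn
build_filter_chain podcast
```

The resulting string could be passed to `ffmpeg -af`. Model paths containing characters that are special in ffmpeg filtergraphs (such as `:` or `,`) would need escaping, which this sketch does not handle.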
## Options
```text
-m, --model PATH    RNNoise model file for arnndn; falls back to afftdn if unavailable.
    --no-ml         Do not use arnndn even if a model is provided; use afftdn.
    --preset NAME   asr (default) | podcast | aggressive
-j, --jobs N        Parallel jobs for directory mode (default: 1).
-f, --force         Overwrite outputs if they exist.
-q, --quiet         Reduce ffmpeg logging noise.
    --lowpass FREQ  Optional low-pass cutoff in Hz (e.g., 8000). Disabled by default.
    --suffix SUF    Suffix for output basename (default: _clean).
```
## Designed for ASR (faster‑whisper)
The default output format is mono, 16 kHz, 16‑bit PCM WAV, which suits most Whisper/faster‑whisper pipelines. You can feed the cleaned files directly into your transcription step.
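To confirm that a file really has this format before transcription, `ffprobe` can report the codec, sample rate, and channel count of its first audio stream. The helper below is a sketch; the name `is_asr_ready` is hypothetical and not part of the script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the first audio stream of the file
# is 16-bit PCM, 16 kHz, mono.
is_asr_ready() {
  local info want
  info=$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name,sample_rate,channels \
    -of default=noprint_wrappers=1 "$1" 2>/dev/null) || return 1
  # ffprobe prints one key=value pair per line; every expected pair must be present.
  for want in codec_name=pcm_s16le sample_rate=16000 channels=1; do
    printf '%s\n' "$info" | grep -qxF "$want" || return 1
  done
}

# Usage: is_asr_ready talk_clean.wav && echo "ready for transcription"
```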
If you prefer FLAC to save space without quality loss:
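The script's own FLAC switch is not shown here. As a stand-in, an already cleaned WAV can be converted with plain ffmpeg; the sketch below assumes that approach, and the function name `wav_to_flac` is made up for illustration.

```bash
#!/usr/bin/env bash
# Assumption: converts an already-cleaned WAV with plain ffmpeg rather than
# through clean_audio.sh. Writes NAME.flac next to NAME.wav.
wav_to_flac() {
  local src="$1" dst="${1%.*}.flac"
  # -n: never overwrite an existing file. FLAC is lossless, so the audio
  # samples are preserved exactly.
  ffmpeg -hide_banner -loglevel error -n -i "$src" -c:a flac "$dst"
}

# Usage: wav_to_flac talk_clean.wav   # writes talk_clean.flac
```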
- If your ffmpeg lacks `arnndn`, you can install a newer build or keep the fallback (afftdn works fine for many cases).
- If your ffmpeg is missing features, you can use the helper: