

Learn how to use the script to perform various operations with UVR.


To use the UVR CLI, navigate in your terminal to the directory containing the script and run it with the following syntax:

python <script> --audio_file <path/to/audio_file> [options]

Replace <script> with the name of the CLI script, <path/to/audio_file> with the path to the audio file you want to process, and [options] with any of the arguments listed below. For a detailed list of the arguments available for each mode, run:

python <script> <mode> -h

This will display a help message with explanations for each argument.


Info and Debugging

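| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| -d, --debug | Enable debug logging. Equivalent to --log_level=debug. | bool | False | No |
| -e, --env_info | Print environment information and exit. | bool | False | No |
| -l, --list_models | List all supported models and exit. | bool | False | No |
| --log_level | Log level, e.g. info, debug, warning (default: info). | str | info | No |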
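
The relationship between -d/--debug and --log_level can be illustrated with a short sketch. This is not the tool's actual parser: the function name resolve_log_level is hypothetical, and it assumes that --debug wins when both flags are given, which the reference does not state.

```python
import argparse
import logging


def resolve_log_level(argv):
    """Return the numeric logging level implied by the flags in argv."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-d", "--debug", action="store_true")
    parser.add_argument("--log_level", default="info")
    # Ignore every other flag; only the two logging flags matter here.
    args, _unknown = parser.parse_known_args(argv)

    # Assumption: -d/--debug takes precedence over --log_level.
    name = "debug" if args.debug else args.log_level
    level = getattr(logging, name.upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"unknown log level: {name}")
    return level
```

Under these assumptions, passing -d and passing --log_level=debug both resolve to logging.DEBUG, and omitting both resolves to logging.INFO.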

Separation I/O Params

| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| -m, --model_filename | Model to use for separation. Example: -m 2_HP-UVR.pth | str | model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt | No |
| --output_format | Output format for separated files, any common format (default: WAV). Example: --output_format=MP3 | str | WAV | No |
| --output_dir | Directory to write output files (default: <current dir>). Example: --output_dir=/app/separated | str | None | No |
| --model_file_dir | Model files directory (default: uvr/tmp/audio-separator-models/). Example: --model_file_dir=/app/models | str | uvr/tmp/audio-separator-models/ | No |

Common Separation Parameters

| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| --invert_spect | Invert secondary stem using spectrogram (default: False). Example: --invert_spect | bool | False | No |
| --normalization | Max peak amplitude to normalize input and output audio to (default: 0.9). Example: --normalization=0.7 | float | 0.9 | No |
| --single_stem | Output only single stem, e.g. Instrumental, Vocals, Drums, Bass, Guitar, Piano, Other. Example: --single_stem=Instrumental | str | None | No |
| --sample_rate | Modify the sample rate of the output audio (default: 44100). Example: --sample_rate=44100 | int | 44100 | No |
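
To make the --normalization value concrete, here is a minimal sketch of peak normalization on a list of float samples. It assumes the straightforward interpretation (scale the signal so its largest absolute sample equals the target); the tool's own implementation may differ, for example by only attenuating and never amplifying. The function name normalize_peak is hypothetical.

```python
def normalize_peak(samples, max_peak=0.9):
    """Scale samples so the largest absolute value equals max_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence or empty input: there is nothing to scale.
        return list(samples)
    gain = max_peak / peak
    return [s * gain for s in samples]
```

With the default of 0.9, a signal whose loudest sample is 1.0 is scaled by a factor of 0.9.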

MDX Architecture Parameters

| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| --mdx_segment_size | Larger consumes more resources, but may give better results (default: 256). Example: --mdx_segment_size=256 | int | 256 | No |
| --mdx_overlap | Amount of overlap between prediction windows, 0.001-0.999. Higher is better but slower (default: 0.25). Example: --mdx_overlap=0.25 | float | 0.25 | No |
| --mdx_batch_size | Larger consumes more RAM but may process slightly faster (default: 1). Example: --mdx_batch_size=4 | int | 1 | No |
| --mdx_hop_length | Usually called stride in neural networks; only change if you know what you're doing (default: 1024). Example: --mdx_hop_length=1024 | int | 1024 | No |
| --mdx_enable_denoise | Enable denoising during separation (default: False). Example: --mdx_enable_denoise | bool | False | No |

VR Architecture Parameters

| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| --vr_batch_size | Number of batches to process at a time. Higher = more RAM, slightly faster processing (default: 4). Example: --vr_batch_size=16 | int | 4 | No |
| --vr_window_size | Balance quality and speed. 1024 = fast but lower quality, 320 = slower but better quality (default: 512). Example: --vr_window_size=320 | int | 512 | No |
| --vr_aggression | Intensity of primary stem extraction, -100 to 100. Typically 5 for vocals & instrumentals (default: 5). Example: --vr_aggression=2 | int | 5 | No |
| --vr_enable_tta | Enable Test-Time-Augmentation; slow but improves quality (default: False). Example: --vr_enable_tta | bool | False | No |
| --vr_high_end_process | Mirror the missing frequency range of the output (default: False). Example: --vr_high_end_process | bool | False | No |
| --vr_enable_post_process | Identify leftover artifacts within vocal output; may improve separation for some songs (default: False). Example: --vr_enable_post_process | bool | False | No |
| --vr_post_process_threshold | Threshold for post_process feature: 0.1-0.3 (default: 0.2). Example: --vr_post_process_threshold=0.1 | float | 0.2 | No |

Demucs Architecture Parameters

| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| --demucs_segment_size | Size of segments into which the audio is split, 1-100. Higher = slower but better quality (default: Default). Example: --demucs_segment_size=256 | str | Default | No |
| --demucs_shifts | Number of predictions with random shifts. Higher = slower but better quality (default: 2). Example: --demucs_shifts=4 | int | 2 | No |
| --demucs_overlap | Overlap between prediction windows, 0.001-0.999. Higher = slower but better quality (default: 0.25). Example: --demucs_overlap=0.25 | float | 0.25 | No |
| --demucs_segments_enabled | Enable segment-wise processing (default: True). Example: --demucs_segments_enabled=False | bool | True | No |

MDXC Architecture Parameters

| Argument | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| --mdxc_segment_size | Larger consumes more resources, but may give better results (default: 256). Example: --mdxc_segment_size=256 | int | 256 | No |
| --mdxc_override_model_segment_size | Override model default segment size instead of using the model default value. Example: --mdxc_override_model_segment_size | bool | False | No |
| --mdxc_overlap | Amount of overlap between prediction windows, 2-50. Higher is better but slower (default: 8). Example: --mdxc_overlap=8 | int | 8 | No |
| --mdxc_batch_size | Larger consumes more RAM but may process slightly faster (default: 1). Example: --mdxc_batch_size=4 | int | 1 | No |
| --mdxc_pitch_shift | Shift audio pitch by a number of semitones while processing. May improve output for deep/high vocals (default: 0). Example: --mdxc_pitch_shift=2 | int | 0 | No |
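
Several of the architecture parameters above have documented valid ranges. The sketch below shows one way to check values against those ranges before launching a run. The ranges are copied from the parameter descriptions; the names PARAM_RANGES and check_ranges are hypothetical, and whether the CLI itself rejects out-of-range values is not stated in this reference.

```python
# Valid ranges as documented in the parameter descriptions.
# Keys are the flag names without the leading dashes.
PARAM_RANGES = {
    "mdx_overlap": (0.001, 0.999),
    "vr_aggression": (-100, 100),
    "vr_post_process_threshold": (0.1, 0.3),
    "demucs_overlap": (0.001, 0.999),
    "mdxc_overlap": (2, 50),
}


def check_ranges(options):
    """Return a list of error messages for out-of-range values in options."""
    errors = []
    for name, value in options.items():
        if name not in PARAM_RANGES:
            continue  # no documented range for this parameter
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            errors.append(f"--{name}={value} is outside {low} to {high}")
    return errors
```

An empty list means every checked value falls inside its documented range.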

Example: --audio_file "my_song.mp3" --output_format MP3 --output_dir "/path/to/output" --model_filename "2_HP-UVR.pth" --vr_aggression 10
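
When driving the CLI from another Python program, the argument list can be assembled programmatically. The following is a minimal sketch, not part of the tool: build_command is a hypothetical helper, and "path/to/script.py" is a placeholder for the real script path. It assumes boolean switches are passed as bare flags; for flags documented with an explicit value, such as --demucs_segments_enabled=False, pass the value as a string instead.

```python
import shlex
import sys


def build_command(script_path, audio_file, **options):
    """Build an argv list for the CLI; True becomes a bare flag."""
    argv = [sys.executable, script_path, "--audio_file", audio_file]
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")  # bare switch, e.g. --vr_enable_tta
        elif value is False or value is None:
            continue  # leave the flag out so the default applies
        else:
            argv.extend([f"--{name}", str(value)])
    return argv


# Placeholder script path; the options mirror the example line above.
cmd = build_command(
    "path/to/script.py",
    "my_song.mp3",
    output_format="MP3",
    output_dir="/path/to/output",
    model_filename="2_HP-UVR.pth",
    vr_aggression=10,
)
print(shlex.join(cmd))  # requires Python 3.8+
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the placeholder path points at the real script.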