GUI toolkit for various audio diffusion operations. Run/Install using start_windows.bat or start_unix.sh.
This will create a venv and install the requirements.
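If you would rather set the environment up by hand, the sketch below shows a rough Python equivalent of what the start scripts do (the scripts' exact steps and the requirements.txt filename are assumptions):

```python
# Rough manual equivalent of start_windows.bat / start_unix.sh (a sketch; the
# scripts' exact behaviour and the "requirements.txt" name are assumptions).
import subprocess
import sys
import venv

venv.create("venv", with_pip=True)  # python -m venv venv

# Pick the venv's pip executable for the current platform.
pip = r"venv\Scripts\pip.exe" if sys.platform == "win32" else "venv/bin/pip"
subprocess.run([pip, "install", "-r", "requirements.txt"], check=True)
```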
This is mainly tested on Windows. OSX, Linux and WSL SHOULD work but there could be uncaught errors.
WSL/Linux users might need to run sudo apt install nvidia-cudnn
Linux users can install ffmpeg by updating packages and running sudo apt install ffmpeg
OSX users can install ffmpeg with Homebrew: brew install ffmpeg
For Windows: start_windows.bat will automatically check for ffmpeg and download icedterminal's MSI package:
https://github.com/icedterminal/ffmpeg-installer
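The check itself is simple; a sketch of the idea (the start script's actual logic is an assumption) looks like this:

```python
# Sketch of an ffmpeg availability check (the start script's actual logic is
# an assumption): look for ffmpeg on PATH and report if it is missing.
import shutil

if shutil.which("ffmpeg") is None:
    print("ffmpeg not found on PATH -- install it using one of the methods above.")
```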
Requires Python 3.10
Includes additional features such as:
- Model merging
- Instant input load for variation
- Batch looping
- Local model training
- Input Wave Generation
- Model Trimming
Features are made accessible through an extension API; look in the extensions folder for examples!
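As a purely hypothetical illustration of what an extension module might look like (the hook name `register`, the `app` object, and `add_tab` are invented here, not the real API; copy one of the bundled examples instead):

```python
# Hypothetical extension skeleton -- the entry-point name `register`, the `app`
# object, and `add_tab` are invented for illustration; the real interface is
# whatever the examples in the extensions folder use.
def register(app):
    """Assumed hook called by the GUI when the extension is loaded."""
    app.add_tab(
        title="My Extension",
        build=lambda parent: None,  # construct the extension's widgets here
    )
```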
- Local Model Training is not recommended for systems with less than 8GB of VRAM.
- Please import your models through the included model importer. This ensures the models are trimmed and that their filenames follow the format required by auto-complete.
launch_script.py can be run directly with your Python interpreter if you do not want to use a venv.
Various libs and snippets taken from:
https://github.com/Harmonai-org/sample-generator
https://github.com/sudosilico/sample-diffusion
Special Thanks to:
https://github.com/twobob
https://github.com/zqevans
https://github.com/drscotthawley