I just tried a music source separation ML model called Demucs. Amazing how a transformer model can separate drums, bass, and vocals from a music track! 🎶 Not the highest audio quality, but the separation is still impressive.

It’s available under the permissive MIT license.

Thanks to Trevor Cox for mentioning it in his CLIP dataset project. More details in my GitHub repo for curious experimenters.
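
If you want to try it yourself, here is a minimal Python sketch of one way to run it. It assumes Demucs is installed (`pip install demucs`) and wraps its command-line interface rather than any Python API. The helper names, the `song.mp3` input, and the `htdemucs` model name are my own choices for illustration, and the output layout is what I'd expect from the defaults, so check it against your install.

```python
import subprocess
import sys
from pathlib import Path

# Stems I expect the default model to write (assumption: four-stem model).
STEMS = ("drums", "bass", "other", "vocals")


def build_demucs_command(track, out_dir="separated", model="htdemucs"):
    """Build the Demucs command line for one track (does not run it)."""
    return [
        sys.executable, "-m", "demucs",  # run the installed demucs module
        "-n", model,                     # model name (assumed: htdemucs)
        "-o", str(out_dir),              # folder that receives the stems
        str(track),
    ]


def expected_stems(track, out_dir="separated", model="htdemucs"):
    """Where the stems should land: <out_dir>/<model>/<track name>/<stem>.wav."""
    stem_dir = Path(out_dir) / model / Path(track).stem
    return {name: stem_dir / f"{name}.wav" for name in STEMS}


def separate(track, out_dir="separated", model="htdemucs"):
    """Run Demucs on one track and return the expected stem paths."""
    subprocess.run(build_demucs_command(track, out_dir, model), check=True)
    return expected_stems(track, out_dir, model)


# Usage (hypothetical file name):
# stems = separate("song.mp3")
# print(stems["vocals"])
```

Separation runs on CPU too, just slowly, so start with a short clip.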