ES-Instruct is an instruction-tuned text-to-music editing model. Given an input audio clip and a natural-language prompt, it performs targeted ADD or REMOVE edits while preserving unrelated structure.
KTH Royal Institute of Technology - Master's Thesis
Mauro Luzzatto
Epidemic Sound
Ana Tanevska
KTH Royal Institute of Technology
All of these samples are from the MARI evaluation dataset and were used in the model listening test.
ES-Base is Epidemic Sound's non-instruction-tuned base model, included as a reference baseline. SAO-Instruct is the state of the art in general short-form audio instruction editing.
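The prompts on this page express the two edit types, ADD and REMOVE, with a range of verbs ("include", "layer", "mute", "omit", and so on). The following is a minimal sketch, not part of ES-Instruct, of how such prompts could be grouped by edit type for analysis. The verb sets are taken from the prompts shown on this page; the function name `parse_prompt` and the comma-based splitting of targets are assumptions made for illustration.

```python
# Hypothetical helper: groups the demo prompts by edit type.
# The verb sets below are collected from the prompts listed on this page;
# they are not a vocabulary defined by the model.
ADD_VERBS = {"add", "include", "insert", "layer"}
REMOVE_VERBS = {"remove", "delete", "mute", "minus", "omit", "eliminate"}


def parse_prompt(prompt: str) -> tuple[str, list[str]]:
    """Split an edit prompt into (edit_type, targets).

    The first word is treated as the edit verb (case-insensitive).
    The remainder is split on commas only, so a phrase joined by "and"
    stays a single target.
    """
    verb, _, rest = prompt.strip().partition(" ")
    verb = verb.lower()
    if verb in ADD_VERBS:
        edit_type = "ADD"
    elif verb in REMOVE_VERBS:
        edit_type = "REMOVE"
    else:
        raise ValueError(f"unrecognized edit verb: {verb!r}")
    targets = [t.strip() for t in rest.split(",") if t.strip()]
    return edit_type, targets
```

For example, `parse_prompt("layer piano, organ")` yields an ADD edit with two targets, while `parse_prompt("Omit blues Lead male vocal")` yields a REMOVE edit with one.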
add bossa nova electric bass
include full acoustic drumkit
insert clean electric guitar
layer piano, organ
Include reggae Saxophone
remove bass guitar
delete tomtom, cymbal, hihat, snare, kick
mute clean electric guitar, guitars
minus grand piano, electric piano
Omit blues Lead male vocal
add orchestra acoustic pianos, snares, kick drum
add heavy metal distorted electric guitar and drum kit
add pop piano
add rock drum kit
add soft and calm drums
remove bass, piano, synth
remove bass, guitar, keyboard
remove vocals
remove drums
remove keyboards
add drums, piano
add drums
remove drums, vocals
remove piano
Eliminate Electronic Synthesizer lead, Synthesizer lead
Both the synth and the piano were removed.
remove acoustic guitar, electric guitar
Instead of removing the guitars, the piano was removed.
remove acoustic guitar
Both the guitar and the piano were removed.