More details
The model generates a short music extract from the text description you provide.
The model can generate or edit up to 30 seconds of audio in one pass.
The model was trained on descriptions from a stock music catalog, so descriptions work best
when they include some detail about the instruments present, along with an intended use case
(e.g. adding "perfect for a commercial" can help).
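As a rough illustration of the advice above, the following sketch assembles a description from a style, a list of instruments, and an optional use case. The helper `build_description` and its parameters are hypothetical and are not part of audiocraft; the function only formats a string to pass as the text description.

```python
def build_description(style, instruments, use_case=None):
    """Compose a text prompt from a style, instruments, and an optional use case.

    Hypothetical helper for illustration only; not an audiocraft API.
    """
    parts = [style.strip()]
    if instruments:
        # Name the instruments explicitly, as the catalog-style descriptions do.
        parts.append("with " + ", ".join(i.strip() for i in instruments))
    text = " ".join(parts)
    if use_case:
        # Append the intended use case, e.g. "perfect for a commercial".
        text += ", " + use_case.strip()
    return text


prompt = build_description(
    "Upbeat acoustic folk track",
    ["fingerpicked guitar", "hand claps", "upright bass"],
    use_case="perfect for a commercial",
)
print(prompt)
```

The resulting string can be pasted into the description field as-is.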
You can optionally provide a reference audio clip, which the model will edit according to
the text description using MelodyFlow's regularized latent inversion.
WARNING: Longer durations take longer to generate.
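If you drive the model from a script, it may help to cap the requested duration at the 30-second single-pass limit mentioned above before starting a generation. This is a minimal sketch: `clamp_duration` and `MAX_DURATION_SECONDS` are hypothetical names chosen for this example, not audiocraft identifiers.

```python
MAX_DURATION_SECONDS = 30.0  # single-pass limit stated in this description


def clamp_duration(requested_seconds, max_seconds=MAX_DURATION_SECONDS):
    """Return a duration no longer than max_seconds (hypothetical helper)."""
    if requested_seconds <= 0:
        raise ValueError("duration must be positive")
    # Requests above the limit are shortened rather than rejected.
    return min(float(requested_seconds), max_seconds)


print(clamp_duration(45))  # capped at the limit
print(clamp_duration(10))  # left unchanged
```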
Available models are:
- facebook/melodyflow-t24-30secs (1B)
See github.com/facebookresearch/audiocraft for more details.