MelodyFlow

This is your private demo for MelodyFlow, a fast text-guided music generation and editing model based on a single-stage flow-matching DiT, presented in ["High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching"](https://huggingface.co/papers/2407.03648).

Demo controls: Model, Input Text, ODE Solver, Inference steps (2 to 128), Target Flow step (0 to 1), Regularize, Regularization Strength (0 to 1), Duration (1 to 30 seconds), and an optional reference audio (File or Microphone). A table of example configurations is provided below the controls.

More details

The model generates a short music extract based on the description you provide. It can generate or edit up to 30 seconds of audio in one pass.

The model was trained on descriptions from a stock music catalog. Descriptions that work best include some detail about the instruments present, along with an intended use case (e.g., adding "perfect for a commercial" can help).
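For programmatic use outside the demo, generation presumably follows the same pattern as the other audiocraft models. Below is a minimal sketch assuming a `MelodyFlow` wrapper with a MusicGen-style interface (`get_pretrained`, `set_generation_params`, `generate`); the class name and parameter names are assumptions, so check the audiocraft repository for the actual API.

```python
# Hypothetical sketch: text-to-music generation with MelodyFlow via audiocraft.
# Assumes the MelodyFlow wrapper follows the same pattern as audiocraft's MusicGen;
# the exact class name and generation parameters may differ in the actual release.
from audiocraft.models import MelodyFlow  # assumed entry point
from audiocraft.data.audio import audio_write

model = MelodyFlow.get_pretrained("facebook/melodyflow-t24-30secs")

# Assumed parameters mirroring the demo controls (ODE solver, inference steps, duration).
model.set_generation_params(solver="midpoint", steps=64, duration=30)

descriptions = [
    "Upbeat acoustic folk with guitar and light percussion, perfect for a commercial",
]
wav = model.generate(descriptions)  # returns a batch of waveforms

for i, one_wav in enumerate(wav):
    # Write <name>.wav with loudness normalization, as in other audiocraft examples.
    audio_write(f"melodyflow_{i}", one_wav.cpu(), model.sample_rate, strategy="loudness")
```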

You can optionally provide a reference audio, from which the model will produce an edited version guided by the text description, using MelodyFlow's regularized latent inversion.
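The editing path could look similar in code, with the reference audio inverted into the latent space and re-generated under the new description. The sketch below is purely illustrative: the `set_editing_params` and `edit` names, and the target flow step and regularization arguments, are assumptions rather than the confirmed interface.

```python
# Hypothetical sketch of the editing path with a reference track. The method and
# parameter names (`set_editing_params`, `edit`, `target_flowstep`, `regularize`,
# `regularization_strength`) are assumptions for illustration, not the confirmed API.
import torchaudio
from audiocraft.models import MelodyFlow  # assumed entry point

model = MelodyFlow.get_pretrained("facebook/melodyflow-t24-30secs")

# Assumed editing controls mirroring the demo: an intermediate target flow step
# in [0, 1] and a regularization strength for the latent inversion.
model.set_editing_params(
    solver="euler",
    steps=25,
    target_flowstep=0.1,
    regularize=True,
    regularization_strength=0.2,
)

# Load the reference audio and ask for an edited version guided by a new description.
reference, sample_rate = torchaudio.load("reference.mp3")
edited = model.edit(
    reference.unsqueeze(0),  # batch dimension expected by the model (assumed)
    sample_rate,
    ["Same groove, but led by brass and a driving bassline"],
)
```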

WARNING: Longer durations take longer to generate.

Available models are:

  1. facebook/melodyflow-t24-30secs (1B)

See github.com/facebookresearch/audiocraft for more details.