MelodyFlow

This is your private demo for MelodyFlow, a fast text-guided music generation and editing model based on a single-stage flow matching DiT, presented in ["High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching"](https://huggingface.co/papers/2407.03648).

Examples

Model	Input Text	ODE Solver	Inference steps	Target Flow step	Regularize	Regularization Strength	Duration	File or Microphone

More details

The model generates a short music excerpt based on the description you provide. It can generate or edit up to 30 seconds of audio in one pass.
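As a minimal sketch of the 30-second limit, a client could check a requested duration before submitting it. The helper name `clamp_duration` and the constant `MAX_DURATION_S` are hypothetical and not part of the demo or of audiocraft; only the 30-second figure comes from the description above.

```python
# Hypothetical helper: not part of the MelodyFlow demo or audiocraft.
MAX_DURATION_S = 30.0  # one-pass limit stated in the demo description


def clamp_duration(requested_s: float) -> float:
    """Return a duration in seconds that fits within a single pass."""
    if requested_s <= 0:
        raise ValueError("duration must be positive")
    # Anything above the limit is reduced to the limit.
    return min(float(requested_s), MAX_DURATION_S)
```

A request for 45 seconds would be reduced to 30 seconds, while a request for 10 seconds would pass through unchanged.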

The model was trained on descriptions from a stock music catalog. Descriptions that work best include some detail about the instruments present, along with an intended use case (e.g. adding "perfect for a commercial" can help).
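The advice above can be sketched as a small prompt builder that combines a style, a list of instruments, and an optional use case. The function `build_description` and its parameters are hypothetical, for illustration only. The demo itself simply takes free-form text.

```python
# Hypothetical helper for composing a text description; the demo accepts any free-form text.
from typing import Optional, Sequence


def build_description(style: str,
                      instruments: Sequence[str],
                      use_case: Optional[str] = None) -> str:
    """Compose a prompt that names the instruments and, optionally, an intended use."""
    text = style.strip()
    if instruments:
        # Name the instruments explicitly, as the guidance suggests.
        text += " with " + ", ".join(name.strip() for name in instruments)
    if use_case:
        # Mirror the "perfect for a commercial" phrasing from the guidance.
        text += ", perfect for " + use_case.strip()
    return text


prompt = build_description(
    "Upbeat acoustic folk",
    ["fingerpicked guitar", "hand claps"],
    "a commercial",
)
print(prompt)
```

The resulting string can be pasted into the Input Text field. Whether a given phrasing actually improves the output is something to check by listening.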

You can optionally provide a reference audio clip. The model will then produce an edited version of it that follows the text description, using MelodyFlow's regularized latent inversion.

WARNING: Longer durations take longer to generate.

Available models are:

facebook/melodyflow-t24-30secs (1B)

See github.com/facebookresearch/audiocraft for more details.
