Apply for community grant: Academic project
MIDI-AudioLDM is a MIDI-conditioned text-to-audio model based on the open-source project AudioLDM. The model has been conditioned using the ControlNet architecture and has been developed within Hugging Face’s Diffusers framework. Once trained, MIDI-AudioLDM accepts a MIDI file and a text prompt as input and returns an audio file, which is an interpretation of the MIDI based on the given text description. This enables detailed control over different musical aspects such as notes, mood and timbre.
The project is being developed as part of a Master's Thesis in Artificial Intelligence Research at UNED and will be presented within the Project Area at Sónar+D 2023. It is ongoing research, and the checkpoint will soon be replaced by a more stable version of the model.
very cool @lauraibnz , could you please add some examples?
very cool! I meant also to add examples to your interface
See the examples parameter of the Interface: https://gradio.app/docs/#interface:~:text=will%20be%20displayed.-,examples,-list%5BAny%5D%20%7C%20list. You can also set cache_examples=True.
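For reference, a minimal sketch of what this suggestion could look like. The example rows, file paths, component labels, and the `predict` placeholder are all hypothetical and not taken from the Space's actual code; only the `examples` and `cache_examples` parameters of `gr.Interface` come from the comment above.

```python
# Hypothetical example rows: [MIDI file path, text prompt].
# The paths are placeholders and must point to real files in the Space.
EXAMPLES = [
    ["examples/piano.mid", "piano"],
    ["examples/strings.mid", "violin, warm and expressive"],
]


def predict(midi_file, prompt):
    """Placeholder for the Space's real inference function (assumption)."""
    raise NotImplementedError("replace with the actual model call")


def build_demo():
    # Imported lazily so this module can be loaded without gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=predict,
        inputs=[gr.File(label="MIDI file"), gr.Textbox(label="Prompt")],
        outputs=gr.Audio(label="Generated audio"),
        examples=EXAMPLES,    # rows shown below the interface
        cache_examples=True,  # precompute outputs for the example rows
    )


if __name__ == "__main__":
    build_demo().launch()
```

With `cache_examples=True`, Gradio runs the function on the example rows ahead of time, so `predict` has to be a working function and the example files have to exist before the demo is built.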
btw, what kind of prompts would work here? do you have a dataset of examples?
I will add more examples soon, as I might replace the model with an updated version in the coming days. This model uses cvssp/audioldm-m-full as a starting point and has been trained and fine-tuned with audio embeddings rather than prompts. At inference it does use prompt embeddings and accepts any text as input, just as AudioLDM does. However, the fine-tuning was carried out on subsets of the Slakh and URMP datasets, so musical prompts work best.
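As a point of comparison, here is a minimal sketch of text-only generation with the base checkpoint mentioned above, using the `AudioLDMPipeline` class from Diffusers. This is the unconditioned base model, not MIDI-AudioLDM itself; the prompt, step count, clip length, and output filename are illustrative assumptions.

```python
BASE_CHECKPOINT = "cvssp/audioldm-m-full"  # starting point named in the thread
SAMPLE_RATE = 16000  # AudioLDM generates 16 kHz audio


def generate(prompt, out_path="output.wav", steps=50, seconds=5.0):
    """Generate a short clip from a text prompt with the base AudioLDM model."""
    # Heavy dependencies are imported lazily, only when generation is requested.
    import scipy.io.wavfile
    import torch
    from diffusers import AudioLDMPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe = AudioLDMPipeline.from_pretrained(BASE_CHECKPOINT).to(device)

    # The pipeline output exposes the generated waveforms as `.audios`.
    audio = pipe(
        prompt,
        num_inference_steps=steps,
        audio_length_in_s=seconds,
    ).audios[0]

    scipy.io.wavfile.write(out_path, rate=SAMPLE_RATE, data=audio)
    return out_path


if __name__ == "__main__":
    # Musical prompts are the ones expected to work best after fine-tuning.
    generate("solo violin, slow and expressive")
```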
I added some examples and a short description for each parameter.
It would be very useful for me if a GPU were assigned to this Space, as I will be showing it as a demo of my academic project at Sónar+D over the coming days. I will continue to update the Space and the checkpoint, and I will submit a PR to the diffusers code once I make some final changes. Thank you!
Hi @lauraibnz, we assigned a T4 with a 10-hour sleep time. If no requests are made, the Space goes to sleep, and you will need to restart it when you need it again.
Please share about Hugging Face and the grant at the Sónar+D event!! 🙏 We'd appreciate it, also on social media, thanks
ps. Sónar+D is a great event! congrats and good luck
Yes, it's active!
Hi @radames, I have been writing my Master's thesis about this project over the last few months, so I haven't been using the Space that often. However, on the 23rd of September I will be giving a talk/workshop about it at Volumens Festival in Spain, and the following week, starting the 25th, I will be presenting the thesis at my university. For those weeks it would be very nice to go back to a longer sleep time, if that is possible.
Thanks a lot! Best,
Laura
Hi @lauraibnz, sorry for the delay. I don't think I can change the sleep time, but you can always restart the Space. Every time you interact with it, the timer resets, and if other people are interacting with it, it won't sleep.