
Wow, that is awesome. Hopefully the first feature they'll add will be to spit out MIDI rather than WAVs (or an ML model that can decompose WAVs into MIDI).


I'm surprised that ML-based MIDI generation wasn't done quite a bit earlier, and done well, tbh. Sound is incredibly complex, but sequences of chords and notes aligned with music theory and genre conventions have well-defined and easily imitated patterns. I guess part of the reason MIDI generation hasn't been a major research focus is that toy scripts already get you a lot of the way there.
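To illustrate how far a toy script gets you with symbolic music (this is purely my own sketch; all names in it are made up for the example, not from any library): a few lines of Python can expand a chord progression in a key into MIDI note numbers, which is already a plausible accompaniment.

```python
# Sketch: expand the I-V-vi-IV progression in C major into MIDI note numbers.
# All names here are illustrative, not from any real library.

MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets of the major scale

def triad(degree, root=60):
    """Build a diatonic triad (root, third, fifth) on a 1-based scale degree.
    root=60 is middle C; the returned values are MIDI note numbers."""
    idx = degree - 1
    notes = []
    for step in (0, 2, 4):  # stacked thirds within the scale
        octave, pos = divmod(idx + step, 7)
        notes.append(root + 12 * octave + MAJOR_SCALE[pos])
    return notes

# The ubiquitous I-V-vi-IV progression in C major:
progression = [triad(d) for d in (1, 5, 6, 4)]
print(progression)  # C major, G major, A minor, F major triads
```

Feeding these note lists into any MIDI-writing library at fixed durations yields a playable file, which is exactly the kind of low-effort baseline that makes learned MIDI generation feel less urgent.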

These are pretty good, and even the vocals are OK, although I wonder how much, if anything, parts of the prompt like "it may be used during a festival during two songs for a buildup" are actually adding to the mix, and I suspect the music's association with the painting descriptions is as loose as I'd expect.


The old Biaxial-RNN by Daniel D. Johnson generates very good output for MIDI music, albeit limited to a single keyboard-like instrument. It's available at https://github.com/danieldjohnson/biaxial-rnn-music-composit... and AIUI there's a GitHub fork that forward-ports it to up-to-date versions of Python (3.x series) and Theano.

Transformer models are quite a bit more computationally intensive than the LSTM used here, largely because of their attention mechanisms; but the basic next-token approach is loosely comparable, and the LSTM model can easily be trained on a single machine.
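The shared idea behind both the LSTM and transformer approaches — predict the next symbolic event from the preceding context — can be sketched in miniature with a first-order Markov model over MIDI note numbers (purely illustrative, and no relation to the actual Biaxial-RNN code; real models condition on far longer contexts):

```python
import random
from collections import defaultdict

def train_bigram(notes):
    """Count next-note frequencies: a first-order stand-in for the
    next-event distributions that LSTM/transformer models learn."""
    table = defaultdict(lambda: defaultdict(int))
    for a, b in zip(notes, notes[1:]):
        table[a][b] += 1
    return table

def sample_next(table, note, rng):
    """Sample the next note in proportion to observed transition counts."""
    choices = table[note]
    if not choices:
        return note  # dead end: just repeat the note
    pool = list(choices)
    weights = [choices[n] for n in pool]
    return rng.choices(pool, weights=weights)[0]

# Train on a C major scale played up and down, then generate a continuation.
scale = [60, 62, 64, 65, 67, 69, 71, 72, 71, 69, 67, 65, 64, 62, 60]
model = train_bigram(scale)
rng = random.Random(0)
melody = [60]
for _ in range(8):
    melody.append(sample_next(model, melody[-1], rng))
print(melody)
```

The neural versions replace the count table with a learned function of a long context window, but the sampling loop at generation time looks essentially the same.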


Know of anything like this that can do EDM/House music?


This is literally trained on, and processes at the level of, the audio bitstream. So getting it to 'spit out midi' would be no different from the existing task of taking a full, mixed audio track and generating MIDI from it (which isn't easy). It's using a transformer architecture directly on a tokenization of .wav audio. There is no underlying 'instrumentation' stage, like a MIDI track, that exists before being synthesized into an audio stream.
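For a concrete sense of what 'tokenization of .wav audio' can mean, here is a sketch of µ-law companding, the scheme WaveNet used to quantize raw samples into a 256-symbol alphabet (an assumption for illustration only; the model discussed here may well use a different, learned tokenizer):

```python
import math

MU = 255  # 8-bit mu-law: a 256-token alphabet, as in WaveNet

def mu_law_encode(x):
    """Map an audio sample in [-1, 1] to one of 256 integer tokens.
    The log curve spends more tokens on quiet samples, where hearing
    is more sensitive."""
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int((y + 1) / 2 * MU + 0.5)

def mu_law_decode(token):
    """Approximate inverse: token back to a sample in [-1, 1]."""
    y = 2 * token / MU - 1
    return math.copysign((math.exp(abs(y) * math.log1p(MU)) - 1) / MU, y)

tokens = [mu_law_encode(s) for s in (-1.0, 0.0, 1.0)]
print(tokens)  # full-scale negative, silence, full-scale positive
```

Once audio is a stream of such integer tokens, next-token prediction applies to it exactly as it does to text — but the tokens are waveform amplitudes, with nothing resembling notes or instruments anywhere in the representation.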


Which is a Bad Thing, because it limits its usefulness as an educational tool for anyone who wants to learn from or modify the music.


That's a very hard thing for them to add, given the way the model is trained. It looks more like a 1-D version of an image generator than like past attempts at music AI, which generated MIDI.



