Variational Autoencoder (VAE) for Musical Instruments

Project description

Deep learning has had major successes in generating artificial images, for instance of faces (so-called "deep fakes"), using Generative Adversarial Networks (GANs) or Variational Autoencoders (VAE). A simple example program for handwritten digit images can be found here: https://github.com/kvfrans/variational-autoencoder

The goal of this project is to use a VAE network to generate musical instrument sounds, similar to "NSynth": https://magenta.tensorflow.org/nsynth-instrument. For this, the VAE is trained on musical instrument sounds from the IDMT Musical Instruments Database. A suitable training set has to be chosen and tested, and a good set of hyper-parameters (such as the dimensionality of the network) has to be found. Then a psycho-acoustic similarity measure, based on our psycho-acoustic pre- and post-filters, has to be used and tested as the "generation loss" of the VAE network, and compared to other similarity measures. Supervision: Prof. Dr.-Ing. G. Schuller
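As a rough starting point, the structure of the VAE objective (reconstruction loss plus KL divergence) can be sketched as follows. This is only a minimal NumPy illustration with untrained linear "networks" and placeholder dimensions, not project code; in the project, the plain MSE reconstruction term would be replaced by the psycho-acoustic similarity measure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder sizes; these are the hyper-parameters to be tuned in the project.
frame_dim, latent_dim = 512, 16

# Toy linear encoder/decoder weights (a real VAE would use trained deep networks).
W_mu = rng.normal(size=(latent_dim, frame_dim)) * 0.01
W_logvar = rng.normal(size=(latent_dim, frame_dim)) * 0.01
W_dec = rng.normal(size=(frame_dim, latent_dim)) * 0.01

def vae_loss(x):
    """VAE loss for one (stand-in) audio frame x: reconstruction + KL term."""
    mu = W_mu @ x                          # encode to mean of latent Gaussian
    logvar = W_logvar @ x                  # encode to log-variance
    eps = rng.standard_normal(latent_dim)
    z = mu + np.exp(0.5 * logvar) * eps    # reparameterization trick
    x_hat = W_dec @ z                      # decode back to a frame
    # Generation loss: here plain MSE; the project would substitute a
    # psycho-acoustic similarity measure based on the pre-/post-filters.
    recon = np.mean((x - x_hat) ** 2)
    # KL divergence between the encoded Gaussian and the standard normal prior.
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
    return recon + kl

x = rng.standard_normal(frame_dim)         # stand-in for one instrument sound frame
loss = vae_loss(x)
```

Both terms are non-negative, so the sketch also shows why minimizing the sum trades off reconstruction fidelity against keeping the latent distribution close to the prior.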
