The goal of this Master's project is to design and evaluate a predictor based on machine learning (DNN, CNN). For a given discrete audio signal, the predictor should predict the subsequent audio sample or block of audio samples as precisely as possible. The predictor is to be evaluated in a lossless audio coding scheme, where the new predictor is expected to outperform state-of-the-art designs. Furthermore, it should be investigated which artificial audio signals the predictor can produce when it is run in a feedback loop. Used this way, it is also known as a "Deep Generative Model". Choosing a suitable training set and evaluating the predictor on a test set are part of the project. The crucial point is the choice of the neural network architecture and its dimensions, together with the comparison of different designs. The programming language is Python with its neural network library PyTorch. The starting point is the articles on the so-called "WaveNet" by Google and on "FFTNet".
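
To make the two uses of the predictor concrete, the following is a minimal sketch in plain Python of predictive lossless coding and of the feedback loop for generation. It is an illustration under assumptions, not the project's design: the names `toy_predictor`, `encode`, `decode` and `generate` are hypothetical, and the fixed linear extrapolation stands in for the neural network that the project is meant to develop. Samples are assumed to be integers, and the predictor is assumed to be deterministic so that encoder and decoder compute identical predictions.

```python
from typing import Callable, List, Sequence

# A predictor maps the past samples to an integer guess for the next sample.
Predictor = Callable[[Sequence[int]], int]


def toy_predictor(history: Sequence[int]) -> int:
    """Hypothetical stand-in for the trained network: linear extrapolation."""
    if len(history) >= 2:
        return 2 * history[-1] - history[-2]
    if len(history) == 1:
        return history[-1]
    return 0


def encode(samples: Sequence[int], predict: Predictor) -> List[int]:
    """Replace each sample by its prediction residual (the part to be entropy coded)."""
    return [x - predict(samples[:n]) for n, x in enumerate(samples)]


def decode(residuals: Sequence[int], predict: Predictor) -> List[int]:
    """Rebuild the samples exactly by repeating the encoder's predictions."""
    samples: List[int] = []
    for e in residuals:
        samples.append(predict(samples) + e)
    return samples


def generate(seed: Sequence[int], predict: Predictor, num_samples: int) -> List[int]:
    """Feedback loop: each predicted sample is appended to the predictor's input."""
    samples = list(seed)
    for _ in range(num_samples):
        samples.append(predict(samples))
    return samples[len(seed):]


signal = [0, 3, 6, 9, 12, 14, 15]
residuals = encode(signal, toy_predictor)
reconstructed = decode(residuals, toy_predictor)
```

The better the predictor, the smaller the residuals and the fewer bits a subsequent entropy coder needs, which is why prediction accuracy is the quantity to evaluate in the lossless coding scheme. The reconstruction stays exact regardless of predictor quality, as long as the decoder reproduces the encoder's predictions bit for bit.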


Prof. Gerald Schuller (TU Ilmenau), Dr.-Ing. Sascha Disch, Dipl.-Math. Andreas Niedermeier (Fraunhofer IIS)