Prof. Dr.-Ing. Dr. rer. nat. h.c. mult. Karlheinz Brandenburg

Senior Professor

Helmholtzbau, Room H 3530
+49 3677 69-2676 | Fax: +49 3677 69-1255
karlheinz.brandenburg@tu-ilmenau.de

Publication list

Number of hits: 64
Generated: Wed, 27 Mar 2024 23:36:01 +0100


Schuller, Gerald
Ultra low delay audio source separation using zeroth-order optimization. - In: 22nd IEEE Statistical Signal Processing Workshop (SSP 2023), (2023), pp. 497-501

In this paper, we introduce the "Random Directions" probabilistic optimization method and demonstrate its efficacy in real-time, low-latency signal processing applications. Applied to an ultra-low-delay, time-domain, multichannel source separation system, "Random Directions" is compared with the gradient-based method "Trinicon" and with frequency-domain methods such as AuxIVA and FastMNMF. Results indicate that our approach often outperforms Trinicon in terms of the Signal to Interference Ratio (SIR) and produces the least non-linear distortion among all methods, as measured by the Signal to Artifacts Ratio (SAR). This study suggests that probabilistic optimization methods, traditionally perceived as slow, can indeed be effective for real-time applications.



https://doi.org/10.1109/SSP53291.2023.10208066
Schuller, Gerald
Filterbänke und Audiocodierung : Komprimierung von Audiosignalen mit Python. - Cham : Springer International Publishing, 2023. - 1 online resource (XI, 146 pages) ISBN 978-3-031-19990-5

Introduction -- Filter Banks -- With a Changing Number of Subbands -- Predictive Coding -- Psychoacoustic Models -- Psychoacoustic Models and Quantization -- Entropy Coding -- The Python Perceptual Audio Coder -- Predictive Lossless Audio Coding -- Scalable Lossless Audio Coding -- Psychoacoustic Pre-Filter -- Conclusion.



https://doi.org/10.1007/978-3-031-19990-5
Schuller, Gerald
Low latency time domain multichannel speech and music source separation. - In: Conference record of the Fifty-Fifth Asilomar Conference on Signals, Systems & Computers, (2021), pp. 549-553

The goal is to obtain a simple multichannel source separation with very low latency. Applications include teleconferencing, hearing aids, augmented reality, and selective active noise cancellation. These real-time applications need very low latency, usually less than about 6 ms, and low complexity, because they usually run on small portable devices. For this we do not need the best separation but a "useful" separation, and not just on speech but also on music and noise. The usual frequency-domain approaches have higher latency and complexity. Hence we introduce a novel probabilistic optimization method, which we call "Random Directions", that can overcome local minima, is applied to a simple time-domain unmixing structure, and is scalable for low complexity. It is then compared to frequency-domain approaches on separating speech and music sources, including with 3D microphone setups.



https://doi.org/10.1109/IEEECONF53345.2021.9723106
Profeta, Renato; Schuller, Gerald
End-to-end learning for musical instruments classification. - In: Conference record of the Fifty-Fifth Asilomar Conference on Signals, Systems & Computers, (2021), pp. 1607-1611

Musical instrument classification is a widely studied topic in Music Information Retrieval (MIR) and signal processing. Its applications range from indexing audio databases, automatic transcription, and recommender systems to music search by timbre, music annotation, and others. Over the years, many different techniques have been used, based on deep neural networks with hand-engineered or learned features [1] [2] [3]. The purpose of this paper is to present Convolutional Neural Network (CNN) based filter banks that not only generate features optimized for classification in the encoded domain but also achieve near-perfect reconstruction at the decoder output, with quality similar to that of standard lossy audio codecs. The filter banks are then compared with other commonly used invertible transformations employed as features in classification problems, such as Short-time Fourier Transform (STFT) spectrograms and Mel spectrograms, using the same simple classifier with a small number of parameters. The idea is that the heavy lifting is done by the learned features rather than the classifier, while still achieving near-perfect reconstruction.



https://doi.org/10.1109/IEEECONF53345.2021.9723181
Golokolenko, Oleg; Schuller, Gerald
The method of random directions optimization for stereo audio source separation. - In: Cognitive intelligence for speech processing, (2020), pp. 3316-3320

In this paper, a novel, fast time-domain audio source separation technique based on fractional delay filters, with low computational complexity and small algorithmic delay, is presented and evaluated in experiments. Our goal is a Blind Source Separation (BSS) technique that is applicable to low-cost, low-power devices where processing is done in real time, e.g. hearing aids or teleconferencing setups. The proposed approach optimizes fractional delays, implemented as IIR filters, and attenuation factors between microphone signals to minimize crosstalk, following the principle of a fractional delay-and-sum beamformer. The experiments were carried out for offline separation with stationary sound sources and for real-time separation with randomly moving sound sources. Experimental results show that the separation performance of the proposed time-domain BSS technique is competitive with State-of-the-Art (SoA) approaches, but with lower computational complexity and without the system delay found in frequency-domain BSS.



https://doi.org/10.21437/Interspeech.2020-1409
Schuller, Gerald; Golokolenko, Oleg
Probabilistic optimization for source separation. - In: Conference record of the Fifty-Fourth Asilomar Conference on Signals, Systems & Computers, (2020), pp. 534-538

We present a novel probabilistic zeroth-order optimization method that can handle higher dimensions and can also be used for fast online optimization, for instance for multichannel source separation. We compared it to the Gradientless Descent (GLD) algorithm on a multichannel source separation task and found that our method results in faster and better separation (for the 2-channel case). For the multichannel case, only our method resulted in useful separation. We also applied it to separating sources from 3-dimensional microphone arrays, with comparable results.



https://doi.org/10.1109/IEEECONF51394.2020.9443564
Schuller, Gerald
Building and programming home robots with Raspberry Pi and Python. - Ilmenau. - 1 online resource (33:40 min)

Gerald will show how to build home robots for fun and education, using the Raspberry Pi single-board computer and Python. Examples are giving a Roomba vacuum robot eyes, and a small two-legged balancing and walking robot.



https://www.youtube.com/watch?v=NCvKlgJ8A8k
Mimilakis, Stylianos Ioannis; Drossos, Konstantinos; Schuller, Gerald
Unsupervised interpretable representation learning for singing voice separation. - In: 28th European Signal Processing Conference (EUSIPCO 2020), (2020), pp. 1412-1416

In this work, we present a method for learning interpretable music signal representations directly from waveform signals. Our method can be trained using unsupervised objectives and relies on the denoising auto-encoder model that uses a simple sinusoidal model as decoding functions to reconstruct the singing voice. To demonstrate the benefits of our method, we apply the obtained representations to the task of informed singing voice separation via binary masking, and measure the obtained separation quality by means of the scale-invariant signal-to-distortion ratio. Our findings suggest that our method is capable of learning meaningful representations for singing voice separation while preserving conveniences of the short-time Fourier transform, such as non-negativity, smoothness, and reconstruction subject to time-frequency masking, that are desired in audio and music source separation.



https://doi.org/10.23919/Eusipco47968.2020.9287352
de Castro Rabelo Profeta, Renato; Schuller, Gerald
Feature-based classification of electric guitar types. - In: Machine learning and knowledge discovery in databases, (2020), pp. 478-484

Schuller, Gerald
Filter banks and audio coding : compressing audio signals using Python. - Cham : Springer International Publishing, 2020. - 1 online resource (XI, 197 p., 72 illus., 49 illus. in color). - (Springer eBook Collection) ISBN 978-3-030-51249-1

Introduction -- Filter Banks -- With a Changing Number of Subbands -- Predictive Coding -- Psychoacoustic Models -- Psychoacoustic Models and Quantization -- Entropy Coding -- The Python Perceptual Audio Coder -- Predictive Lossless Audio Coding -- Scalable Lossless Audio Coding -- Psycho-Acoustic Pre-Filter -- Conclusion.



https://doi.org/10.1007/978-3-030-51249-1
Golokolenko, Oleg; Schuller, Gerald
Fast time domain stereo audio source separation using fractional delay filters. - In: 147th Audio Engineering Society Convention 2019, (2020), pp. 179-183

Profeta, Renato; Schuller, Gerald
Comparison of human and machine recognition of electric guitar types. - In: 147th Audio Engineering Society Convention 2019, (2020), pp. 1058-1064

Mimilakis, Stylianos Ioannis; Drossos, Konstantinos; Cano, Estefanía; Schuller, Gerald
Examining the mapping functions of denoising autoencoders in singing voice separation. - In: IEEE ACM transactions on audio, speech, and language processing, ISSN 2329-9304, vol. 28 (2020), pp. 266-278

https://doi.org/10.1109/TASLP.2019.2952013
Golokolenko, Oleg; Schuller, Gerald
A fast stereo audio source separation for moving sources. - In: Conference record of the Fifty-Third Asilomar Conference on Signals, Systems & Computers, (2019), pp. 1931-1935

https://doi.org/10.1109/IEEECONF44664.2019.9048652
Schuller, Gerald
Klimawandel nachgerechnet, Teil 1. - Ilmenau. - 1 online resource (34:12 min)

Can media reports and scientists be trusted with regard to climate change? The natural sciences are designed so that results can be recalculated and verified. In this way, media reports can also be checked. In the first part of the series we answer the question: can the observed rise in the atmospheric carbon dioxide concentration be explained by human activity, and if so, to what extent?



https://www.youtube.com/watch?v=e3kzkl5zEfs
Mimilakis, Stylianos Ioannis; Cano, Estefanía; FitzGerald, Derry; Drossos, Konstantinos; Schuller, Gerald
Examining the perceptual effect of alternative objective functions for deep learning based music source separation. - In: Conference record of the Fifty-Second Asilomar Conference on Signals, Systems & Computers, (2018), pp. 679-683

https://doi.org/10.1109/ACSSC.2018.8645257
Drossos, Konstantinos; Mimilakis, Stylianos Ioannis; Serdyuk, Dmitriy; Schuller, Gerald; Virtanen, Tuomas; Bengio, Yoshua
MaD TwinNet: Masker-Denoiser architecture with Twin Networks for monaural sound source separation. - In: 2018 International Joint Conference on Neural Networks (IJCNN), ISBN 978-1-5090-6014-6, (2018), 8 pp. in total

https://doi.org/10.1109/IJCNN.2018.8489565
Mimilakis, Stylianos Ioannis; Drossos, Konstantinos; Santos, João F.; Schuller, Gerald; Virtanen, Tuomas
Monaural singing voice separation with skip-filtering connections and recurrent inference of time-frequency mask. - In: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ISBN 978-1-5386-4658-8, (2018), pp. 721-725

https://doi.org/10.1109/ICASSP.2018.8461822
Mimilakis, Stylianos Ioannis; Drossos, Konstantinos; Virtanen, Tuomas; Schuller, Gerald
A recurrent encoder-decoder approach with skip-filtering connections for monaural singing voice separation. - In: Proceedings of MLSP2017, ISBN 978-1-5090-6341-3, (2017), 6 pp. in total

https://doi.org/10.1109/MLSP.2017.8168117
Golokolenko, Oleg; Schuller, Gerald
Investigation of electric network frequency for synchronization of low cost and wireless sound cards. - In: EUSIPCO 2017, ISBN 978-0-9928626-7-1, (2017), pp. 693-697

https://doi.org/10.23919/EUSIPCO.2017.8081296
Drossos, Konstantinos; Mimilakis, Stylianos Ioannis; Floros, Andreas; Virtanen, Tuomas; Schuller, Gerald
Close miking empirical practice verification: a source separation approach. - In: 142nd Audio Engineering Society International Convention 2017, (2017), pp. 629-637

Abeßer, Jakob; Schuller, Gerald
Instrument-centered music transcription of solo bass guitar recordings. - In: IEEE ACM transactions on audio, speech, and language processing, ISSN 2329-9304, vol. 25 (2017), 9, pp. 1437-1446

https://doi.org/10.1109/TASLP.2017.2702384
Mimilakis, Stylianos Ioannis; Drossos, Konstantinos; Virtanen, Tuomas; Schuller, Gerald
Deep neural networks for dynamic range compression in mastering applications. - In: 140th Audio Engineering Society International Convention 2016, ISBN 978-1-5108-2570-3, (2016), pp. 289-296

Schuller, Gerald; Abeßer, Jakob; Kehling, Christian
Parameter extraction for bass guitar sound models including playing styles. - In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), ISBN 978-1-4673-6998-5, (2015), pp. 404-408

https://doi.org/10.1109/ICASSP.2015.7178000
Abeßer, Jakob; Schuller, Gerald
Instrument-centered music transcription of bass guitar tracks. - In: Semantic audio, (2014), pp. 166-175

Gärtner, Daniel; Dittmar, Christian; Aichroth, Patrick; Cuccovillo, Luca; Mann, Sebastian; Schuller, Gerald
Efficient cross-codec framing grid analysis for audio tampering detection. - In: 136th Audio Engineering Society convention 2014, ISBN 978-1-63266-506-5, (2014), pp. 306-316

Cano, Estefanía; Schuller, Gerald; Dittmar, Christian
Pitch-informed solo and accompaniment separation towards its use in music education applications. - In: EURASIP journal on advances in signal processing, ISSN 1687-6180, (2014), 23, pp. 1-19

https://doi.org/10.1186/1687-6180-2014-23
Neukam, Christian; Nagel, Frederik; Schuller, Gerald; Schnabel, Michael
A MDCT based harmonic spectral bandwidth extension method. - In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, ISBN 978-1-4799-0357-3, (2013), pp. 566-570

http://dx.doi.org/10.1109/ICASSP.2013.6637711
Bießmann, Paul; Gärtner, Daniel; Dittmar, Christian; Aichroth, Patrick; Schuller, Gerald; Schnabel, Michael; Geiger, Ralf
Estimating MP3PRO encoder parameters from decoded audio. - In: Informatik 2013 - Informatik angepasst an Mensch, Organisation und Umwelt, (2013), pp. 2841-2852

Niehaus, Marco; Esch, Lorenz; Schuller, Gerald
Parametric mesh reconstruction pipeline from 3D point clouds. - In: ISWCS 2013, ISBN 978-3-8007-3529-7, (2013), pp. 512-516

Schöberl, Michael; Keinert, Joachim; Ziegler, Matthias; Seiler, Jürgen; Niehaus, Marco; Schuller, Gerald; Kaup, André; Foessel, Siegfried
Evaluation of a high dynamic range video camera with non-regular sensor. - In: Digital photography IX, ISBN 978-0-8194-9433-7, (2013), pp. 86600M-1-86600M-12

Schnabel, Michael; Schubert, Benjamin; Schuller, Gerald
Parametric coding of piano signals. - In: 133rd Audio Engineering Society convention 2012, (2013), pp. 863-871

Carôt, Alexander; Schuller, Gerald
Applying video to low delayed audio streams in bandwidth limited networks. - In: Audio networking, ISBN 978-1-62276-006-0, (2012), pp. 104-109

Carôt, Alexander; Schuller, Gerald
Towards a telematic visual-conducting system. - In: Audio networking, ISBN 978-1-62276-006-0, (2012), pp. 99-103

Cano, Estefanía; Dittmar, Christian; Schuller, Gerald
Efficient implementation of a system for solo and accompaniment separation in polyphonic music. - In: Proceedings of the 20th European Signal Processing Conference (EUSIPCO), 2012, ISBN 978-1-4673-1068-0, (2012), pp. 285-289

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6334302
Kramer, Patrick; Abeßer, Jakob; Dittmar, Christian; Schuller, Gerald
A digital waveguide model of the electric bass guitar including different playing techniques. - In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, ISBN 978-1-4673-0045-2, (2012), pp. 353-356

http://dx.doi.org/10.1109/ICASSP.2012.6287889
Schnabel, Michael; Werner, Michael; Schuller, Gerald
Improved error robustness for predictive ultra low delay audio coding. - In: 131st Audio Engineering Society convention 2011, (2012), pp. 544-549

Abeßer, Jakob; Lartillot, Olivier; Dittmar, Christian; Eerola, Tuomas; Schuller, Gerald
Modeling musical attributes to characterize ensemble recordings using rhythmic audio features. - In: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing, ISBN 978-1-4577-0539-7, (2011), pp. 189-192

http://dx.doi.org/10.1109/ICASSP.2011.5946372
Schuller, Gerald; Gruhne, Matthias; Friedrich, Tobias
Fast audio feature extraction from compressed audio data. - In: IEEE journal of selected topics in signal processing, ISSN 1941-0484, vol. 5 (2011), 6, pp. 1262-1271

https://doi.org/10.1109/JSTSP.2011.2158802
Cano, Estefanía; Dittmar, Christian; Schuller, Gerald
Influence of phase, magnitude and location of harmonic components in the perceived quality of extracted solo signals. - In: Semantic audio, ISBN 978-0-937803-81-3, (2011), pp. 247-252

Abeßer, Jakob; Dittmar, Christian; Schuller, Gerald
Automatic recognition and parametrization of frequency modulation techniques in bass guitar recordings. - In: Semantic audio, ISBN 978-0-937803-81-3, (2011), pp. 121-128

Abeßer, Jakob; Bräuer, Paul; Lukashevich, Hanna; Schuller, Gerald
Bass playing style detection based on high-level features and pattern similarity. - In: ISMIR 2010, (2010), pp. 93-98

Cano, Estefanía; Schuller, Gerald
Exploring phase information in sound source separation applications. - In: DAFx-10, ISBN 978-3-200-01940-9, (2010), pp. 259-272

Abeßer, Jakob; Lukashevich, Hanna; Schuller, Gerald
Feature-based extraction of plucking and expression styles of the electric bass guitar. - In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2010, ISBN 978-1-4244-4295-9, (2010), pp. 2290-2293

http://dx.doi.org/10.1109/ICASSP.2010.5495945
Stein, Michael; Abeßer, Jakob; Dittmar, Christian; Schuller, Gerald
Automatic detection of audio effects in guitar and bass recordings. - In: 128th Audio Engineering Society convention 2010, (2010), pp. 522-533

Werner, Michael; Schuller, Gerald
An SBR tool for very low delay applications with flexible crossover frequency. - In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2010, ISBN 978-1-4244-4295-9, (2010), pp. 353-356

http://dx.doi.org/10.1109/ICASSP.2010.5495854
Werner, Michael; Schuller, Gerald
An enhanced SBR tool for low-delay applications. - In: 127th Audio Engineering Society convention 2009, (2010), pp. 874-879

Abeßer, Jakob; Lukashevich, Hanna; Dittmar, Christian; Schuller, Gerald
Genre classification using bass-related high-level features and playing styles. - In: ISMIR 2009, (2009), pp. 453-458

Neuendorf, Max; Gournay, Philippe; Multrus, Markus; Lecomte, Jérémie; Bessette, Bruno; Geiger, Ralf; Bayer, Stefan; Fuchs, Guillaume; Hilpert, Johannes; Rettelbach, Nikolaus
A novel scheme for low bitrate unified speech and audio coding - MPEG RM0. - In: 126th Audio Engineering Society convention 2009, (2009), pp. 1142-1154

Neuendorf, Max; Gournay, Philippe; Multrus, Markus; Lecomte, Jérémie; Bessette, Bruno; Geiger, Ralf; Bayer, Stefan; Fuchs, Guillaume; Hilpert, Johannes; Rettelbach, Nikolaus; Salami, Redwan; Schuller, Gerald; Lefebvre, Roch; Grill, Bernhard
Unified speech and audio coding scheme for high quality at low bitrates. - In: IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, ISBN 978-1-4244-2353-8, (2009), pp. 1-4

http://dx.doi.org/10.1109/ICASSP.2009.4959505
Wabnik, Stefan; Schuller, Gerald; Krämer, Ferenc
An error robust ultra low delay audio coder using an MA prediction model. - In: IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, ISBN 978-1-4244-2353-8, (2009), pp. 5-8

http://dx.doi.org/10.1109/ICASSP.2009.4959506
Arnold, Mirko; Schuller, Gerald
A parametric instrument codec for very low bitrates. - In: 125th Audio Engineering Society convention 2008, (2008), pp. 427-433

Kraemer, Ferenc; Schuller, Gerald
Graceful degradation for digital radio mondiale (DRM). - In: 125th Audio Engineering Society convention 2008, (2008), pp. 589-595

Friedrich, Tobias; Gruhne, Matthias; Schuller, Gerald
A fast feature extraction system on compressed audio data. - In: 124th Audio Engineering Society convention 2008, (2008), pp. 1383-1390

Friedrich, Tobias; Gruhne, Matthias; Schuller, Gerald
Subband conversion for feature extraction from compressed audio. - In: IEEE International Conference on Acoustics, Speech and Signal Processing, 2008, ISBN 978-1-4244-1483-3, (2008), pp. 217-220

http://dx.doi.org/10.1109/ICASSP.2008.4517585
Gruhne, Matthias; Dittmar, Christian; Gärtner, Daniel; Schuller, Gerald
An evaluation of pre-processing algorithms for rhythmic pattern analysis. - In: 125th Audio Engineering Society convention 2008, (2008), pp. 581-588

Yokotani, Yoshikazu; Geiger, Ralf; Schuller, Gerald; Oraintara, Soontorn; Rao, K. Ramamohan
Lossless audio coding using the IntMDCT and rounding error shaping. - In: IEEE transactions on audio, speech and language processing, ISSN 1558-7924, vol. 14 (2006), 6, pp. 2201-2211

http://dx.doi.org/10.1109/TASL.2006.872613
Wabnik, Stefan; Schuller, Gerald; Krämer, Ulrich; Hirschfeld, Jens
Frequency warping in low delay audio coding. - In: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, (2005), pp. III-181-III-184

https://doi.org/10.1109/ICASSP.2005.1415676
Wabnik, Stefan; Schuller, Gerald; Hirschfeld, Jens; Krämer, Ulrich
Packet loss concealment in predictive audio coding. - In: 2005 Workshop on Applications of Signal Processing to Audio and Acoustics proceedings (WASPAA), (2005), pp. 227-230

Schuller, Gerald; Kovačević, Jelena; Masson, Francois; Goyal, Vivek K.
Robust low-delay audio coding using multiple descriptions. - In: IEEE transactions on speech and audio processing, ISSN 1558-2353, vol. 13 (2005), 5, pp. 1014-1024

http://dx.doi.org/10.1109/TSA.2005.853205
Brandenburg, Karlheinz; Schuller, Gerald
Komprimierung. - In: Taschenbuch der Medieninformatik, (2005), pp. 57-77

Klier, Juliane; Schuller, Gerald; Haardt, Martin; Hennhöfer, Marko
A new approach for channel equalization without guard interval using polyphase matrices. - In: PIMRC 2005, ISBN 978-3-8007-2909-8, (2005), 5 pp. in total

Kahrs, Mark; Brandenburg, Karlheinz
Applications of digital signal processing to audio and acoustics. - Boston [et al.] : Kluwer Acad. Publ. - XXXI, 545 pp. - (Kluwer international series in engineering and computer science ; 437) ISBN 0-7923-8130-0
Includes bibliographical references and index

Brandenburg, Karlheinz
Ein Beitrag zu den Verfahren und der Qualitätsbeurteilung für hochwertige Musikcodierung. - Erlangen, 1989. - 198 pp. - Erlangen-Nürnberg, Univ., Faculty of Engineering, doctoral dissertation, 1989