Examining the mapping functions of denoising autoencoders in singing voice separation. - In: IEEE ACM transactions on audio, speech, and language processing, ISSN 2329-9304, vol. 28 (2020), pp. 266-278
Instrument-centered music transcription of solo bass guitar recordings. - In: IEEE ACM transactions on audio, speech, and language processing, ISSN 2329-9304, vol. 25 (2017), 9, pp. 1437-1446
Pitch-informed solo and accompaniment separation towards its use in music education applications. - In: EURASIP journal on advances in signal processing, ISSN 1687-6180, (2014), 23, pp. 1-19
Fast audio feature extraction from compressed audio data. - In: IEEE journal of selected topics in signal processing, ISSN 1941-0484, vol. 5 (2011), 6, pp. 1262-1271
Multirate systems and applications. - In: EURASIP journal on advances in signal processing, ISSN 1687-6180, (2007), 41658, pp. 1-3
Lossless audio coding using the IntMDCT and rounding error shaping. - In: IEEE transactions on audio, speech and language processing, ISSN 1558-7924, vol. 14 (2006), 6, pp. 2201-2211
Robust low-delay audio coding using multiple descriptions. - In: IEEE transactions on speech and audio processing, ISSN 1558-2353, vol. 13 (2005), 5, pp. 1014-1024
Perceptual audio coding using adaptive pre- and post-filters and lossless compression. - In: IEEE transactions on speech and audio processing, ISSN 1558-2353, vol. 10 (2002), 6, pp. 379-390
This paper proposes a versatile perceptual audio coding method that achieves high compression ratios and is capable of low encoding/decoding delay. It accommodates a variety of source signals (including both music and speech) with different sampling rates. It is based on separating irrelevance and redundancy reductions into independent functional units. This contrasts with traditional audio coding, where both are integrated within the same subband decomposition. The separation allows the irrelevance and redundancy reduction units to be optimized independently. For both reductions, we rely on adaptive filtering and predictive coding as much as possible to minimize the delay. A psychoacoustically controlled adaptive linear filter is used for the irrelevance reduction, and the redundancy reduction is carried out by a predictive lossless coding scheme, termed the weighted cascaded least mean squared (WCLMS) method. Experiments are carried out on a database of moderate size that contains mono signals of different sampling rates and varying nature (music, speech, or mixed). They show that the proposed WCLMS lossless coder, when applied to the pre-filtered signal, outperforms competing lossless coders in terms of compression ratio and delay. Moreover, a subjective listening test comparing the combined pre-filter/lossless coder with a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.
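As a rough illustration of the predictive lossless coding idea described in this abstract, the following Python sketch uses a single normalized LMS (NLMS) predictor with integer-rounded predictions. It is not the WCLMS method itself: the cascade of predictors and their weighted combination are not reproduced, and every name and parameter here (`NLMSPredictor`, `encode`, `decode`, `order`, `mu`, `eps`) is an assumption made for the example. The point it shows is that a backward-adaptive predictor needs no side information and no look-ahead, because the decoder can repeat the encoder's adaptation exactly.

```python
class NLMSPredictor:
    """Single-stage NLMS predictor (illustrative; not the paper's WCLMS cascade)."""

    def __init__(self, order=8, mu=0.5, eps=1e-6):
        self.w = [0.0] * order      # adaptive filter weights
        self.hist = [0.0] * order   # past samples, most recent first
        self.mu = mu                # step size (assumed value)
        self.eps = eps              # guards against division by zero

    def predict(self):
        # Round to an integer so that the residual is an integer as well.
        return int(round(sum(w * h for w, h in zip(self.w, self.hist))))

    def update(self, sample, prediction):
        # Adapt using only data the decoder also has at this point.
        err = sample - prediction
        norm = sum(h * h for h in self.hist) + self.eps
        step = self.mu * err / norm
        self.w = [w + step * h for w, h in zip(self.w, self.hist)]
        self.hist = [float(sample)] + self.hist[:-1]


def encode(samples, order=8):
    """Turn integer samples into integer prediction residuals."""
    predictor = NLMSPredictor(order)
    residuals = []
    for x in samples:
        pred = predictor.predict()
        residuals.append(x - pred)
        predictor.update(x, pred)
    return residuals


def decode(residuals, order=8):
    """Rebuild the samples by mirroring the encoder's predictor state."""
    predictor = NLMSPredictor(order)
    samples = []
    for e in residuals:
        pred = predictor.predict()
        x = e + pred
        samples.append(x)
        predictor.update(x, pred)
    return samples
```

Calling `decode(encode(samples))` on a list of integers returns the same list. In a complete coder the residuals would then be passed to an entropy coder, which this sketch leaves out.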
Modulated filter banks with arbitrary system delay: efficient implementations and the time-varying case. - In: IEEE transactions on signal processing, ISSN 1941-0476, vol. 48 (2000), 3, pp. 737-748
We present a new method for the design and implementation of modulated filter banks with perfect reconstruction. It is based on the decomposition of the analysis and synthesis polyphase matrices into a product of two different types of simple matrices, replacing the polyphase filtering part in a modulated filter bank. Special consideration is given to cosine-modulated as well as time-varying filter banks. The new structure provides several advantages. First, it allows easy control of the input-output system delay, which can be chosen in steps of a single sample at the input sampling rate, independent of the filter length. This property can be used in audio coding applications to reduce pre-echoes. Second, it results in a structure that is nearly twice as efficient as performing the polyphase filtering directly. Perfect reconstruction is a structurally inherent feature of the new formulation, even for nonlinear operations or time-varying coefficients. Hence, the structure is especially suited to the design of time-varying filter banks in which both the number of bands and the prototype filters can be changed while maintaining perfect reconstruction and critical sampling. Further, a proof of effective completeness is given, and the design of analysis and synthesis filter banks with equal magnitude responses is described. Filter design can be performed by unconstrained optimization of the matrix coefficients according to a given cost function. Design and audio-coding application examples are given to show the performance of the new filter bank.
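To make the terms "cosine-modulated", "perfect reconstruction", and "critical sampling" concrete, the sketch below implements the standard MDCT with a sine window, a well-known cosine-modulated filter bank. It is not the matrix-factorized structure proposed in the paper and it has a fixed system delay rather than an adjustable one. The function names (`sine_window`, `analysis`, `synthesis`) and the zero-padding convention are assumptions chosen for this example.

```python
import math


def sine_window(N):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def _basis(n, k, N):
    # Cosine modulation term shared by the analysis and synthesis banks.
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))


def analysis(x, N):
    """MDCT analysis: 50% overlapping frames of 2N samples -> N coefficients each.

    Assumes len(x) is a multiple of N. The signal is zero-padded by N samples
    on both sides so that every input sample is covered by two frames.
    """
    w = sine_window(N)
    padded = [0.0] * N + list(x) + [0.0] * N
    frames = []
    for start in range(0, len(padded) - N, N):
        seg = [w[n] * padded[start + n] for n in range(2 * N)]
        frames.append([sum(seg[n] * _basis(n, k, N) for n in range(2 * N))
                       for k in range(N)])
    return frames


def synthesis(frames, N):
    """Inverse MDCT with windowed overlap-add; the time-domain aliasing cancels."""
    w = sine_window(N)
    out = [0.0] * ((len(frames) + 1) * N)
    for i, X in enumerate(frames):
        for n in range(2 * N):
            y = (2.0 / N) * sum(X[k] * _basis(n, k, N) for k in range(N))
            out[i * N + n] += w[n] * y
    return out[N:-N]  # drop the padding added in analysis()
```

For an input whose length is a multiple of `N`, `synthesis(analysis(x, N), N)` reproduces `x` up to floating-point rounding. Each hop of `N` input samples yields `N` coefficients, which is what critical sampling means here; the one extra frame comes only from the padding.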
Filterbänke mit niedriger Systemverzögerungszeit und exakter Rekonstruktion: Low delay filter banks with perfect reconstruction. - In: Frequenz, ISSN 2191-6349, vol. 50 (1996), 9/10, pp. 228-236