Augmented/mixed reality audio for hearables: sensing, control, and rendering. - In: IEEE signal processing magazine, ISSN 1558-0792, vol. 39 (2022), no. 3, pp. 63-89
Augmented or mixed reality (AR/MR) is emerging as one of the key technologies in the future of computing. Audio cues are critical for maintaining a high degree of realism, social connection, and spatial awareness in various AR/MR applications, such as education and training, gaming, remote work, and virtual social gatherings that transport the user to an alternate world called the metaverse. Motivated by the wide variety of AR/MR listening experiences delivered over hearables, this article systematically reviews the integration of fundamental and advanced signal processing techniques for AR/MR audio to equip researchers and engineers in the signal processing community for the next wave of AR/MR.
Comparing the effect of different open headphone models on the perception of a real sound source. - In: 150th Audio Engineering Society Convention 2021, (2021), pp. 389-398
Creation of auditory augmented reality using a position-dynamic binaural synthesis system - technical components, psychoacoustic needs, and perceptual evaluation. - In: Applied Sciences, ISSN 2076-3417, vol. 11 (2021), no. 3, art. 1150, 20 pp. in total
For spatial audio reproduction in the context of augmented reality, a position-dynamic binaural synthesis system can be used to synthesize the ear signals for a moving listener. The goal is the fusion of the auditory perception of the virtual audio objects with the real listening environment. Such a system has several components, each of which helps to enable a plausible auditory simulation. For each possible position of the listener in the room, a set of binaural room impulse responses (BRIRs) congruent with the expected auditory environment is required to avoid room divergence effects. Adequate and efficient approaches synthesize new BRIRs from very few measurements of the listening room. The required spatial resolution of the BRIR positions can be estimated from spatial auditory perception thresholds. Retrieving and processing the tracking data of the listener's head pose and position, as well as convolving BRIRs with an audio signal, must be done in real time. This contribution presents work done by the authors, including several technical components of such a system in detail. It shows how the individual components are affected by psychoacoustics. Furthermore, the paper discusses the perceptual effects by means of listening tests demonstrating the appropriateness of the approaches.
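The core rendering step described in this abstract (choose the BRIR that matches the tracked position and head orientation, then convolve it with the audio signal) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nearest_brir` and `render_block`, the grid spacing, and the random placeholder BRIRs are all assumptions made for illustration only.

```python
import numpy as np

def nearest_brir(grid_positions, grid_azimuths, position, azimuth_deg):
    """Return (position index, azimuth index) of the closest grid point."""
    pos_idx = int(np.argmin(np.abs(grid_positions - position)))
    # Wrap the azimuth difference into [-180, 180) so 358 deg is close to 0 deg.
    diff = (grid_azimuths - azimuth_deg + 180.0) % 360.0 - 180.0
    az_idx = int(np.argmin(np.abs(diff)))
    return pos_idx, az_idx

def render_block(audio, brir):
    """Convolve a mono signal with a two-channel BRIR of shape (taps, 2)."""
    left = np.convolve(audio, brir[:, 0])
    right = np.convolve(audio, brir[:, 1])
    return np.stack([left, right], axis=1)

# Hypothetical grid: positions along a line (metres) and head azimuths (degrees).
grid_positions = np.arange(0.0, 2.0 + 1e-9, 0.25)
grid_azimuths = np.arange(0.0, 360.0, 4.0)

# Placeholder BRIRs (random noise), indexed [position, azimuth, tap, ear].
rng = np.random.default_rng(0)
brirs = rng.standard_normal((len(grid_positions), len(grid_azimuths), 256, 2))

# Tracked listener state (assumed values) and one block of mono audio.
pos_idx, az_idx = nearest_brir(grid_positions, grid_azimuths, 0.6, 358.5)
ears = render_block(rng.standard_normal(1024), brirs[pos_idx, az_idx])
```

A real-time system would typically use block-wise (partitioned) fast convolution and crossfade between filters when the selected BRIR changes; the sketch omits both for brevity.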
Minimum BRIR grid resolution for interactive position changes in dynamic binaural synthesis. - In: 148th Audio Engineering Society International Convention 2020, (2020), pp. 660-669
Creating auditory illusions with binaural technology. - In: The technology of binaural understanding, (2020), pp. 623-663
It is pointed out that beyond reproducing the physically correct sound pressure at the eardrums, more effects play a significant role in the quality of the auditory illusion. In some cases, these can dominate perception and even overcome physical deviations. Perceptual effects like the room-divergence effect, additional visual influences, personalization, pose and position tracking as well as adaptation processes are discussed. These effects are described individually, and the interconnections between them are highlighted. With the results from experiments performed by the authors, the perceptual effects can be quantified. Furthermore, concepts are proposed to optimize reproduction systems with regard to those effects. One example could be a system that adapts to varying listening situations as well as individual listening habits, experience and preference.
Physical and perceptual differences of selected approaches to realize an echolocation scenario in room acoustical auralizations. - In: Proceedings of the International Symposium on Room Acoustics, (2019), p. 237
Perceptual differences of position dependent room acoustics in a small conference room. - In: Proceedings of the International Symposium on Room Acoustics, (2019), pp. 499-506
Perceived quality and spatial impression of room reverberation in VR reproduction from measured images and acoustics. - In: Proceedings of the 23rd International Congress on Acoustics, (2019), pp. 3361-3368
Perceptual aspects in spatial audio processing. - In: Proceedings of the 23rd International Congress on Acoustics, (2019), pp. 3354-3360
Spatial audio processing includes recording, modification, and rendering of multichannel audio. In all these fields there is a choice between a physical representation and perceptual approaches that try to achieve a target perceived audio quality. Classical microphone techniques on the one hand, and wave field synthesis, higher order ambisonics, or certain methods of binaural rendering for headphone reproduction on the other hand, target a good physical representation of sound. As is known today, especially in the case of sound reproduction, a faithful physical recreation of the sound waveforms ("correct signal at the eardrums") is neither necessary nor does it allow a fully authentic or even plausible reproduction of sound. Twenty years ago, MPEG-4 standardized different modes for perception-based versus physics-based reproduction (called "Perceptual approach to modify natural source" and "Acoustic properties for physical based audio rendering"). In spatial rendering today, the perceptual approach is increasingly used in state-of-the-art systems. We give some examples of such rendering. The same distinction between physics-based and psychoacoustics-based rendering (including cognitive effects) is used today for room simulation and artificial reverb systems. Perceptual aspects are at the heart of audio signal processing today.
Data set: BRIRs for position-dynamic binaural synthesis measured in two rooms. - In: Audio for virtual, augmented and mixed realities, (2019), pp. 165-169
Binaural room impulse responses were measured with a KEMAR 45BA head-and-torso simulator. For the first data set, it was placed at different positions along a line with a length of 2 m, with a positional resolution of 25 cm and an azimuth resolution of 4°. Two source positions were considered in the setup, one in front of the line and one at the side. The same arrangement of source and receiver positions was realized in two different rooms, a quite dry listening laboratory and a quite reverberant seminar room. For the second data set, BRIRs and omnidirectional RIRs were measured along a translation line with a length of 7.5 m through the same seminar room. The data sets are valuable for realizing, testing, and studying dynamic binaural walk-through scenarios in the two different rooms.