Publications in the field

Below you will find an automated compilation of the publications of the group. For publications of the individual members of staff, please refer to their personal pages.

List of publications

Number of hits: 286
Created: Sat, 18 May 2024 23:02:49 +0200 in 0.1020 sec


Arévalo Arboleda, Stephanie; Kunert, Christian; Hartbrich, Jakob; Schneiderwind, Christian; Diao, Chenyao; Gerhardt, Christoph; Surdu, Tatiana; Weidner, Florian; Broll, Wolfgang; Werner, Stephan; Raake, Alexander
Beyond looks: a study on agent movement and audiovisual spatial coherence in augmented reality. - In: IEEE Xplore digital library, ISSN 2473-2001, (2024), pp. 502-512

The appearance of virtual humans (avatars and agents) has been widely explored in immersive environments. However, virtual humans’ movements and associated sounds in real-world interactions, particularly in Augmented Reality (AR), are yet to be explored. In this paper, we investigate the influence of three distinct movement patterns (circle, side-to-side, and standing), two rendering styles (realistic and cartoon), and two types of audio (spatial audio and non-spatial audio) on emotional responses, social presence, appearance and behavior plausibility, audiovisual coherence, and auditory plausibility. To enable that, we conducted a study (N=36) where participants observed an agent reciting a short fictional story. Our results indicate an effect of the rendering style and the type of movement on the subjective perception of the agents behaving in an AR environment. Participants reported higher levels of excitement when they observed the realistic agent moving in a circle compared to the cartoon agent or the other two movement patterns. Moreover, we found an influence of agent’s movement pattern on social presence and higher appearance and behavior plausibility for the realistic rendering style. Regarding audiovisual spatial coherence, we found an influence of rendering style and type of audio only for the cartoon agent. Additionally, the spatial audio was perceived as more plausible than non-spatial audio. Our findings suggest that aligning realistic rendering styles with realistic auditory experiences may not be necessary for 1-1 listening experiences with moving sources. However, movement patterns of agents influence excitement and social presence in passive unidirectional communication scenarios.



https://doi.org/10.1109/VR58804.2024.00071
Döring, Nicola; Mikhailova, Veronika; Brandenburg, Karlheinz; Broll, Wolfgang; Groß, Horst-Michael; Werner, Stephan; Raake, Alexander
Digital media in intergenerational communication: status quo and future scenarios for the grandparent-grandchild relationship. - In: Universal access in the information society, ISSN 1615-5297, vol. 23 (2024), 1, pp. 379-394

Communication technologies play an important role in maintaining the grandparent-grandchild (GP-GC) relationship. Based on Media Richness Theory, this study investigates the frequency of use (RQ1) and perceived quality (RQ2) of established media as well as the potential use of selected innovative media (RQ3) in GP-GC relationships with a particular focus on digital media. A cross-sectional online survey and vignette experiment were conducted in February 2021 among N = 286 university students in Germany (mean age 23 years, 57% female) who reported on the direct and mediated communication with their grandparents. In addition to face-to-face interactions, non-digital and digital established media (such as telephone, texting, video conferencing) and innovative digital media, namely augmented reality (AR)-based and social robot-based communication technologies, were covered. Face-to-face and phone communication occurred most frequently in GP-GC relationships: 85% of participants reported them taking place at least a few times per year (RQ1). Non-digital established media were associated with higher perceived communication quality than digital established media (RQ2). Innovative digital media received less favorable quality evaluations than established media. Participants expressed doubts regarding the technology competence of their grandparents, but still met innovative media with high expectations regarding improved communication quality (RQ3). Richer media, such as video conferencing or AR, do not automatically lead to better perceived communication quality, while leaner media, such as letters or text messages, can provide rich communication experiences. More research is needed to fully understand and systematically improve the utility, usability, and joy of use of different digital communication technologies employed in GP-GC relationships.



https://doi.org/10.1007/s10209-022-00957-w
Neidhardt, Annika
Data set and physical analysis: BRIRs and SRIRs for walking toward, past and behind virtual loudspeakers in two rooms. - In: AES Europe 2023, (2023), p. 677

To investigate the perceptual effects of simplified acoustic room representations in position-dynamic binaural synthesis, a set of acoustic impulse responses has been measured in a relatively dry listening laboratory and a considerably more reverberant seminar room of similar size. The same arrangement of nine listening positions at equal distances of 25 cm, forming a 2 m line for listener translation, and four different source constellations was realized in both rooms, allowing for comparison. A loudspeaker was placed in front and at the side of the translation line, facing toward it and turned by 180°, facing away from the line. Binaural room impulse responses (BRIRs) were measured with a KEMAR 45b head-and-torso simulator for each of the source-receiver constellations for a full 360° rotation with an azimuth resolution of 4°. This new data set revises and extends a previously published data set by repeating the previous measurements, additionally considering listening positions behind the directional sound sources and providing spatial room impulse responses (SRIRs) to allow for detailed physical analysis of the local physical properties at each of the listening positions for each of the source constellations. The corresponding microphone array consists of one omnidirectional measurement microphone in the center and six satellite microphones arranged on a sphere around it. This paper documents the measurement process, presents the results of the physical analysis and discusses them in relation to perceptual effects observed in previous psychoacoustic studies.



Klein, Florian; Treybig, Lukas; Schneiderwind, Christian; Werner, Stephan; Sporer, Thomas
Just noticeable reverberation difference at varying loudness levels. - In: AES Europe 2023, (2023), pp. 361-368

In order to successfully fuse virtual sound sources with the real acoustic environment, the acoustic properties of the real environment must be estimated and utilized for the synthesis of virtual sound sources. Often, just noticeable differences (JNDs) of room acoustic parameters are utilized to predict a good match between virtual and real acoustics. However, several studies in this domain have shown that existing JND values of room acoustic parameters are often not able to predict the perception of the listeners. This can have various reasons: differences in first reflection patterns are barely measurable with classical acoustic parameters; even if acoustic differences are above the JND, a plausible reproduction might still be possible; and JNDs depend on various factors (such as the sound signal) that existing studies do not cover completely. The last factor is addressed in this research paper. A three-alternative forced choice (3AFC) test was conducted at four different loudness levels (75 dB(A), 65 dB(A), 55 dB(A), and 45 dB(A)) in a reverberation time range from 0.5 s to 0.8 s. A dependency of the detectability of reverberation differences on loudness was found for the randomly interleaved presentation of loudness levels but not for sequential presentation. Individual hearing thresholds as well as expertise level significantly influence the JND of reverberation time.



Treybig, Lukas; Werner, Stephan; Klein, Florian; Amengual Garí, Sebastià V.
Robust reverberation time estimation for audio augmented reality applications. - In: AES Europe 2023, (2023), pp. 47-55

The paper presents an alternative approach for estimating reverberation time from measurements in real rooms when the requirements of the standard DIN EN ISO 3382-1/2 for the characteristics of the sound source, receiver, and measurement positions cannot be met. The main goal is to minimize the variance of the calculated reverberation times when using a directional source and receiver, or source-receiver relative positions with very small distances. For this purpose, the energy decay curve for individual octave bands is sampled in time. The estimation starts 2 ms after the direct sound. This is followed by several estimates of the RT over a 20 dB drop, starting 1 dB later with each iteration. The best fit mean of these values gives the estimated reverberation time. A comparison with the standard reverberation time estimation shows a variance reduction of 10% to 30% for binaural room impulse responses (BRIRs). The proposed method finds its application in situations where measurements can only be made at a few positions in the room and/or only in a few areas of the room. Furthermore, the method should be better suited for measurements with receivers located near or at the head of a person.
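
The procedure outlined in this abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the synthetic test signal, the number of fits (`n_fits`), and the use of a plain arithmetic mean for the "best fit mean" are all assumptions, and the octave-band filtering is left out for brevity.

```python
import math


def synthetic_ir(rt60, fs, length_s):
    """Hypothetical test signal: pure exponential decay reaching -60 dB at rt60."""
    return [10.0 ** (-3.0 * (n / fs) / rt60) for n in range(int(length_s * fs))]


def energy_decay_curve_db(ir, fs, skip_s=0.002):
    """Schroeder backward integration, starting skip_s after the direct sound."""
    # Assumption: the direct sound is the sample with the largest magnitude.
    peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
    tail = ir[peak + int(round(skip_s * fs)):]
    edc = [0.0] * len(tail)
    acc = 0.0
    for i in range(len(tail) - 1, -1, -1):
        acc += tail[i] * tail[i]
        edc[i] = acc
    total = edc[0]
    return [10.0 * math.log10(e / total) for e in edc if e > 0.0]


def fit_slope(times, levels):
    """Least-squares slope of levels over times, in dB per second."""
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(levels) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, levels))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def estimate_rt(ir, fs, drop_db=20.0, step_db=1.0, n_fits=10):
    """Average several RT estimates, each fitted over a 20 dB drop of the
    decay curve, with the fit window starting 1 dB lower per iteration."""
    edc = energy_decay_curve_db(ir, fs)
    estimates = []
    for k in range(n_fits):
        upper = -k * step_db
        lower = upper - drop_db
        idx = [i for i, level in enumerate(edc) if lower <= level <= upper]
        if len(idx) < 2:
            break
        slope = fit_slope([i / fs for i in idx], [edc[i] for i in idx])
        estimates.append(-60.0 / slope)  # extrapolate the fit to a 60 dB decay
    # Assumption: plain arithmetic mean of the individual estimates.
    return sum(estimates) / len(estimates)


fs = 8000
print(round(estimate_rt(synthetic_ir(0.5, fs, 1.5), fs), 3))
```

For the noise-free synthetic decay used here, the result should land close to the 0.5 s that was put in; measured responses would need the band filtering and noise handling that this sketch omits.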



Fischedick, Söhnke B.; Richter, Kay; Wengefeld, Tim; Seichter, Daniel; Scheidig, Andrea; Döring, Nicola; Broll, Wolfgang; Werner, Stephan; Raake, Alexander; Groß, Horst-Michael
Bridging distance with a collaborative telepresence robot for older adults - report on progress in the CO-HUMANICS project. - In: ISR Europe 2023: 56th International Symposium on Robotics, (2023), pp. 346-353

In an aging society, the social needs of older adults, such as regular interactions and independent living, are crucial for their quality of life. However, due to spatial separation from their family and friends, it is difficult to maintain social relationships. Our multidisciplinary project, CO-HUMANICS, aims to meet these needs, even over long distances, through the utilization of innovative technologies, including a robot-based system. This paper presents the first prototype of our system, designed to connect family members or friends virtually present through a mobile robot with an older adult. The system incorporates bi-directional video telephony, remote control capabilities, and enhanced visualization methods. A comparison is made with other state-of-the-art robotic approaches, focusing on remote control capabilities. We provide details about the hardware and software components, e.g., a projector-based pointing unit for collaborative telepresence to assist in everyday tasks. Our comprehensive scene representation is discussed, which utilizes 3D NDT maps, enabling advanced remote navigation features, such as autonomously driving to a specific object. Finally, insights from past evaluations and concepts for future evaluations are provided to assess the developed system.



https://ieeexplore.ieee.org/document/10363093
Burnett, Benjamin; Neidhardt, Annika; Cvetković, Zoran; Hacıhabiboğlu, Hüseyin; De Sena, Enzo
User expectation of room acoustic parameters in virtual reality environments. - In: 2023 Immersive and 3D Audio: from Architecture to Automotive (I3DA), (2023), 10 pp. in total

This paper explores how visual attributes of a VR scene affect user expectations of room reverberation. A psychoacoustic experiment was run wherein subjects wore a VR headset and adjusted two unlabelled sliders controlling the reverberation time (T60) and the acoustic room size until the reverberant response was closest to their expectation of how the room they were seeing should sound. Different visual characteristics, in particular, room type and size, surface material, and furnishing were modified to determine how these might affect their expectations of the reverberant response. Results showed that visual room size had a significant effect on both the expected T60, in agreement with previous literature, and on the expected acoustic room size. Both relations seem to be well-described by a simple sublinear power law model, which could be used, for instance, to design reverberation time (T60) and acoustic room size values that align well with listeners’ expectation for a given visual room volume. Differences in visual surface materials were found to have a statistically significant effect on the expected T60. The level of visual furnishing, on the other hand, only had a marginally significant effect on the expected T60. The results also indicate considerable subjective differences in individual expectations.



https://doi.org/10.1109/I3DA57090.2023.10289314
Schneiderwind, Christian; Richter, Maike; Merten, Nils; Neidhardt, Annika
Effects of modified late reverberation on audio-visual plausibility and externalization in AR. - In: 2023 Immersive and 3D Audio: from Architecture to Automotive (I3DA), (2023), 9 pp. in total

Binaural synthesis systems can create virtual sound sources that are indistinguishable from reality. In Augmented Reality (AR) applications, virtual sound sources need to blend in with the real environment to create plausible illusions. However, in some scenarios, it may be desirable to enhance the natural acoustic properties of the virtual content to improve speech intelligibility, alleviate listener fatigue, or achieve a specific artistic effect. Previous research has shown that deviating from the original room acoustics can degrade the quality of the auditory illusion, often referred to as the room divergence effect. This study investigates whether it is possible to modify the auditory aesthetics of a room environment without compromising the plausibility of a sound event in AR. To accomplish this, the length of the reverberation tails of measured binaural room impulse responses is modified after the mixing time to change reverberance. A listening test was conducted to evaluate the externalization and audio-visual plausibility of an exemplary AR scene for different degrees of reverberation modification. The results indicate that externalization is unaffected even with extreme modifications (such as a stretch ratio of 1.8). However, audio-visual plausibility is only maintained for moderate modifications (such as stretch ratios of 0.8 and 1.2).



https://doi.org/10.1109/I3DA57090.2023.10289186
Neidhardt, Annika
Localizability of the closest wall with a speaking avatar at increasing distances in three rooms. - In: 2023 Immersive and 3D Audio: from Architecture to Automotive (I3DA), (2023), 10 pp. in total

The presented study examines the maximum distance at which listeners can still localize the direction of a nearby wall when their own mouth is the sound source. For this investigation, oral binaural room impulse responses (OBRIRs) were measured with a KEMAR dummy head with mouth simulator at eight different distances to a wall in an anechoic chamber and two rooms with different reverberation properties. Using a headphone-based dynamic auralization, the participants had to turn until they believed they were facing the wall. In a staircase-inspired procedure, the test always started with the shortest distance of 25 cm. In case of a successful localization at least twice in three trials, the distance could be increased in intervals of 25 cm up to about 2 m. The results exhibit considerable differences in the individual performances, which is in line with results of earlier studies. At a distance of 25 cm, all participants could localize the direction of the reflecting wall. From 50 cm onward, more and more participants found it difficult to determine the correct direction. In the anechoic room, four of the 22 participants succeeded in the localization at the 2 m distance. In the reverberant rooms, the localizability decreased significantly.



https://doi.org/10.1109/I3DA57090.2023.10289620
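
The staircase-inspired procedure mentioned in the abstract above can be sketched roughly as follows. This is an illustrative reading only: the function name, the `trial` callback, the exact distance grid (25 cm to 200 cm in 25 cm steps), and the choice to always run all three trials at a distance are assumptions, not details confirmed by the paper.

```python
from typing import Callable, Iterable, Optional


def max_localizable_distance(
    trial: Callable[[int], bool],
    distances_cm: Iterable[int] = range(25, 201, 25),  # assumed grid: 25..200 cm
    trials_per_step: int = 3,
    needed: int = 2,
) -> Optional[int]:
    """Return the largest distance (in cm) passed before the first failure.

    `trial(distance_cm)` is a hypothetical callback that runs one
    localization trial and reports whether the participant succeeded.
    """
    reached = None
    for distance in distances_cm:
        successes = sum(1 for _ in range(trials_per_step) if trial(distance))
        if successes < needed:
            break  # fewer than two of three successes: stop increasing
        reached = distance
    return reached


# Hypothetical participant who succeeds reliably up to 1 m.
print(max_localizable_distance(lambda d: d <= 100))
```

Returning `None` when even the shortest distance fails keeps "no successful localization" distinct from any real distance value.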
Stolz, Georg; Klein, Florian; Werner, Stephan; Treybig, Lukas; Bley, Andreas; Martin, Christian
Discussion of acoustic and perceptual optimization methods for measuring spatial room impulse responses with a mobile robotic platform. - In: 2023 Immersive and 3D Audio: from Architecture to Automotive (I3DA), (2023), 7 pp. in total

In the field of Auditory Augmented Reality (AAR), one aim is to provide a listening experience that is as close as possible to a real scenario. Measured Spatial Room Impulse Responses (SRIRs) describe the acoustics of a room and can serve as a reference for acoustic simulations or parametrization of room acoustics. In previous works, a measurement system for SRIRs using a mobile robotic platform was introduced. The system consists of a commercially available self-driving platform on which a microphone array is mounted, while the sound sources are distributed at fixed positions in the room. The system is able to conduct high spatial resolution measurements of SRIRs in a uniform grid. In applications where time is limited and/or the area to discover is large, however, a high-resolution measurement is not always feasible. Therefore, the goal of this contribution is to compare different approaches for optimizing the measurement grid. One approach is to use mathematical optimization on acoustic parameters derived from a small set of initial measurements to determine new measurement positions in an iterative manner. Another approach is to optimize the measurement grid with respect to human auditory perception, incorporating e.g. just-noticeable differences of distance and localization perception. The results show that both approaches can achieve significant reductions in the number of measurements required for an adequate acoustic spatial reproduction, with different trade-offs depending on the application scenario and the available prior information.



https://doi.org/10.1109/I3DA57090.2023.10289338