Publications of the Audiovisual Technology Group

The following list (automatically generated by the university library) contains publications from 2016 onwards. Publications up to and including 2015 can be found on a separate page.

Note: To search all publications, select "Show all" and then use the browser search (Ctrl+F).



Göring, Steve; Ramachandra Rao, Rakesh Rao; Merten, Rasmus; Raake, Alexander
Analysis of appeal for realistic AI-generated photos. - In: IEEE access, ISSN 2169-3536, vol. 11 (2023), pp. 38999-39012

AI-generated images have gained in popularity in recent years due to improvements and developments in the field of artificial intelligence. This has led to several new AI generators, which may produce realistic, funny, and impressive images using a simple text prompt. DALL-E-2, Midjourney, and Craiyon are a few examples of such approaches. In general, the quality, realism, and appeal of the images vary depending on the approach used. Therefore, in this paper, we analyze to what extent such AI-generated images are realistic or of high appeal from a more photographic point of view and how users perceive them. To evaluate the appeal of several state-of-the-art AI generators, we develop a dataset consisting of 27 different text prompts, some of them based on the DrawBench prompts. Using these prompts, we generated a total of 135 images with five different AI text-to-image generators. These images, in combination with real photos, form the basis of our evaluation. The evaluation is based on an online subjective study, and the results are compared with state-of-the-art image quality models and features. The results indicate that some of the included generators are able to produce realistic and highly appealing images. However, this depends to a large extent on the approach and the text prompt. The dataset and evaluation of this paper are made publicly available for reproducibility, following an Open Science approach.



https://doi.org/10.1109/ACCESS.2023.3267968
Weidner, Florian; Böttcher, Gerd; Arévalo Arboleda, Stephanie; Diao, Chenyao; Sinani, Luljeta; Kunert, Christian; Gerhardt, Christoph; Broll, Wolfgang; Raake, Alexander
A systematic review on the visualization of avatars and agents in AR & VR displayed using head-mounted displays. - In: IEEE transactions on visualization and computer graphics, ISSN 1941-0506, vol. 29 (2023), 5, pp. 2596-2606

Augmented Reality (AR) and Virtual Reality (VR) are pushing from the labs towards consumers, especially with social applications. These applications require visual representations of humans and intelligent entities. However, displaying and animating photo-realistic models comes with a high technical cost while low-fidelity representations may evoke eeriness and overall could degrade an experience. Thus, it is important to carefully select what kind of avatar to display. This article investigates the effects of rendering style and visible body parts in AR and VR by adopting a systematic literature review. We analyzed 72 papers that compare various avatar representations. Our analysis includes an outline of the research published between 2015 and 2022 on the topic of avatars and agents in AR and VR displayed using head-mounted displays, covering aspects like visible body parts (e.g., hands only, hands and head, full-body) and rendering style (e.g., abstract, cartoon, realistic); an overview of collected objective and subjective measures (e.g., task performance, presence, user experience, body ownership); and a classification of tasks where avatars and agents were used into task domains (physical activity, hand interaction, communication, game-like scenarios, and education/training). We discuss and synthesize our results within the context of today's AR and VR ecosystem, provide guidelines for practitioners, and finally identify and present promising research opportunities to encourage future research of avatars and agents in AR/VR environments.



https://doi.org/10.1109/TVCG.2023.3247072
Stoll, Eckhard; Breide, Stephan; Göring, Steve; Raake, Alexander
Modeling of an automatic vision mixer with human characteristics for multi-camera theater recordings. - In: IEEE access, ISSN 2169-3536, vol. 11 (2023), pp. 18714-18726

A production process using high-resolution cameras can be used for multi-camera recordings of theater performances or other stage performances. One approach to automating the generation of suitable image cuts is to focus on speaker changes, so that the person who is speaking is shown in the generated cut. However, these image cuts can appear static and robotic if they are set too precisely. Therefore, the characteristics and habits of professional vision mixers (persons who operate the vision mixing desk) during the editing process are investigated in more detail in order to incorporate them into an automation process. The characteristic features of five different vision mixers are examined, who worked under almost identical recording conditions for theatrical cuts in TV productions. The cuts are examined with regard to their temporal position in relation to the pauses in speech that occur during speaker changes on stage. It is shown that different professional vision mixers individually place their cuts before, within, or after the pauses in speech, with measured differences averaging up to 0.3 seconds. From the analysis of the image cuts, an approach for a model is developed in which the individual characteristics of a vision mixer can be set. With the help of this novel model, otherwise exact and robotic cuts can be given a more human appearance when image cuts are automated.



https://doi.org/10.1109/ACCESS.2023.3245804
Leist, Larissa; Reimers, Carolin; Fremerey, Stephan; Fels, Janina; Raake, Alexander; Klatte, Maria
Effects of binaural classroom noise scenarios on primary school children's speech perception and listening comprehension. - In: 51st International Congress and Exposition on Noise Control Engineering (INTER-NOISE 2022), (2023), pp. 3214-3220

Singla, Ashutosh
Assessment of visual quality and simulator sickness for omnidirectional videos. - Ilmenau, 2023. - viii, 186 pages
Technische Universität Ilmenau, Dissertation 2022

One use case for current VR technology with head-mounted displays (HMDs) is 360˚ video. Valid assessment of the Quality of Experience (QoE) of 360˚ videos requires subjective tests. Such assessment tests are time-consuming and require a well-designed protocol. International recommendations such as ITU-T Rec. P.910 and ITU-R Rec. BT.500-13 exist, which provide guidelines for assessing the video quality of 2D videos on 2D displays with human test participants. Prior to this work, however, no such standard recommendation existed for 360˚ videos. It was therefore necessary to develop a set of guidelines for investigating visual quality and QoE assessment for 360˚ videos. This thesis presents extensive research on the quality and QoE of 360˚ videos as perceived by users with HMDs, together with a set of test protocols for systematic assessment. First, conventional subjective test methods such as Absolute Category Rating (ACR) and the Double Stimulus Impairment Scale (DSIS) were used to assess video quality, alongside the Modified ACR (M-ACR) method newly proposed in this work. Building on the reliability and general applicability of the method across different tests, this thesis presents a methodological framework for assessing the quality of 360˚ videos. Second, the increased degree of immersion of 360˚ videos introduces simulator sickness as a further QoE constituent. This thesis therefore analyses simulator sickness in order to investigate the impact of different influencing factors. The findings on simulator sickness in the context of 360˚ videos contribute to a better understanding of this particular VR use case. In addition, a simplified Simulator Sickness Questionnaire (SSQ) for the self-assessment of symptoms relevant to 360˚ videos is proposed, by comparing different questionnaire versions with the state-of-the-art variants Cybersickness Questionnaire and Virtual Reality Symptom Questionnaire as well as the existing SSQ scales. The results show that the simplified version of the SSQ focuses on the symptoms relevant to studies with 360˚ videos. It is shown that it can be used effectively, with the reduced set of scales enabling more efficient and thus more extensive testing.



De Moor, Katrien; Fiedler, Markus; Raake, Alexander; Jhunjhunwala, Ashok; Gnanasekaran, Vahiny; Subramanian, Sruti; Zinner, Thomas
Towards the design and evaluation of more sustainable multimedia experiences: which role can QoE research play? - In: ACM SIGMultimedia records, ISSN 1947-4598, vol. 14 (2022), 3, 4, p. 1

In this column, we reflect on the environmental impact and broader sustainability implications of resource-demanding digital applications and services such as video streaming, VR/AR/XR and videoconferencing. We put emphasis not only on the experiences and use cases they enable but also on the "cost" of always striving for high Quality of Experience (QoE) and better user experiences. Starting by sketching the broader context, our aim is to raise awareness about the role that QoE research can play in the context of several of the United Nations' Sustainable Development Goals (SDGs), either directly (e.g., SDG 13 "climate action") or more indirectly (e.g., SDG 3 "good health and well-being" and SDG 12 "responsible consumption and production").



https://doi.org/10.1145/3630658.3630662
Reimers, Carolin; Loh, Karin; Leist, Larissa; Fremerey, Stephan; Raake, Alexander; Klatte, Maria; Fels, Janina
Investigating different cueing methods for auditory selective attention in virtual reality. - Berlin : Deutsche Gesellschaft für Akustik e.V. - 1 online resource (4 pages). Online edition: DAGA 2022 : 48. Jahrestagung für Akustik, 21-24 March 2022, Stuttgart and online, pages/article no.: 1173-1176

An audio-only paradigm for investigating auditory selective attention (ASA) has previously been transferred into a classroom-type audio-visual virtual reality (VR) environment. Due to the paradigm structure, the participants were only focusing on a specific area of the VR environment during the entire experiment. In a more realistic scenario, participants are expected to interact with the scene. Therefore, this study investigates new cueing methods that may reduce the focus on one point in the virtual world and allow for further development of a close-to-real life scenario.



https://doi.org/10.18154/RWTH-2022-04388
Breuer, Carolin; Loh, Karin; Leist, Larissa; Fremerey, Stephan; Raake, Alexander; Klatte, Maria; Fels, Janina
Examining the auditory selective attention switch in a child-suited virtual reality classroom environment. - In: International journal of environmental research and public health, ISSN 1660-4601, vol. 19 (2022), 24, 16569, pp. 1-20

The ability to focus one's attention in different acoustical environments has been thoroughly investigated in the past. However, recent technological advancements have made it possible to perform laboratory experiments in a more realistic manner. In order to investigate close-to-real-life scenarios, a classroom was modeled in virtual reality (VR) and an established paradigm to investigate the auditory selective attention (ASA) switch was translated from an audio-only version into an audiovisual VR setting. The new paradigm was validated with adult participants in a listening experiment, and the results were compared to the previous version. Apart from expected effects such as switching costs and auditory congruency effects, which reflect the robustness of the overall paradigm, a difference in error rates between the audio-only and the VR group was found, suggesting enhanced attention in the new VR setting, which is consistent with recent studies. Overall, the results suggest that the presented VR paradigm can be used and further developed to investigate the voluntary auditory selective attention switch in a close-to-real-life classroom scenario.



https://doi.org/10.3390/ijerph192416569
Leist, Larissa; Breuer, Carolin; Yadav, Manuj; Fremerey, Stephan; Fels, Janina; Raake, Alexander; Lachmann, Thomas; Schlittmeier, Sabine; Klatte, Maria
Differential effects of task-irrelevant monaural and binaural classroom scenarios on children's and adults' speech perception, listening comprehension, and visual-verbal short-term memory. - In: International journal of environmental research and public health, ISSN 1660-4601, vol. 19 (2022), 23, 15998, pp. 1-17

Most studies investigating the effects of environmental noise on children’s cognitive performance examine the impact of monaural noise (i.e., same signal to both ears), oversimplifying multiple aspects of binaural hearing (i.e., adequately reproducing interaural differences and spatial information). In the current study, the effects of a realistic classroom-noise scenario presented either monaurally or binaurally on tasks requiring processing of auditory and visually presented information were analyzed in children and adults. In Experiment 1, across age groups, word identification was more impaired by monaural than by binaural classroom noise, whereas listening comprehension (acting out oral instructions) was equally impaired in both noise conditions. In both tasks, children were more affected than adults. Disturbance ratings were unrelated to the actual performance decrements. Experiment 2 revealed detrimental effects of classroom noise on short-term memory (serial recall of words presented pictorially), which did not differ with age or presentation mode (monaural vs. binaural). The present results add to the evidence for detrimental effects of noise on speech perception and cognitive performance, and their interactions with age, using a realistic classroom-noise scenario. Binaural simulations of real-world auditory environments can improve the external validity of studies on the impact of noise on children’s and adults’ learning.



https://doi.org/10.3390/ijerph192315998
Robotham, Thomas; Singla, Ashutosh; Rummukainen, Olli S.; Raake, Alexander; Habets, Emanuel A.P.
Audiovisual database with 360˚ video and higher-order Ambisonics audio for perception, cognition, behavior, and QoE evaluation research. - In: 2022 14th International Conference on Quality of Multimedia Experience (QoMEX), (2022), 6 pages

Research into multi-modal perception, human cognition, behavior, and attention can benefit from high-fidelity content that may recreate real-life-like scenes when rendered on head-mounted displays. Moreover, aspects of audiovisual perception, cognitive processes, and behavior may complement questionnaire-based Quality of Experience (QoE) evaluation of interactive virtual environments. Currently, there is a lack of high-quality open-source audiovisual databases that can be used to evaluate such aspects or systems capable of reproducing high-quality content. With this paper, we provide a publicly available audiovisual database consisting of twelve scenes capturing real-life nature and urban environments with a video resolution of 7680×3840 at 60 frames-per-second and with 4th-order Ambisonics audio. These 360˚ video sequences, with an average duration of 60 seconds, represent real-life settings for systematically evaluating various dimensions of uni-/multi-modal perception, cognition, behavior, and QoE. The paper provides details of the scene requirements, recording approach, and scene descriptions. The database provides high-quality reference material with a balanced focus on auditory and visual sensory information. The database will be continuously updated with additional scenes and further metadata such as human ratings and saliency information.



https://doi.org/10.1109/QoMEX55416.2022.9900893