Publications of the Department of Audiovisual Technology

The following list (automatically generated by the University Library) contains the publications from 2016 onward. Publications up to 2015 can be found on a separate page.


Results: 165


Göring, Steve; Raake, Alexander
Image appeal revisited: analysis, new dataset, and prediction models. - In: IEEE access, ISSN 2169-3536, Bd. 11 (2023), S. 69563-69585

More and more photographic images are uploaded to social media platforms such as Instagram, Flickr, or Facebook every day. At the same time, attention to and consumption of such images is high, with views and likes being success factors for users and driving forces for social media algorithms. Here, "liking" can be assumed to be driven by image appeal as well as by further factors, such as who is posting the image and what it may show and reveal about that person. Evaluating the appeal of such images in the context of social media platforms is therefore of high research interest. Such an appeal evaluation may help to improve image quality or could be used as an additional filter criterion to select good images. To analyze image appeal, various datasets have been established over the past years. However, not all datasets contain high-resolution images, are up to date, or include additional data, such as meta-data or social-media-type data such as likes and views. We therefore created our own dataset, "AVT-ImageAppeal-Dataset", which includes images from different photo-sharing platforms. The dataset also includes a subset of other state-of-the-art datasets and is extended by social-media-type data, meta-data, and additional images. In this paper, we describe the dataset and a series of laboratory and crowd tests we conducted to evaluate image appeal. These tests indicate that showing likes and views alongside the images has only a small influence on the ratings compared to not showing them, and that the appeal ratings are only weakly correlated with likes and views. Furthermore, it is shown that lab and crowd tests yield highly similar appeal ratings. In addition to the dataset, we also describe various machine learning models for the prediction of image appeal that use only the photo itself as input. The models perform similarly to or slightly better than state-of-the-art models. The evaluation indicates that there is still room for improvement in image appeal prediction and that further aspects, such as the presentation context, could be evaluated.
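For illustration, a photo-only appeal predictor of the kind described above could be built as a pretrained CNN backbone with a small regression head. The following sketch shows one plausible setup; the backbone choice, head size, and preprocessing are illustrative assumptions, not the authors' actual model:

```python
# Minimal sketch of a photo-only appeal predictor (illustrative, not the
# paper's model): a pretrained ResNet-50 backbone with a small regression
# head that maps an image to a scalar appeal score.
import torch
import torch.nn as nn
from torchvision import models, transforms

class AppealPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        backbone.fc = nn.Identity()          # keep the 2048-d feature vector
        self.backbone = backbone
        self.head = nn.Sequential(           # regression head -> appeal score
            nn.Linear(2048, 256), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x):
        return self.head(self.backbone(x)).squeeze(-1)

# Standard ImageNet preprocessing for the backbone above.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```

Trained with a regression loss against the subjective appeal ratings, such a model needs nothing but the photo at inference time, matching the photo-only setting described in the abstract.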



https://doi.org/10.1109/ACCESS.2023.3292588
Melnyk, Sergiy; Zhou, Qiuheng; Schotten, Hans D.; Galkow-Schneider, Mandy; Friese, Ingo; Pfandzelter, Tobias; Bermbach, David; Bassbouss, Louay; Zoubarev, Alexander; Neparidze, Andy; Kritzner, Arndt; Zschau, Enrico; Dhara, Prasenjit; Göring, Steve; Menz, William; Raake, Alexander; Rüther-Kindel, Wolfgang; Quaeck, Fabian; Stuckert, Nick; Vilter, Robert
6G NeXt - toward 6G split computing network applications: use cases and architecture. - In: Mobilkommunikation, (2023), S. 126-131

Göring, Steve; Ramachandra Rao, Rakesh Rao; Raake, Alexander
Quality assessment of higher resolution images and videos with remote testing. - In: Quality and user experience, ISSN 2366-0147, Bd. 8 (2023), 1, 2, S. 1-26

In many research fields, human-annotated data plays an important role, as it is used to accomplish a multitude of tasks. One such example is the field of multimedia quality assessment, where subjective annotations can be used to train or evaluate quality prediction models. Lab-based tests are one approach to obtaining such quality annotations. They are usually performed in well-defined and controlled environments to ensure high reliability. However, this high reliability comes at the cost of higher time consumption and expense. To mitigate this, crowd or online tests can be used. Online tests usually cover a wider range of end devices, environmental conditions, and participants, which may have an impact on the ratings. To verify whether such online tests can be used for visual quality assessment, we designed three online tests. These online tests are based on previously conducted lab tests, which enables a comparison of the results of both test paradigms. Our focus is on the quality assessment of high-resolution images and videos. The online tests use AVrate Voyager, a publicly accessible framework for online tests. To transform the lab tests into online tests, dedicated adaptations of the test methodologies are required. The considered modifications are, for example, patch-based or centre cropping of the images and videos, or random sub-sampling of the stimuli to be rated. Based on an analysis of the test results in terms of correlation and SOS analysis, it is shown that online tests can be used as a reliable replacement for lab tests, albeit with some limitations. These limitations relate to, e.g., the lack of appropriate display devices and the limitations of web technologies and modern browsers with respect to support for different video codecs and formats.
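The SOS analysis mentioned above commonly builds on the SOS hypothesis of Hoßfeld et al., which, assuming a 5-point rating scale, relates the variance of the opinion scores for a stimulus to its MOS through a single test-dependent parameter a; comparing the a values fitted to the lab and online tests is one way to judge how similar the two test paradigms behave:

```latex
% SOS hypothesis (Hoßfeld et al.): on a 5-point ACR scale, the squared
% standard deviation of opinion scores is a quadratic function of the MOS x,
% with a single test-dependent parameter a.
\[
  \mathrm{SOS}^2(x) = a \left( -x^2 + 6x - 5 \right),
  \qquad 1 \le x \le 5,\; 0 \le a \le 1
\]
```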



https://doi.org/10.1007/s41233-023-00055-6
Göring, Steve; Ramachandra Rao, Rakesh Rao; Merten, Rasmus; Raake, Alexander
Analysis of appeal for realistic AI-generated photos. - In: IEEE access, ISSN 2169-3536, Bd. 11 (2023), S. 38999-39012

AI-generated images have gained popularity in recent years due to improvements and developments in the field of artificial intelligence. This has led to several new AI generators, which can produce realistic, funny, and impressive images from a simple text prompt. DALL-E-2, Midjourney, and Craiyon are a few examples of such approaches. In general, the quality, realism, and appeal of the images vary depending on the approach used. Therefore, in this paper, we analyze to what extent such AI-generated images are realistic or of high appeal from a more photographic point of view, and how users perceive them. To evaluate the appeal of several state-of-the-art AI generators, we developed a dataset consisting of 27 different text prompts, some of them based on the DrawBench prompts. Using these prompts, we generated a total of 135 images with five different AI text-to-image generators. These images, in combination with real photos, form the basis of our evaluation. The evaluation is based on an online subjective study, and the results are compared with state-of-the-art image quality models and features. The results indicate that some of the included generators are able to produce realistic and highly appealing images; however, this depends to a large extent on the approach and the text prompt. The dataset and evaluation of this paper are made publicly available for reproducibility, following an Open Science approach.
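As a rough illustration of how subjective ratings are typically compared with objective image quality models in evaluations like this, the following sketch computes the usual Pearson (PLCC) and Spearman (SROCC) correlations per image set. The data here is synthetic and the function names are hypothetical; this is not the paper's exact evaluation procedure:

```python
# Illustrative sketch (not the paper's exact evaluation): compare per-image
# mean appeal ratings from a subjective study against the scores of an
# objective quality/appeal model using linear and rank correlation.
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_model(subjective_mos, model_scores):
    """subjective_mos, model_scores: 1-D arrays, one entry per image."""
    plcc, _ = pearsonr(subjective_mos, model_scores)    # linear agreement
    srocc, _ = spearmanr(subjective_mos, model_scores)  # rank agreement
    return {"PLCC": plcc, "SROCC": srocc}

# Synthetic toy data sized to the 135 generated images in the paper:
rng = np.random.default_rng(0)
mos = rng.uniform(1, 5, 135)
pred = mos + rng.normal(0, 0.5, 135)  # a model that roughly tracks the MOS
print(evaluate_model(mos, pred))
```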



https://doi.org/10.1109/ACCESS.2023.3267968
Weidner, Florian; Böttcher, Gerd; Arévalo Arboleda, Stephanie; Diao, Chenyao; Sinani, Luljeta; Kunert, Christian; Gerhardt, Christoph; Broll, Wolfgang; Raake, Alexander
A systematic review on the visualization of avatars and agents in AR & VR displayed using head-mounted displays. - In: IEEE transactions on visualization and computer graphics, ISSN 1941-0506, Bd. 29 (2023), 5, S. 2596-2606

Augmented Reality (AR) and Virtual Reality (VR) are pushing from the labs towards consumers, especially with social applications. These applications require visual representations of humans and intelligent entities. However, displaying and animating photo-realistic models comes at a high technical cost, while low-fidelity representations may evoke eeriness and could degrade the overall experience. Thus, it is important to carefully select what kind of avatar to display. This article investigates the effects of rendering style and visible body parts in AR and VR through a systematic literature review. We analyzed 72 papers that compare various avatar representations. Our analysis includes an outline of the research published between 2015 and 2022 on avatars and agents in AR and VR displayed using head-mounted displays, covering aspects such as visible body parts (e.g., hands only, hands and head, full body) and rendering style (e.g., abstract, cartoon, realistic); an overview of collected objective and subjective measures (e.g., task performance, presence, user experience, body ownership); and a classification of the tasks in which avatars and agents were used into task domains (physical activity, hand interaction, communication, game-like scenarios, and education/training). We discuss and synthesize our results within the context of today's AR and VR ecosystem, provide guidelines for practitioners, and finally identify and present promising research opportunities to encourage future research on avatars and agents in AR/VR environments.



https://doi.org/10.1109/TVCG.2023.3247072
Stoll, Eckhard; Breide, Stephan; Göring, Steve; Raake, Alexander
Modeling of an automatic vision mixer with human characteristics for multi-camera theater recordings. - In: IEEE access, ISSN 2169-3536, Bd. 11 (2023), S. 18714-18726

High-resolution cameras can be used in a production process for multi-camera recordings of theater or other stage performances. One approach to automating the generation of suitable image cuts is to focus on speaker changes, so that the person who is speaking is shown in the generated cut. However, such image cuts can appear static and robotic if they are timed too precisely. Therefore, the characteristics and habits of professional vision mixers (the persons who operate the vision mixing desk) during the editing process are investigated in more detail, in order to incorporate them into an automation process. The characteristic features of five different vision mixers, who worked under almost identical recording conditions on theater cuts for TV productions, are examined. The cuts are examined with regard to their temporal position relative to the pauses in speech that occur during speaker changes on stage. It is shown that different professional vision mixers individually place their cuts before, within, or after the pauses in speech, with average differences of up to 0.3 seconds. From the analysis of the image cuts, an approach for a model is developed in which the individual characteristics of a vision mixer can be set. With the help of this novel model, automated image cuts that would otherwise appear exact and robotic can be given a more human appearance.
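A model in which a vision mixer's individual characteristics "can be set" suggests a small set of per-mixer timing parameters. The following sketch is a hypothetical reading of such a model; the parameter names and values are illustrative assumptions, sized to match the ~0.3 s average differences reported above:

```python
# Hypothetical sketch of a parametric cut-timing model (not the paper's
# actual implementation): each vision mixer is characterised by a personal
# mean offset of the cut relative to the start of a speech pause, plus
# some natural jitter, so automated cuts lose their robotic precision.
import random
from dataclasses import dataclass

@dataclass
class VisionMixerProfile:
    mean_offset_s: float  # e.g. -0.15 = cuts ~0.15 s before the pause starts
    jitter_s: float       # spread of the individual cut placement

    def cut_time(self, pause_start_s: float) -> float:
        """Return the cut timestamp for a speech pause starting at pause_start_s."""
        return pause_start_s + random.gauss(self.mean_offset_s, self.jitter_s)

# Two hypothetical mixers whose average cut placement differs by ~0.3 s,
# matching the magnitude of the differences reported in the paper.
early_cutter = VisionMixerProfile(mean_offset_s=-0.15, jitter_s=0.05)
late_cutter = VisionMixerProfile(mean_offset_s=0.15, jitter_s=0.05)
print(early_cutter.cut_time(12.0), late_cutter.cut_time(12.0))
```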



https://doi.org/10.1109/ACCESS.2023.3245804
Leist, Larissa; Reimers, Carolin; Fremerey, Stephan; Fels, Janina; Raake, Alexander; Klatte, Maria
Effects of binaural classroom noise scenarios on primary school children's speech perception and listening comprehension. - In: 51st International Congress and Exposition on Noise Control Engineering (INTER-NOISE 2022), (2023), S. 3214-3220

Singla, Ashutosh
Assessment of visual quality and simulator sickness for omnidirectional videos. - Ilmenau, 2023. - viii, 186 Seiten
Technische Universität Ilmenau, Dissertation 2022

One use case for current VR technology with head-mounted displays (HMDs) is 360° video. Valid assessment of the Quality of Experience (QoE) of 360° videos requires subjective tests. Such assessment tests are time-consuming and require a well-designed protocol. International recommendations such as ITU-T Rec. P.910 and ITU-R Rec. BT.500-13 exist, providing guidelines for assessing the video quality of 2D videos on 2D displays with human participants. Until this work, however, no such standard recommendation existed for 360° videos. It was therefore necessary to develop a set of guidelines for investigating visual quality and QoE assessment for 360° videos. This thesis presents extensive research on the quality and QoE of 360° videos as perceived by users with HMDs, together with a set of test protocols for systematic assessment. First, conventional subjective test methods such as Absolute Category Rating (ACR) and the Double Stimulus Impairment Scale (DSIS) were used for video quality assessment, alongside the Modified ACR (M-ACR) method newly proposed in this work. Building on the reliability and general applicability of the method across different tests, this thesis presents a methodological framework for assessing the quality of 360° videos. Second, the increased degree of immersion of 360° videos brings with it the problem of simulator sickness as a further QoE component. This thesis therefore analyses simulator sickness in order to investigate the effects of different influencing factors. The insights gained on simulator sickness in the context of 360° videos contribute to a better understanding of this particular VR use case. Furthermore, a simplified Simulator Sickness Questionnaire (SSQ) for the self-assessment of symptoms relevant to 360° videos is proposed by comparing different questionnaire versions with the state-of-the-art variants Cybersickness Questionnaire and Virtual Reality Symptom Questionnaire as well as with the existing SSQ scales. The results show that the simplified version of the SSQ focuses on the symptoms relevant to studies with 360° videos. It is shown that it can be used effectively, with the reduced set of scales enabling more efficient and thus more extensive testing.
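For context, the standard SSQ that the proposed simplified questionnaire reduces is scored, following Kennedy et al. (1993), by summing the raw symptom ratings per subscale and applying fixed weights; the simplified variant keeps only the scales relevant to 360° video studies:

```latex
% Standard SSQ scoring (Kennedy et al., 1993): raw symptom ratings r_i are
% summed per subscale and multiplied by fixed weights; the total score TS
% scales the sum over all three subscales.
\[
  N = 9.54 \sum_{i \in \text{Nausea}} r_i, \qquad
  O = 7.58 \sum_{i \in \text{Oculomotor}} r_i, \qquad
  D = 13.92 \sum_{i \in \text{Disorientation}} r_i
\]
\[
  \mathrm{TS} = 3.74 \left(
    \sum_{i \in \text{Nausea}} r_i +
    \sum_{i \in \text{Oculomotor}} r_i +
    \sum_{i \in \text{Disorientation}} r_i
  \right)
\]
```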



De Moor, Katrien; Fiedler, Markus; Raake, Alexander; Jhunjhunwala, Ashok; Gnanasekaran, Vahiny; Subramanian, Sruti; Zinner, Thomas
Towards the design and evaluation of more sustainable multimedia experiences: which role can QoE research play? - In: ACM SIGMultimedia records, ISSN 1947-4598, Bd. 14 (2022), 3, 4, S. 1

In this column, we reflect on the environmental impact and broader sustainability implications of resource-demanding digital applications and services such as video streaming, VR/AR/XR, and videoconferencing. We put emphasis not only on the experiences and use cases they enable but also on the "cost" of always striving for high Quality of Experience (QoE) and better user experiences. Starting by sketching the broader context, our aim is to raise awareness of the role that QoE research can play in the context of several of the United Nations' Sustainable Development Goals (SDGs), either directly (e.g., SDG 13 "climate action") or more indirectly (e.g., SDG 3 "good health and well-being" and SDG 12 "responsible consumption and production").



https://doi.org/10.1145/3630658.3630662
Reimers, Carolin; Loh, Karin; Leist, Larissa; Fremerey, Stephan; Raake, Alexander; Klatte, Maria; Fels, Janina
Investigating different cueing methods for auditory selective attention in virtual reality. - In: DAGA 2022 : 48. Jahrestagung für Akustik, 21.-24. März 2022, Stuttgart und Online, Seiten/Artikel-Nr: 1173-1176. - Berlin : Deutsche Gesellschaft für Akustik e.V. - 1 Online-Ressource (4 Seiten)

An audio-only paradigm for investigating auditory selective attention (ASA) has previously been transferred into a classroom-type audio-visual virtual reality (VR) environment. Due to the paradigm's structure, participants focused only on a specific area of the VR environment during the entire experiment. In a more realistic scenario, participants would be expected to interact with the scene. Therefore, this study investigates new cueing methods that may reduce the focus on a single point in the virtual world and allow for the further development of a close-to-real-life scenario.



https://doi.org/10.18154/RWTH-2022-04388