Publications of the Department of Audiovisual Technology

The following list, automatically generated by the University Library, contains publications from 2016 onwards. Publications up to and including 2015 can be found on a separate page.

Note: To search all publications at once, select "Show All" and then use your browser's search function (Ctrl+F).

Results: 165
Created on: Thu, 02 May 2024 23:03:32 +0200 in 0.0825 sec


Breuer, Carolin; Loh, Karin; Leist, Larissa; Fremerey, Stephan; Raake, Alexander; Klatte, Maria; Fels, Janina
Examining the auditory selective attention switch in a child-suited virtual reality classroom environment. - In: International journal of environmental research and public health, ISSN 1660-4601, Vol. 19 (2022), 24, 16569, pp. 1-20

The ability to focus one's attention in different acoustical environments has been thoroughly investigated in the past. However, recent technological advancements have made it possible to perform laboratory experiments in a more realistic manner. In order to investigate close-to-real-life scenarios, a classroom was modeled in virtual reality (VR) and an established paradigm to investigate the auditory selective attention (ASA) switch was translated from an audio-only version into an audiovisual VR setting. The new paradigm was validated with adult participants in a listening experiment, and the results were compared to the previous version. Apart from expected effects such as switching costs and auditory congruency effects, which reflect the robustness of the overall paradigm, a difference in error rates between the audio-only and the VR group was found, suggesting enhanced attention in the new VR setting, which is consistent with recent studies. Overall, the results suggest that the presented VR paradigm can be used and further developed to investigate the voluntary auditory selective attention switch in a close-to-real-life classroom scenario.



https://doi.org/10.3390/ijerph192416569
Leist, Larissa; Breuer, Carolin; Yadav, Manuj; Fremerey, Stephan; Fels, Janina; Raake, Alexander; Lachmann, Thomas; Schlittmeier, Sabine; Klatte, Maria
Differential effects of task-irrelevant monaural and binaural classroom scenarios on children's and adults' speech perception, listening comprehension, and visual-verbal short-term memory. - In: International journal of environmental research and public health, ISSN 1660-4601, Vol. 19 (2022), 23, 15998, pp. 1-17

Most studies investigating the effects of environmental noise on children’s cognitive performance examine the impact of monaural noise (i.e., the same signal presented to both ears), which oversimplifies multiple aspects of binaural hearing (i.e., fails to adequately reproduce interaural differences and spatial information). In the current study, the effects of a realistic classroom-noise scenario presented either monaurally or binaurally on tasks requiring processing of auditory and visually presented information were analyzed in children and adults. In Experiment 1, across age groups, word identification was more impaired by monaural than by binaural classroom noise, whereas listening comprehension (acting out oral instructions) was equally impaired in both noise conditions. In both tasks, children were more affected than adults. Disturbance ratings were unrelated to the actual performance decrements. Experiment 2 revealed detrimental effects of classroom noise on short-term memory (serial recall of words presented pictorially), which did not differ with age or presentation mode (monaural vs. binaural). The present results add to the evidence for detrimental effects of noise on speech perception and cognitive performance, and their interactions with age, using a realistic classroom-noise scenario. Binaural simulations of real-world auditory environments can improve the external validity of studies on the impact of noise on children’s and adults’ learning.



https://doi.org/10.3390/ijerph192315998
Robotham, Thomas; Singla, Ashutosh; Rummukainen, Olli S.; Raake, Alexander; Habets, Emanuel A.P.
Audiovisual database with 360˚ video and higher-order Ambisonics audio for perception, cognition, behavior, and QoE evaluation research. - In: 2022 14th International Conference on Quality of Multimedia Experience (QoMEX), (2022), 6 pp.

Research into multi-modal perception, human cognition, behavior, and attention can benefit from high-fidelity content that can recreate lifelike scenes when rendered on head-mounted displays. Moreover, aspects of audiovisual perception, cognitive processes, and behavior may complement questionnaire-based Quality of Experience (QoE) evaluation of interactive virtual environments. Currently, there is a lack of high-quality open-source audiovisual databases that can be used to evaluate such aspects or systems capable of reproducing high-quality content. With this paper, we provide a publicly available audiovisual database consisting of twelve scenes capturing real-life nature and urban environments with a video resolution of 7680×3840 at 60 frames-per-second and with 4th-order Ambisonics audio. These 360˚ video sequences, with an average duration of 60 seconds, represent real-life settings for systematically evaluating various dimensions of uni-/multi-modal perception, cognition, behavior, and QoE. The paper provides details of the scene requirements, recording approach, and scene descriptions. The database provides high-quality reference material with a balanced focus on auditory and visual sensory information. The database will be continuously updated with additional scenes and further metadata such as human ratings and saliency information.
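As context for the audio format named in this entry, the channel count of a full-sphere Ambisonics recording grows quadratically with its order, (N + 1)^2 channels; the small sketch below illustrates this relationship and is not taken from the paper itself.

```python
def ambisonics_channels(order: int) -> int:
    """Number of channels for full-sphere Ambisonics of a given order: (N + 1)**2."""
    return (order + 1) ** 2

# The database described above uses 4th-order audio:
print(ambisonics_channels(4))  # prints 25
```

This is why higher-order Ambisonics material is storage-heavy: the 4th-order audio in this database carries 25 channels per scene, compared to 4 channels for first-order content.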



https://doi.org/10.1109/QoMEX55416.2022.9900893
Herglotz, Christian; Robitza, Werner; Kränzler, Matthias; Kaup, André; Raake, Alexander
Modeling of energy consumption and streaming video QoE using a crowdsourcing dataset. - In: 2022 14th International Conference on Quality of Multimedia Experience (QoMEX), (2022), 6 pp.

In the past decade, we have witnessed an enormous growth in the demand for online video services. Recent studies estimate that nowadays, more than 1% of the global greenhouse gas emissions can be attributed to the production and use of devices performing online video tasks. As such, research on the true power consumption of devices and their energy efficiency during video streaming is highly important for a sustainable use of this technology. At the same time, over-the-top providers strive to offer high-quality streaming experiences to satisfy user expectations. Here, energy consumption and QoE partly depend on the same system parameters. Hence, a joint view is needed for their evaluation. In this paper, we perform a first analysis of both end-user power efficiency and Quality of Experience of a video streaming service. We take a crowdsourced dataset comprising 447,000 streaming events from YouTube and estimate both the power consumption and perceived quality. The power consumption is modeled based on previous work which we extended towards predicting the power usage of different devices and codecs. The user-perceived QoE is estimated using a standardized model. Our results indicate that an intelligent choice of streaming parameters can optimize both the QoE and the power efficiency of the end user device. Further, the paper discusses limitations of the approach and identifies directions for future research.



https://doi.org/10.1109/QoMEX55416.2022.9900886
Döring, Nicola; Conde, Melisa; Brandenburg, Karlheinz; Broll, Wolfgang; Groß, Horst-Michael; Werner, Stephan; Raake, Alexander
Can communication technologies reduce loneliness and social isolation in older people? : a scoping review of reviews. - In: International journal of environmental research and public health, ISSN 1660-4601, Vol. 19 (2022), 18, 11310, pp. 1-20

Background: Loneliness and social isolation in older age are considered major public health concerns and research on technology-based solutions is growing rapidly. This scoping review of reviews aims to summarize the communication technologies (CTs) (review question RQ1), theoretical frameworks (RQ2), study designs (RQ3), and positive effects of technology use (RQ4) present in the research field. Methods: A comprehensive multi-disciplinary, multi-database literature search was conducted. Identified reviews were analyzed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) framework. A total of N = 28 research reviews that cover 248 primary studies spanning 50 years were included. Results: The majority of the included reviews addressed general internet and computer use (82% each) (RQ1). Of the 28 reviews, only one (4%) worked with a theoretical framework (RQ2) and 26 (93%) covered primary studies with quantitative-experimental designs (RQ3). The positive effects of technology use were shown in 55% of the outcome measures for loneliness and 44% of the outcome measures for social isolation (RQ4). Conclusion: While research reviews show that CTs can reduce loneliness and social isolation in older people, causal evidence is limited and insights on innovative technologies such as augmented reality systems are scarce.



https://doi.org/10.3390/ijerph191811310
Ramachandra Rao, Rakesh Rao; Göring, Steve; Raake, Alexander
AVQBits - adaptive video quality model based on bitstream information for various video applications. - In: IEEE access, ISSN 2169-3536, Vol. 10 (2022), pp. 80321-80351

The paper presents AVQBits, a versatile, bitstream-based video quality model. It can be applied in several contexts, such as video service monitoring and the evaluation of video encoding quality, gaming video QoE, and even omnidirectional video quality. In the paper, it is shown that AVQBits predictions closely match video quality ratings obtained in various subjective tests with human viewers, for videos up to 4K-UHD resolution (Ultra-High Definition, 3840 x 2160 pixels) and framerates up to 120 fps. With the different variants of AVQBits presented in the paper, video quality can be monitored either at the client side, in the network, or directly after encoding. The no-reference AVQBits model was developed for different video services and types of input data, reflecting the increasing popularity of Video-on-Demand services and the widespread use of HTTP-based adaptive streaming. At its core, AVQBits encompasses the standardized ITU-T P.1204.3 model, with further model instances that can have either restricted or extended input information, depending on the application context. Four different instances of AVQBits are presented: a Mode 3 model with full access to the bitstream, a Mode 0 variant using only metadata such as codec type, framerate, resolution, and bitrate as input, a Mode 1 model using Mode 0 information plus frame-type and frame-size information, and a Hybrid Mode 0 model that is based on Mode 0 metadata and the decoded video pixel information. The models are trained on the authors’ own AVT-PNATS-UHD-1 dataset described in the paper. All models show a highly competitive performance on the AVT-VQDB-UHD-1 validation dataset, with the Mode 0 variant yielding a Pearson correlation of 0.890, the Mode 1 model 0.901, the Hybrid Mode 0 model 0.928, and the model with full bitstream access 0.942.
In addition, all four AVQBits variants are evaluated when applying them out-of-the-box to different media formats such as 360˚ video, high-framerate (HFR) content, and gaming videos. The analysis shows that, for the considered use cases, the ITU-T P.1204.3 and Hybrid Mode 0 instances of AVQBits perform on par with or even better than state-of-the-art full-reference, pixel-based models. Furthermore, it is shown that the proposed Mode 0 and Mode 1 variants outperform commonly used no-reference models for the different application scopes. Also, a long-term integration model based on the standardized ITU-T P.1203.3 is presented to estimate ratings of overall audiovisual streaming Quality of Experience (QoE) for sessions of 30 s up to 5 min duration. In the paper, the AVQBits instances, with their per-1-sec score output, are evaluated as the video quality component of the proposed long-term integration model. All AVQBits variants as well as the long-term integration module are made publicly available to the community for further research.
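For readers unfamiliar with the metric reported in this abstract, the cited figures are Pearson correlation coefficients between model predictions and subjective quality ratings. The minimal sketch below is illustrative only and not from the paper; the sample MOS values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-video subjective MOS values (1-5 scale) and model predictions:
subjective = [4.2, 3.1, 2.5, 4.8, 3.6]
predicted = [4.0, 3.3, 2.2, 4.6, 3.9]
print(round(pearson(subjective, predicted), 3))
```

A value near 1.0, as reported for the AVQBits variants, means the model's predictions rank and scale the test videos almost exactly as human viewers did.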



https://doi.org/10.1109/ACCESS.2022.3195527
Bajpai, Vaibhav; Hohlfeld, Oliver; Crowcroft, Jon; Keshav, Srinivasan; Schulzrinne, Henning; Ott, Jörg; Ferlin, Simone; Carle, Georg; Hines, Andrew; Raake, Alexander
Recommendations for designing hybrid conferences. - In: ACM SIGCOMM computer communication review, ISSN 0146-4833, Vol. 52 (2022), 2, pp. 63-69

During the COVID-19 pandemic, many smaller conferences have moved entirely online and larger ones are being held as hybrid events. Even beyond the pandemic, hybrid events reduce the carbon footprint of conference travel and make events more accessible to parts of the research community that have difficulty traveling long distances, while preserving most advantages of in-person gatherings. While we have developed a solid understanding of how to design virtual events over the last two years, we are still learning how to properly run hybrid events. We present guidelines and considerations, spanning technology, organization, and social factors, for organizing successful hybrid conferences. This paper summarizes and extends the discussions held at the Dagstuhl seminar on "Climate Friendly Internet Research" in July 2021.



https://doi.org/10.1145/3544912.3544920
Gutiérrez, Jesús; Pérez, Pablo; Orduna, Marta; Singla, Ashutosh; Cortés, Carlos; Mazumdar, Pramit; Viola, Irene; Brunnström, Kjell; Battisti, Federica; Cieplińska, Natalia; Juszka, Dawid; Janowski, Lucjan; Leszczuk, Mikołaj; Adeyemi-Ejeye, Anthony; Hu, Yaosi; Chen, Zhenzhong; Wallendael, Glenn Van; Lambert, Peter; Díaz, César; Hedlund, John; Hamsis, Omar; Fremerey, Stephan; Hofmeyer, Frank; Raake, Alexander; César, Pablo; Carli, Marco; García, Narciso
Subjective evaluation of visual quality and simulator sickness of short 360˚ videos: ITU-T Rec. P.919. - In: IEEE transactions on multimedia, Vol. 24 (2022), pp. 3087-3100

Recently, an impressive development in immersive technologies, such as Augmented Reality (AR), Virtual Reality (VR), and 360˚ video, has been witnessed. However, methods for quality assessment have not kept up. This paper studies quality assessment of 360˚ video based on cross-lab tests (involving ten laboratories and more than 300 participants) carried out by the Immersive Media Group (IMG) of the Video Quality Experts Group (VQEG). These tests were designed to assess and validate subjective evaluation methodologies for 360˚ video. Audiovisual quality, simulator sickness symptoms, and exploration behavior were evaluated with short (10 to 30 seconds) 360˚ sequences. The influence of the following factors was also analyzed: assessment methodology, sequence duration, Head-Mounted Display (HMD) device, uniform and non-uniform coding degradations, and simulator sickness assessment methods. The obtained results demonstrate the validity of Absolute Category Rating (ACR) and Degradation Category Rating (DCR) for subjective tests with 360˚ videos, the possibility of using 10-second videos (with or without audio) when addressing quality evaluation of coding artifacts, as well as the suitability of any commercial HMD satisfying minimum requirements. Also, more efficient methods than the long Simulator Sickness Questionnaire (SSQ) have been proposed to evaluate related symptoms with 360˚ videos. These results have been instrumental in the development of ITU-T Recommendation P.919. Finally, the annotated dataset from the tests is made publicly available for the research community.



https://doi.org/10.1109/TMM.2021.3093717
Göring, Steve
Data-driven visual quality estimation using machine learning. - Ilmenau : Universitätsbibliothek, 2022. - 1 online resource (vi, 190 pages)
Technische Universität Ilmenau, Dissertation 2022

Nowadays, a great deal of visual content is created and accessible, owing to technological improvements such as smartphones and the internet. It is therefore necessary to assess the quality perceived by users in order to further improve the experience. However, few current quality models are designed specifically for higher resolutions, predict more than just the mean opinion score, or use machine learning. One goal of this thesis is to train and evaluate such machine-learning models for higher resolutions using various datasets. First, an objective analysis of image quality at higher resolutions is performed. The images were compressed with video encoders; here, AV1 shows the best quality and compression. Subsequently, the results of a crowdsourcing test are compared with a laboratory test regarding image quality. Furthermore, deep-learning-based models for predicting image and video quality are described. Due to the resources required, the deep-learning-based model is not applicable in practice for video quality prediction. For this reason, pixel-based video quality models are proposed and evaluated that use meaningful features covering image and motion aspects. These models can be used to predict mean opinion scores for videos, or even other values related to video quality, such as a rating distribution. The presented model architecture can be applied to other video problems, such as video classification, prediction of gaming video quality, classification of game genres, or classification of encoding parameters. An important aspect is also the processing time of such models. Therefore, a general approach for accelerating state-of-the-art video quality models is presented, showing that a substantial part of the processing time can be saved while maintaining similar prediction accuracy. The models are released as open source, so that the developed frameworks can be used for further research. In addition, the presented approaches can serve as building blocks for newer media formats.



https://doi.org/10.22032/dbt.52210
Skowronek, Janto; Raake, Alexander; Berndtsson, Gunilla H.; Rummukainen, Olli S.; Usai, Paolino; Gunkel, Simon N. B.; Johanson, Mathias; Habets, Emanuel A.P.; Malfait, Ludovic; Lindero, David; Toet, Alexander
Quality of experience in telemeetings and videoconferencing: a comprehensive survey. - In: IEEE access, ISSN 2169-3536, Vol. 10 (2022), pp. 63885-63931

Telemeetings such as audiovisual conferences or virtual meetings play an increasingly important role in our professional and private lives. For that reason, system developers and service providers will strive for an optimal experience for the user, while at the same time optimizing technical and financial resources. This leads to the discipline of Quality of Experience (QoE), an active field originating from the telecommunication and multimedia engineering domains that strives for understanding, measuring, and designing the quality of experience with multimedia technology. This paper provides the reader with an entry point to the large and still growing field of QoE of telemeetings, by taking a holistic perspective, considering both technical and non-technical aspects, and by focusing on current and near-future services. Addressing both researchers and practitioners, the paper first provides a comprehensive survey of factors and processes that contribute to the QoE of telemeetings, followed by an overview of relevant state-of-the-art methods for QoE assessment. To embed this knowledge into recent technology developments, the paper continues with an overview of current trends, focusing on the field of eXtended Reality (XR) applications for communication purposes. Given the complexity of telemeeting QoE and the current trends, new challenges for QoE assessment of telemeetings are identified. To overcome these challenges, the paper presents a novel Profile Template for characterizing telemeetings from the holistic perspective endorsed in this paper.



https://doi.org/10.1109/ACCESS.2022.3176369