Publications of the Department of Audiovisual Technology

The following list (automatically generated by the University Library) contains the publications from the year 2016 onwards. Publications up to the year 2015 can be found on a separate page.

Note: To search through all publications, select "Show All"; you can then use the browser's search function (Ctrl+F).

Results: 165
Created on: Thu, 02 May 2024 23:03:32 +0200


Robitza, Werner; Ramachandra Rao, Rakesh Rao; Göring, Steve; Dethof, Alexander; Raake, Alexander
Deploying the ITU-T P.1203 QoE model in the wild and retraining for new codecs. - In: MHV '22, (2022), S. 121-122

This paper presents two challenges associated with using the ITU-T P.1203 standard for video quality monitoring in practice. We discuss the issue of unavailable data on certain browsers/platforms and the lack of information within newly developed data formats like Common Media Client Data. We also re-trained the coefficients of the P.1203.1 video model for newer codecs, and published a completely new model derived from the P.1204.3 bitstream model.
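The kind of client-side measurement data such a monitoring model consumes can be sketched as follows. This is a hypothetical illustration: the field names are assumptions modeled loosely on the open-source itu-p1203 reference software, not taken from the paper itself.

```python
import json

# Hypothetical sketch of a P.1203 "Mode 0" input report (meta-data only,
# no bitstream access); field names are assumptions, not the paper's spec.
report = {
    "I13": {  # played-out video segments with codec meta-data
        "segments": [
            {"codec": "h264", "bitrate": 4000, "duration": 10,
             "fps": 30.0, "resolution": "1920x1080", "start": 0},
        ],
        "streamId": 42,
    },
    "I23": {  # stalling events as [start_time, duration] pairs
        "stalling": [[0, 1.5]],
        "streamId": 42,
    },
    "IGen": {"device": "pc", "displaySize": "1920x1080"},
}

serialized = json.dumps(report)  # what a client-side probe would upload
```

It is precisely this kind of report that a browser-side probe often cannot fully populate, which is the first challenge the paper discusses.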



https://doi.org/10.1145/3510450.3517310

Döring, Nicola; De Moor, Katrien; Fiedler, Markus; Schoenenberg, Katrin; Raake, Alexander
Videoconference fatigue: a conceptual analysis. - In: International journal of environmental research and public health, ISSN 1660-4601, Bd. 19 (2022), 4, 2061, S. 1-20

Videoconferencing (VC) is a type of online meeting that allows two or more participants from different locations to engage in live multi-directional audio-visual communication and collaboration (e.g., via screen sharing). The COVID-19 pandemic has induced a boom in both private and professional videoconferencing in the early 2020s that elicited controversial public and academic debates about its pros and cons. One main concern has been the phenomenon of videoconference fatigue. The aim of this conceptual review article is to contribute to the conceptual clarification of VC fatigue. We use the popular and succinct label "Zoom fatigue" interchangeably with the more generic label "videoconference fatigue" and define it as the experience of fatigue during and/or after a videoconference, regardless of the specific VC system used. We followed a structured eight-phase process of conceptual analysis that led to a conceptual model of VC fatigue with four key causal dimensions: (1) personal factors, (2) organizational factors, (3) technological factors, and (4) environmental factors. We present this 4D model describing the respective dimensions with their sub-dimensions based on theories, available evidence, and media coverage. The 4D model is meant to help researchers advance empirical research on videoconference fatigue.



https://doi.org/10.3390/ijerph19042061

Katsavounidis, Ioannis; Robitza, Werner; Puri, Rohit; Satti, Shahid
VQEG column: new topics. - In: ACM SIGMultimedia records, ISSN 1947-4598, Bd. 13 (2021), 1, 5, S. 1

Welcome to the fourth column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG). During the last VQEG plenary meeting (14-18 Dec. 2020) various interesting discussions arose regarding new topics not addressed up to then by VQEG groups, which led to launching three new sub-projects and a new project related to: 1) clarifying the computation of spatial and temporal information (SI and TI), 2) including video quality metrics as metadata in compressed bitstreams, 3) Quality of Experience (QoE) metrics for live video streaming applications, and 4) providing guidelines on implementing objective video quality metrics to the video compression community. The following sections provide more details about these new activities and try to encourage interested readers to follow and get involved in any of them by subscribing to the corresponding reflectors.
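The SI/TI computation that the first sub-project aims to clarify is defined in ITU-T P.910: SI is the maximum over time of the standard deviation of the Sobel-filtered luma plane, and TI the maximum standard deviation of successive-frame differences. A minimal sketch, assuming grayscale frames as NumPy arrays:

```python
import numpy as np
from scipy import ndimage

def si_ti(frames):
    """Spatial/Temporal Information per ITU-T P.910 (sketch).

    frames: iterable of 2-D NumPy arrays holding the luma plane.
    Returns (SI, TI) as the maxima over all frames.
    """
    si_vals, ti_vals = [], []
    prev = None
    for frame in frames:
        frame = frame.astype(np.float64)
        # SI: stdev of the Sobel gradient magnitude of each frame
        grad = np.hypot(ndimage.sobel(frame, axis=0),
                        ndimage.sobel(frame, axis=1))
        si_vals.append(grad.std())
        # TI: stdev of the pixel-wise difference to the previous frame
        if prev is not None:
            ti_vals.append((frame - prev).std())
        prev = frame
    return max(si_vals), (max(ti_vals) if ti_vals else 0.0)
```

The ambiguities discussed in the sub-project concern details this sketch glosses over, e.g., border handling of the Sobel filter and how to treat the first frame for TI.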



https://doi.org/10.1145/3577934.3577939

Göring, Steve; Raake, Alexander
Rule of thirds and simplicity for image aesthetics using deep neural networks. - In: IEEE 23rd International Workshop on Multimedia Signal Processing, (2021), insges. 6 S.

Considering the increasing amount of photos being uploaded to sharing platforms, a proper evaluation of photo appeal or aesthetics is required. For appealing images several "rules of thumb" have been established, e.g., the rule of thirds and simplicity. We handle rule of thirds and simplicity as binary classification problems with a deep learning based image processing pipeline. Our pipeline uses a pre-processing step, a pre-trained baseline deep neural network (DNN) and post-processing. For each of the rules, we re-train 17 pre-trained DNN models using transfer learning. Our results for publicly available datasets show that the ResNet152 DNN is best for rule of thirds prediction and DenseNet121 is best for simplicity, with an accuracy of around 0.84 and 0.94, respectively. In addition to the datasets for both classifications, five experts annotated another dataset with ≈ 1100 images, on which we evaluate the best-performing models. Results show that the best-performing models have an accuracy of 0.67 for rule of thirds and 0.79 for image simplicity. Both accuracy results are within the range of pairwise accuracy of expert annotators. However, this further indicates that there is a high subjective influence for both of the considered rules.



https://doi.org/10.1109/MMSP53017.2021.9733554

Göring, Steve; Ramachandra Rao, Rakesh Rao; Fremerey, Stephan; Raake, Alexander
AVrate Voyager: an open source online testing platform. - In: IEEE 23rd International Workshop on Multimedia Signal Processing, (2021), insges. 6 S.

Subjective testing is an integral part of many research fields concerned with, e.g., human perception. Lab tests are a popular approach to gather ratings for subjective evaluations. However, controlled lab tests cannot always be performed, e.g., when no labs exist, are accessible, or may be used. For this reason, online tests, e.g., using crowdsourcing, are an alternative to traditional lab tests. In this paper, we describe a framework to implement such online tests for audio, video, and image-related evaluations or questionnaires. Our framework AVrate Voyager builds upon previously developed frameworks for lab tests, including the experience gained with them. AVrate Voyager uses scalable web technologies to implement the test framework, which ensures that it runs reliably. In addition, we added pre-caching strategies to avoid additional influence on play-out, e.g., in the case of video testing. We analyze several tests conducted using the new framework and describe in detail the required steps to modify the provided tool.
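The server side of such an online test platform can be illustrated with a minimal, hypothetical rating-collection endpoint. This sketch assumes a Flask-style web stack and is not AVrate Voyager's actual API; the route name and payload fields are invented for illustration.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)
ratings = []  # in-memory store; a real deployment would persist to a database

@app.route("/rate", methods=["POST"])
def rate():
    # Hypothetical endpoint: a test participant's browser posts one rating
    # per stimulus after play-out has finished.
    data = request.get_json()
    ratings.append({"stimulus": data["stimulus"], "score": int(data["score"])})
    return jsonify(ok=True, count=len(ratings))
```

Pre-caching, by contrast, happens on the client: stimuli are downloaded before the rating step so that network delays cannot influence play-out during the test.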



https://doi.org/10.1109/MMSP53017.2021.9733561

Döring, Nicola; Mikhailova, Veronika; Brandenburg, Karlheinz; Broll, Wolfgang; Groß, Horst-Michael; Werner, Stephan; Raake, Alexander
Saying "Hi" to grandma in nine different ways : established and innovative communication media in the grandparent-grandchild relationship. - In: Technology, Mind, and Behavior, ISSN 2689-0208, (2021), insges. 1 S.

https://doi.org/10.1037/tms0000107

Fremerey, Stephan; Reimers, Carolin; Leist, Larissa; Spilski, Jan; Klatte, Maria; Fels, Janina; Raake, Alexander
Generation of audiovisual immersive virtual environments to evaluate cognitive performance in classroom type scenarios. - In: Tagungsband, DAGA 2021 - 47. Jahrestagung für Akustik, (2021), S. 1336-1339

https://doi.org/10.22032/dbt.50292

Ramachandra Rao, Rakesh Rao; Göring, Steve; Raake, Alexander
Enhancement of pixel-based video quality models using meta-data. - In: Electronic imaging, ISSN 2470-1173, Bd. 33 (2021), 9, art00022, S. 264-1-264-6

Current state-of-the-art pixel-based video quality models for 4K resolution do not have access to explicit meta information such as resolution and framerate and may not include implicit or explicit features that model the related effects on perceived video quality. In this paper, we propose a meta concept to extend state-of-the-art pixel-based models and develop hybrid models incorporating meta-data such as framerate and resolution. Our general approach uses machine learning to incorporate the meta-data into the overall video quality prediction. To this end, in our study, we evaluate various machine learning approaches such as SVR, random forest, and extreme gradient boosting trees in terms of their suitability for hybrid model development. We use VMAF to demonstrate the validity of the meta-information concept. Our approach was tested on the publicly available AVT-VQDB-UHD-1 dataset. We are able to show an increase in the prediction accuracy for the hybrid models in comparison with the prediction accuracy of the underlying pixel-based model. While the proof-of-concept is applied to VMAF, it can also be used with other pixel-based models.
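The hybrid-model idea can be sketched in a few lines: feed the pixel-based score together with the meta-data into a regressor that predicts the subjective score. The data below is synthetic (not AVT-VQDB-UHD-1) and the feature layout is an assumption for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Each synthetic sample: [pixel-model score (e.g. VMAF), framerate, height]
X = rng.uniform([0, 15, 360], [100, 60, 2160], size=(200, 3))
# Toy ground truth on a 1-5 MOS scale: driven by the pixel score,
# modulated by framerate and resolution (invented relationship).
y = 1 + 4 * (X[:, 0] / 100) * (X[:, 1] / 60) ** 0.2 * (X[:, 2] / 2160) ** 0.1

# Hybrid model: random forest over pixel score + meta-data
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = model.predict([[80.0, 60.0, 2160.0]])  # VMAF 80, 60 fps, UHD
```

Swapping `RandomForestRegressor` for `SVR` or a gradient-boosting regressor reproduces the model family comparison described in the abstract.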



https://doi.org/10.2352/ISSN.2470-1173.2021.9.IQSP-264

Ho, Man M.; Zhang, Lu; Raake, Alexander; Zhou, Jinjia
Semantic-driven colorization. - In: Proceedings CVMP 2021, (2021), 1, S. 1-10

Recent colorization works implicitly predict the semantic information while learning to colorize black-and-white images. Consequently, the generated color easily overflows object boundaries, and semantic faults remain invisible. According to human experience in colorization, our brains first detect and recognize the objects in the photo, then imagine their plausible colors based on many similar objects we have seen in real life, and finally colorize them, as described in Figure 1. In this study, we simulate that human-like process to let our network first learn to understand the photo, then colorize it. Thus, our work can provide plausible colors at a semantic level. Moreover, the semantic information predicted by a well-trained model becomes understandable and modifiable. Additionally, we prove that Instance Normalization is a missing ingredient for image colorization, and re-design the inference flow of U-Net to have two streams of data, providing an appropriate way of normalizing the features extracted from the black-and-white image. As a result, our network can provide plausible colors competitive with typical colorization works for specific objects. Our interactive application is available at https://github.com/minhmanho/semantic-driven_colorization.
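Instance Normalization, the ingredient highlighted above, normalizes each channel of each sample over its spatial dimensions only, independently of the batch. A minimal NumPy sketch of the operation (the paper's network-specific details are not reproduced here):

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Instance Normalization over an (N, C, H, W) feature tensor.

    Each channel of each sample is normalized to zero mean and unit
    variance over its own H x W plane, unlike batch normalization,
    which pools statistics across the batch dimension N.
    """
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```

Because the statistics are per-image, the normalization removes instance-specific contrast from the grayscale features, which is why it is attractive for colorization-style image-to-image tasks.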



https://doi.org/10.1145/3485441.3485645

Keller, Dominik; Seybold, Tamara; Skowronek, Janto; Raake, Alexander
Sensorische Evaluierung in der Kinotechnik : wie Videoqualität mit Methoden aus der Lebensmittelforschung bewertet werden kann. - In: FKT, ISSN 1430-9947, Bd. 75 (2021), 4, S. 33-37