Publications of the Audiovisual Technology Group

The following list (generated automatically by the university library) contains publications from 2016 onwards. Publications up to and including 2015 can be found on a separate page.

Note: If you want to search through all publications, select "Alle anzeigen" (show all) and then use the browser search (Ctrl+F).

Number of results: 170


Döring, Nicola; Mikhailova, Veronika; Brandenburg, Karlheinz; Broll, Wolfgang; Groß, Horst-Michael; Werner, Stephan; Raake, Alexander
Saying "Hi" to grandma in nine different ways : established and innovative communication media in the grandparent-grandchild relationship. - In: Technology, Mind, and Behavior, ISSN 2689-0208, (2021), insges. 1 S.

https://doi.org/10.1037/tms0000107
Fremerey, Stephan; Reimers, Carolin; Leist, Larissa; Spilski, Jan; Klatte, Maria; Fels, Janina; Raake, Alexander
Generation of audiovisual immersive virtual environments to evaluate cognitive performance in classroom type scenarios. - In: Tagungsband, DAGA 2021 - 47. Jahrestagung für Akustik, (2021), pp. 1336-1339

https://doi.org/10.22032/dbt.50292
Ramachandra Rao, Rakesh Rao; Göring, Steve; Raake, Alexander
Enhancement of pixel-based video quality models using meta-data. - In: Electronic Imaging, ISSN 2470-1173, Vol. 33 (2021), 9, art00022, pp. 264-1-264-6

Current state-of-the-art pixel-based video quality models for 4K resolution do not have access to explicit meta information such as resolution and framerate and may not include implicit or explicit features that model the related effects on perceived video quality. In this paper, we propose a meta concept to extend state-of-the-art pixel-based models and develop hybrid models incorporating meta-data such as framerate and resolution. Our general approach uses machine learning to incorporate the meta-data into the overall video quality prediction. To this end, we evaluate various machine learning approaches such as SVR, random forest, and extreme gradient boosting trees in terms of their suitability for hybrid model development. We use VMAF to demonstrate the validity of the meta-information concept. Our approach was tested on the publicly available AVT-VQDB-UHD-1 dataset. We show that the hybrid models predict quality more accurately than the underlying pixel-based model. While the proof-of-concept is applied to VMAF, it can also be used with other pixel-based models.



https://doi.org/10.2352/ISSN.2470-1173.2021.9.IQSP-264
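
Illustration: a minimal sketch of the hybrid-model idea described in the abstract, assuming per-sequence VMAF scores are already computed. The CSV file and column names are hypothetical, and the random forest is just one of the learners the paper evaluates; this is not the authors' released code.

    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    # Hypothetical per-sequence table: pixel-based score plus meta-data and MOS.
    df = pd.read_csv("sequences.csv")
    X = df[["vmaf", "framerate", "height"]]  # VMAF score + meta-data features
    y = df["mos"]                            # subjective mean opinion scores

    # Random forest was one of the learners evaluated in the paper
    # (alongside SVR and extreme gradient boosting trees).
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
    model.fit(X, y)
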
Ho, Man M.; Zhang, Lu; Raake, Alexander; Zhou, Jinjia
Semantic-driven colorization. - In: Proceedings CVMP 2021, (2021), 1, pp. 1-10

Recent colorization works implicitly predict the semantic information while learning to colorize black-and-white images. Consequently, the generated colors tend to overflow object boundaries, and semantic faults remain invisible. According to human experience in colorization, our brains first detect and recognize the objects in the photo, then imagine their plausible colors based on many similar objects we have seen in real life, and finally colorize them, as described in Figure 1. In this study, we simulate that human-like process by letting our network first learn to understand the photo, then colorize it. Thus, our work can provide plausible colors at a semantic level. Moreover, the semantic information predicted by a well-trained model becomes understandable and modifiable. Additionally, we show that Instance Normalization is a missing ingredient for image colorization and re-design the inference flow of U-Net to have two streams of data, providing an appropriate way of normalizing the features extracted from the black-and-white image. As a result, our network can provide plausible colors competitive with typical colorization works for specific objects. Our interactive application is available at https://github.com/minhmanho/semantic-driven_colorization.



https://doi.org/10.1145/3485441.3485645
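
Illustration: a rough PyTorch sketch of the instance-normalization ingredient mentioned in the abstract, not the authors' actual two-stream U-Net (which is available at the linked repository). Layer sizes are illustrative assumptions.

    import torch
    import torch.nn as nn

    # InstanceNorm2d normalizes each sample's feature maps independently,
    # which the abstract identifies as helpful when extracting features
    # from a black-and-white input.
    block = nn.Sequential(
        nn.Conv2d(1, 64, kernel_size=3, padding=1),  # grayscale input
        nn.InstanceNorm2d(64),                       # per-image normalization
        nn.ReLU(inplace=True),
    )

    gray = torch.randn(4, 1, 256, 256)  # batch of black-and-white images
    features = block(gray)
    print(features.shape)  # torch.Size([4, 64, 256, 256])
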
Keller, Dominik; Seybold, Tamara; Skowronek, Janto; Raake, Alexander
Sensorische Evaluierung in der Kinotechnik: wie Videoqualität mit Methoden aus der Lebensmittelforschung bewertet werden kann [Sensory evaluation in cinema technology: how video quality can be assessed with methods from food research]. - In: FKT, ISSN 1430-9947, Vol. 75 (2021), 4, pp. 33-37

Ramachandra Rao, Rakesh Rao; Göring, Steve; Raake, Alexander
Towards high resolution video quality assessment in the crowd. - In: 2021 13th International Conference on Quality of Multimedia Experience (QoMEX), (2021), pp. 1-6

Assessing high resolution video quality is usually performed using controlled, defined, and standardized lab tests. This method of acquiring human ratings in a lab environment is time-consuming and may also not reflect typical viewing conditions. To overcome these disadvantages, crowd testing paradigms have been used for assessing video quality in general. Crowdsourcing-based tests enable a more diverse set of participants and also use the realistic hardware setup and viewing environment of typical users. However, obtaining valid ratings for high-resolution video quality poses several problems. Example issues are that streaming such high-bandwidth content may not be feasible for some users, or that crowd participants lack an appropriate, high-resolution display device. In this paper, we propose a method to overcome such problems and conduct a crowd test for higher-resolution content by using a 540p cutout from the center of the original 2160p video. To this end, we use the videos from Test#1 of the publicly available dataset AVT-VQDB-UHD-1, which contains videos up to a resolution of UHD-1. The quality labels available from that lab test allow us to compare the results with the crowd test presented in this paper. It is shown that there is a Pearson correlation of 0.96 between the lab and crowd tests, and hence such crowd tests can reliably be used for video quality assessment of higher-resolution content. The overall implementation of the crowd test framework and the results are made publicly available for further research and reproducibility.



https://doi.org/10.1109/QoMEX51781.2021.9465425
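
Illustration: the 540p center cutout described above can be produced with a standard ffmpeg crop filter. The sketch below assumes ffmpeg is installed; the filenames and re-encoding settings are illustrative, not the study's exact pipeline.

    import subprocess

    # Cut a 960x540 window from the center of a 3840x2160 source, so crowd
    # participants stream a low-bandwidth excerpt that preserves the local
    # pixel structure of the UHD-1 original.
    src = "uhd_source.mp4"  # hypothetical input, e.g. from AVT-VQDB-UHD-1 Test#1
    subprocess.run([
        "ffmpeg", "-i", src,
        # crop=w:h with x/y omitted defaults to the frame center
        "-vf", "crop=960:540",
        "-c:v", "libx264", "-crf", "18",  # illustrative encoding settings
        "cutout_540p.mp4",
    ], check=True)
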
Keller, Dominik; Vaalgamaa, Markus; Paajanen, Erkki; Ramachandra Rao, Rakesh Rao; Göring, Steve; Raake, Alexander
Groovability: using groove as a novel measure for audio QoE with the example of smartphones. - In: 2021 13th International Conference on Quality of Multimedia Experience (QoMEX), (2021), pp. 13-18

Groove in music is a fundamental part of why humans entrain to it and enjoy it. Smartphones have become an important medium for listening to music. Especially when being with others, loudspeaker playback may be the method of choice. However, due to the physical limits of acoustics, smartphones offer sub-optimal audio capabilities for loudspeaker playback. Therefore, it is desirable to measure the Quality of Experience (QoE) of music played on smartphones. While audio playback is often assessed in terms of sound quality, the aim of this work is to address QoE in terms of the meaning or effect that the audio has on the listener. A key component of the meaning of popular music is groove. Hence, in this paper, we study groovability, that is, the ability of a piece of audio technology to convey groove. To instantiate our novel audio QoE assessment method, we apply it to music played by 8 different smartphones. For this purpose, looped 4-bar loudness-aligned recordings from 24 music pieces of different intrinsic groove were played back on the different smartphones. Our test method uses a multi-stimulus comparison with synchronized playback capability. A total of 62 subjects evaluated groovability using two stimulus subsets. It was found that the proposed methodology is highly effective in distinguishing between the groovability provided by the considered phones. In addition, a reduced-reference model is proposed to predict groovability, using a set of both acoustics- and music-groove-related features. In our formal validation on unknown data, the model is shown to provide good prediction performance with a Pearson correlation greater than 0.90.



https://doi.org/10.1109/QoMEX51781.2021.9465440
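
Illustration: one concrete preprocessing step named in the abstract is loudness alignment of the 4-bar excerpts. A sketch using the pyloudnorm package (an assumption here, not necessarily the authors' tooling; target level and filenames are illustrative):

    import soundfile as sf
    import pyloudnorm as pyln

    # Align excerpts to a common integrated loudness (ITU-R BS.1770)
    # before playback, so level differences do not confound the comparison.
    data, rate = sf.read("excerpt_4bars.wav")  # hypothetical stimulus
    meter = pyln.Meter(rate)                   # BS.1770 loudness meter
    loudness = meter.integrated_loudness(data)
    aligned = pyln.normalize.loudness(data, loudness, -23.0)  # to -23 LUFS
    sf.write("excerpt_aligned.wav", aligned, rate)
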
Robitza, Werner; Ramachandra Rao, Rakesh Rao; Göring, Steve; Raake, Alexander
Impact of spatial and temporal information on video quality and compressibility. - In: 2021 13th International Conference on Quality of Multimedia Experience (QoMEX), (2021), pp. 65-68

Spatial Information (SI) and Temporal Information (TI) are frequently used metrics to classify the spatiotemporal complexity of video content. However, they are mostly computed on original video sources, and their impact on actual encoding efficiency is not known. In this paper, we propose a method to determine the compressibility of video sources, that is, how good video quality can be under a given bitrate constraint. We show how various aggregations of SI and TI correlate with compressibility scores obtained from a public dataset of H.264/HEVC/VP9 content. We observe that the minimum TI value as well as an existing criticality metric from the literature are good indicators for compressibility, as judged by subjective ratings as well as VMAF and P.1204.3 objective scores.



https://doi.org/10.1109/QoMEX51781.2021.9465452
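
Illustration: SI and TI are defined in ITU-T Rec. P.910 as the maximum over time of the spatial standard deviation of Sobel-filtered frames and of inter-frame differences, respectively. A minimal OpenCV sketch (the input filename is hypothetical; other aggregations, such as the minimum TI used above, fall out of the same per-frame values):

    import cv2
    import numpy as np

    # Per ITU-T P.910: SI = max over frames of std(Sobel(frame)),
    # TI = max over frames of std(frame_n - frame_n-1), on the luma plane.
    cap = cv2.VideoCapture("video.mp4")  # hypothetical input
    si_values, ti_values, prev = [], [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        luma = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float64)
        sobel = np.hypot(cv2.Sobel(luma, cv2.CV_64F, 1, 0),
                         cv2.Sobel(luma, cv2.CV_64F, 0, 1))
        si_values.append(sobel.std())
        if prev is not None:
            ti_values.append((luma - prev).std())
        prev = luma
    cap.release()
    print("SI:", max(si_values), "TI:", max(ti_values), "min TI:", min(ti_values))
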
Ávila Soto, Mauro; Barzegar, Najmeh
I know you are looking to me: enabling eye-gaze communication between small children and parents with visual impairments. - In: AH 2021, (2021), 9, 4 pp.

Eye-gaze interaction is a relevant means of communication from early infancy onward. The bonding between infants and their caretakers is strengthened through eye contact. Parents with visual impairments are excluded from this type of interaction with their children. At the same time, today's computer vision technologies make it possible to track eye-gaze for various purposes and even enable users with visual impairments to recognize faces. This work starts from the following research question: Can currently available eye-tracking solutions help parents with visual impairments have eye-gaze interaction with their young children? We devised a software prototype based on currently available eye-tracking technologies, which was tested with three sets of visually impaired parents and their infant children to explore the possibility of assisting those parents in having eye-gaze interaction with their children. The experience was documented through semi-structured interviews, which were processed with a content analysis technique. The approach received positive feedback regarding both functionality and emotional interaction.



https://doi.org/10.1145/3460881.3460883
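
Illustration: not the authors' prototype, but a toy sketch of the underlying idea using OpenCV's bundled Haar cascades: detect a frontal face with both eyes visible in the webcam feed and give the parent a non-visual cue.

    import cv2

    face_cc = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    eye_cc = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

    cap = cv2.VideoCapture(0)  # webcam facing the child
    for _ in range(300):       # check roughly 10 s of frames at 30 fps
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cc.detectMultiScale(gray, 1.3, 5):
            eyes = eye_cc.detectMultiScale(gray[y:y + h, x:x + w])
            if len(eyes) >= 2:  # frontal face with both eyes: likely eye contact
                print("\a[cue] the child may be looking at you")
    cap.release()
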
Singla, Ashutosh; Göring, Steve; Keller, Dominik; Ramachandra Rao, Rakesh Rao; Fremerey, Stephan; Raake, Alexander
Assessment of the simulator sickness questionnaire for omnidirectional videos. - In: 2021 IEEE Conference on Virtual Reality and 3D User Interfaces, (2021), pp. 198-206

Virtual Reality/360° videos provide an immersive experience to users. However, 360° videos may lead to an undesirable effect when consumed with Head-Mounted Displays (HMDs), referred to as simulator sickness/cybersickness. The Simulator Sickness Questionnaire (SSQ) is the most widely used questionnaire for the assessment of simulator sickness. Since the SSQ with its 16 questions was not designed for 360° video related studies, our research hypothesis in this paper was that it may be simplified to enable more efficient testing for 360° video. Hence, we evaluate the SSQ to reduce the number of questions asked of subjects, based on six different previously conducted studies. We derive the reduced set of questions from the SSQ using Principal Component Analysis (PCA) for each test. Pearson correlation is analysed to compare the obtained reduced questionnaires with each other as well as with two further variants of the SSQ reported in the literature, namely the Virtual Reality Sickness Questionnaire (VRSQ) and the Cybersickness Questionnaire (CSQ). Our analysis suggests that a reduced questionnaire with 9 out of 16 questions yields the best agreement with the initial SSQ, cutting the number of questions by almost 44%. Exploratory Factor Analysis (EFA) shows that the nine symptom-related attributes determined as relevant by PCA also appear to be sufficient to represent the three dimensions resulting from EFA, namely Uneasiness, Visual Discomfort and Loss of Balance. The simplified version of the SSQ has the potential to be used more efficiently than the initial SSQ for 360° video by focusing on the questions that are most relevant for individuals, shortening the required testing time.



https://doi.org/10.1109/VR50410.2021.00041
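
Illustration: a sketch of PCA-based item reduction as described above, assuming per-subject SSQ responses in a matrix of shape (subjects, 16 items). The data file is hypothetical, and the selection rule shown (items loading most strongly on the leading components) is a plausible reading, not the paper's exact procedure.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # Run PCA over standardized SSQ item scores and keep the items that
    # load most strongly on the leading principal components. The cut-off
    # of 9 items follows the abstract; the rule itself is an assumption.
    X = np.loadtxt("ssq_responses.csv", delimiter=",")  # hypothetical (n, 16)
    pca = PCA(n_components=3).fit(StandardScaler().fit_transform(X))

    # For each item, take its largest absolute loading across components,
    # then retain the 9 highest-loading items.
    loadings = np.abs(pca.components_).max(axis=0)
    kept_items = np.argsort(loadings)[::-1][:9]
    print("retained SSQ item indices:", sorted(kept_items.tolist()))
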