Publications of the Department of Audiovisual Technology

The following list (automatically generated by the University Library) contains publications from 2016 onwards. Publications up to 2015 can be found on a separate page.

Note: To search through all publications, select "Show All" and then use the browser's search function (Ctrl+F).

Results: 162


Ávila Soto, Mauro; Barzegar, Najmeh
I know you are looking to me: enabling eye-gaze communication between small children and parents with visual impairments. - In: AH 2021, (2021), 9, 4 pp. in total

Eye-gaze interaction is a relevant means of communication from early infancy. The bonding between infants and their caretakers is strengthened through eye contact. Parents with visual impairments are excluded from this type of interaction with their children. At the same time, today's computer vision technologies allow eye gaze to be tracked for different purposes and even enable users with visual impairments to recognize faces. This work starts from the following research question: Can currently available eye tracking solutions help parents with visual impairments to have eye-gaze interaction with their young infant children? We devised a software prototype based on currently available eye tracking technologies and tested it with three sets of visually impaired parents and their young infant children to explore the possibility of assisting those parents in having eye-gaze interaction with their children. The experience was documented through semi-structured interviews, which were processed with a content analysis technique. The approach received positive feedback regarding functionality and emotional interaction.
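
As an illustration of the kind of off-the-shelf computer-vision building blocks such a prototype could rely on, the following sketch uses OpenCV's stock Haar cascades to flag frames in which a face with both eyes visible is detected. The file names, thresholds and the eye-contact heuristic are assumptions made for this example, not the prototype described in the paper.

    # Illustrative sketch only: flag frames where a face with both eyes visible
    # is detected, so that e.g. an audio cue could be played for a parent with
    # visual impairments. Cascade files ship with OpenCV; thresholds are guesses.
    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")

    cap = cv2.VideoCapture(0)          # default webcam
    ret, frame = cap.read()
    if ret:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            roi = gray[y:y + h, x:x + w]
            eyes = eye_cascade.detectMultiScale(roi)
            if len(eyes) >= 2:         # both eyes visible -> eye contact is plausible
                print("Possible eye contact detected")
    cap.release()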



https://doi.org/10.1145/3460881.3460883
Singla, Ashutosh; Göring, Steve; Keller, Dominik; Ramachandra Rao, Rakesh Rao; Fremerey, Stephan; Raake, Alexander
Assessment of the simulator sickness questionnaire for omnidirectional videos. - In: 2021 IEEE Conference on Virtual Reality and 3D User Interfaces, (2021), pp. 198-206

Virtual Reality/360˚ videos provide an immersive experience to users. However, 360˚ videos may lead to an undesirable effect when consumed with Head-Mounted Displays (HMDs), referred to as simulator sickness/cybersickness. The Simulator Sickness Questionnaire (SSQ) is the most widely used questionnaire for the assessment of simulator sickness. Since the SSQ with its 16 questions was not designed for 360˚-video-related studies, our research hypothesis in this paper was that it may be simplified to enable more efficient testing for 360˚ video. Hence, we evaluate the SSQ with the goal of reducing the number of questions asked of subjects, based on six different previously conducted studies. We derive the reduced set of questions from the SSQ using Principal Component Analysis (PCA) for each test. Pearson correlation is analysed to compare all obtained reduced questionnaires as well as two further variants of the SSQ reported in the literature, namely the Virtual Reality Sickness Questionnaire (VRSQ) and the Cybersickness Questionnaire (CSQ). Our analysis suggests that a reduced questionnaire with 9 out of 16 questions yields the best agreement with the initial SSQ, while requiring roughly 44% fewer questions. Exploratory Factor Analysis (EFA) shows that the nine symptom-related attributes determined as relevant by PCA also appear to be sufficient to represent the three dimensions resulting from EFA, namely Uneasiness, Visual Discomfort and Loss of Balance. The simplified version of the SSQ has the potential to be used more efficiently than the initial SSQ for 360˚ video by focusing on the questions that are most relevant for individuals, shortening the required testing time.
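
A minimal sketch of the kind of PCA-based item reduction described above, assuming a hypothetical CSV with one row per subject and one column per SSQ symptom (16 columns). The retention rule shown here (items with the largest absolute loading across the retained components) is a simplification of the paper's per-test procedure.

    # Sketch: reduce a 16-item questionnaire to its 9 most informative items via PCA.
    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    ratings = pd.read_csv("ssq_ratings.csv")     # hypothetical file, 16 symptom columns
    X = StandardScaler().fit_transform(ratings.values)

    pca = PCA(n_components=3)                    # e.g., three retained components
    pca.fit(X)

    # Rank items by their maximum absolute loading across the retained components
    loadings = np.abs(pca.components_).max(axis=0)
    ranked_items = ratings.columns[np.argsort(loadings)[::-1]]
    reduced_set = list(ranked_items[:9])         # keep the 9 strongest items
    print(reduced_set)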



https://doi.org/10.1109/VR50410.2021.00041
Göring, Steve; Ramachandra Rao, Rakesh Rao; Feiten, Bernhard; Raake, Alexander
Modular framework and instances of pixel-based video quality models for UHD-1/4K. - In: IEEE access, ISSN 2169-3536, vol. 9 (2021), pp. 31842-31864

https://doi.org/10.1109/ACCESS.2021.3059932
Göring, Steve; Steger, Robert; Ramachandra Rao, Rakesh Rao; Raake, Alexander
Automated genre classification for gaming videos. - In: IEEE 22nd International Workshop on Multimedia Signal Processing, (2020), 6 pp. in total

Besides classical videos, videos of gaming matches, entire tournaments or individual sessions are streamed and viewed all over the world. The increasing popularity of Twitch or YouTube Gaming shows the importance of additional research on gaming videos. One important precondition for live or offline encoding of gaming videos is knowledge of game-specific properties. Knowing or automatically predicting the genre of a gaming video enables a more advanced and optimized encoding pipeline for streaming providers, especially because gaming videos of different genres differ considerably from classical 2D video, e.g., considering the CGI content, textures or camera motion. We describe several computer-vision-based features that are optimized for speed and motivated by characteristics of popular games, in order to automatically predict the genre of a gaming video. Our prediction system uses random forest and gradient boosting trees as the underlying machine-learning techniques, combined with feature selection. For the evaluation of our approach we use a dataset that was built as part of this work and consists of recorded gaming sessions for six genres from Twitch. In total, 351 different videos are considered. We show that our prediction approach achieves good performance in terms of F1-score. Besides the evaluation of different machine-learning approaches, we additionally investigate the influence of the hyper-parameters of the algorithms.
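
The modeling step could be sketched roughly as follows with scikit-learn, assuming the per-video features have already been extracted. The file names, the number of selected features and the hyper-parameters are placeholders, not the configuration used in the paper.

    # Sketch: genre classification with random forest and gradient boosting trees,
    # combined with univariate feature selection and evaluated by macro F1-score.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    X = np.load("video_features.npy")   # hypothetical per-video feature vectors
    y = np.load("genres.npy")           # hypothetical genre labels (6 classes)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)

    for clf in (RandomForestClassifier(n_estimators=200, random_state=0),
                GradientBoostingClassifier(random_state=0)):
        model = make_pipeline(SelectKBest(f_classif, k=20), clf)
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        print(type(clf).__name__, f1_score(y_test, pred, average="macro"))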



https://doi.org/10.1109/MMSP48831.2020.9287122
Singla, Ashutosh; Fremerey, Stephan; Hofmeyer, Frank; Robitza, Werner; Raake, Alexander
Quality assessment protocols for omnidirectional video quality evaluation. - In: Electronic imaging, ISSN 2470-1173, vol. 32 (2020), 11, art00003, pp. 069-1-069-6

In recent years, with the introduction of powerful HMDs such as the Oculus Rift and HTC Vive Pro, the QoE that can be achieved with VR/360˚ videos has increased substantially. Unfortunately, no standardized guidelines, methodologies or protocols exist for conducting and evaluating the quality of 360˚ videos in tests with human test subjects. In this paper, we present a set of test protocols for the evaluation of the quality of 360˚ videos using HMDs. To this aim, we review the state-of-the-art with respect to the assessment of 360˚ videos and summarize its results. Also, we summarize the methodological approaches taken and the results obtained in different subjective experiments at our lab under different contextual conditions. In the first two experiments, 1a and 1b, the performance of two different subjective test methods, Double-Stimulus Impairment Scale (DSIS) and Modified Absolute Category Rating (M-ACR), was compared under different contextual conditions. In experiment 2, the performance of three different subjective test methods, DSIS, M-ACR and Absolute Category Rating (ACR), was compared, this time without varying the contextual conditions. Building on the reliability and general applicability of the procedure across the different tests, a methodological framework for 360˚ video quality assessment is presented in this paper. Besides video or media quality judgments, the procedure comprises the assessment of presence and simulator sickness, for which different methods were compared. Further, the accompanying head-rotation data can be used to analyze both content- and quality-related behavioural viewing aspects. Based on the results, the implications of different contextual settings are discussed.
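
For orientation, ratings collected with methods such as ACR or DSIS are typically condensed into a mean opinion score (MOS) with a confidence interval per stimulus. The sketch below assumes a hypothetical CSV layout and is purely illustrative; it is not taken from the paper.

    # Sketch: per-stimulus MOS and 95% confidence interval from subjective ratings.
    import pandas as pd
    from scipy import stats

    ratings = pd.read_csv("acr_ratings.csv")   # hypothetical columns: stimulus, subject, score (1-5)

    def mos_with_ci(scores):
        n = len(scores)
        mos = scores.mean()
        ci = stats.t.ppf(0.975, n - 1) * scores.std(ddof=1) / n ** 0.5
        return pd.Series({"MOS": mos, "CI95": ci})

    print(ratings.groupby("stimulus")["score"].apply(mos_with_ci))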



https://doi.org/10.2352/ISSN.2470-1173.2020.11.HVEI-069
Zadtootaghaj, Saman; Barman, Nabajeet; Ramachandra Rao, Rakesh Rao; Göring, Steve; Martini, Maria G.; Raake, Alexander; Möller, Sebastian
DEMI: deep video quality estimation model using perceptual video quality dimensions. - In: IEEE 22nd International Workshop on Multimedia Signal Processing, (2020), 6 pp. in total

Existing works in the field of quality assessment focus separately on gaming and non-gaming content. Along with traditional modeling approaches, deep-learning-based approaches have been used to develop quality models, due to their high prediction accuracy. In this paper, we present a deep-learning-based quality estimation model considering both gaming and non-gaming videos. The model is developed in three phases. First, a convolutional neural network (CNN) is trained based on an objective metric, which allows the CNN to learn video artifacts such as blurriness and blockiness. Next, the model is fine-tuned on a small image quality dataset using blockiness and blurriness ratings. Finally, a Random Forest is used to pool frame-level predictions and temporal information of the videos in order to predict the overall video quality. The lightweight, low-complexity nature of the model makes it suitable for real-time applications considering both gaming and non-gaming content, while achieving performance similar to the existing state-of-the-art model NDNetGaming. The model implementation for testing is available on GitHub.
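
The pooling stage can be illustrated as follows: per-frame quality predictions (e.g., from the CNN) are summarized into simple statistics and fed to a Random Forest regressor that outputs an overall score. The chosen statistics and the random placeholder data are assumptions for this sketch, not the DEMI implementation.

    # Sketch: pool frame-level quality predictions into a video-level score
    # with a Random Forest regressor.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def pool_frames(frame_scores):
        """Condense a per-frame score sequence into a fixed-length feature vector."""
        s = np.asarray(frame_scores, dtype=float)
        return [s.mean(), s.std(), s.min(), s.max(),
                np.percentile(s, 10), np.percentile(s, 90)]

    # frame_preds: per-video arrays of CNN frame scores; mos: subjective video scores
    # (both filled with random placeholder data here)
    frame_preds = [np.random.rand(300) * 4 + 1 for _ in range(50)]
    mos = np.random.rand(50) * 4 + 1

    X = np.array([pool_frames(fp) for fp in frame_preds])
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, mos)
    print(rf.predict(X[:3]))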



https://doi.org/10.1109/MMSP48831.2020.9287080
Fremerey, Stephan; Göring, Steve; Ramachandra Rao, Rakesh Rao; Huang, Rachel; Raake, Alexander
Subjective test dataset and meta-data-based models for 360˚ streaming video quality. - In: IEEE 22nd International Workshop on Multimedia Signal Processing, (2020), 6 pp. in total

During the last years, the number of 360˚ videos available for streaming has rapidly increased, leading to the need for 360˚ streaming video quality assessment. In this paper, we report and publish the results of three subjective 360˚ video quality tests, with conditions chosen to reflect real-world bitrates and resolutions including 4K, 6K and 8K, resulting in 64 stimuli each for the first two tests and 63 for the third. As playout device we used the HTC Vive for the first test and the HTC Vive Pro for the remaining two tests. Video-quality ratings were collected using the 5-point Absolute Category Rating scale. The 360˚ dataset provided with the paper contains the links to the used source videos, the raw subjective scores, video-related meta-data, head-rotation data and Simulator Sickness Questionnaire results per stimulus and per subject to enable reproducibility of the provided results. Moreover, we use our dataset to compare the performance of state-of-the-art full-reference quality metrics such as VMAF, PSNR, SSIM, ADM2, WS-PSNR and WS-SSIM. Of all metrics, VMAF was found to show the highest correlation with the subjective scores. Further, we evaluated a center-cropped version of VMAF ("VMAF-cc") that was found to provide performance similar to the full VMAF. In addition to the dataset and the objective-metric evaluation, we propose two new video-quality prediction models, a bitstream meta-data-based model and a hybrid no-reference model using bitrate, resolution and pixel information of the video as input. The new lightweight models provide performance similar to the full-reference models while enabling fast calculations.
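
Such a metric-versus-MOS comparison is commonly reported as Pearson and Spearman correlations. The following sketch assumes a hypothetical CSV with per-stimulus MOS and metric scores and is only illustrative of the comparison step, not of how the paper computed the metrics themselves.

    # Sketch: correlate objective metric scores with subjective MOS.
    import pandas as pd
    from scipy.stats import pearsonr, spearmanr

    df = pd.read_csv("scores.csv")        # hypothetical columns: stimulus, mos, vmaf, psnr, ssim
    for metric in ("vmaf", "psnr", "ssim"):
        plcc, _ = pearsonr(df[metric], df["mos"])
        srocc, _ = spearmanr(df[metric], df["mos"])
        print(f"{metric}: PLCC={plcc:.3f}, SROCC={srocc:.3f}")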



https://doi.org/10.1109/MMSP48831.2020.9287065
Ramachandra Rao, Rakesh Rao; Göring, Steve; Steger, Robert; Zadtootaghaj, Saman; Barman, Nabajeet; Fremerey, Stephan; Möller, Sebastian; Raake, Alexander
A large-scale evaluation of the bitstream-based video-quality model ITU-T P.1204.3 on gaming content. - In: IEEE 22nd International Workshop on Multimedia Signal Processing, (2020), 6 pp. in total

The streaming of gaming content, both passive and interactive, has increased manifold in recent years. Gaming content brings with it some peculiarities which are normally not seen in traditional 2D videos, such as the artificial and synthetic nature of the content or the repetition of objects in a game. In addition, the perception of gaming content by the user differs from that of traditional 2D videos due to these peculiarities and also the fact that users may not often watch such content. Hence, it becomes imperative to evaluate whether existing video quality models, usually designed for traditional 2D videos, are applicable to gaming content. In this paper, we evaluate the applicability of the recently standardized bitstream-based video-quality model ITU-T P.1204.3 to gaming content. To analyze the performance of this model, we used 4 different gaming datasets (3 publicly available + 1 internal) not previously used for model training, and compared it with existing state-of-the-art models. We found that the ITU-T P.1204.3 model performs well out of the box on these unseen datasets, with an RMSE ranging between 0.38 and 0.45 on the 5-point absolute category rating scale and a Pearson correlation between 0.85 and 0.93 across all 4 databases. We further propose a full-HD variant of the P.1204.3 model, since the original model was trained and validated targeting a resolution of 4K/UHD-1. A 50:50 split across all databases is used to train and validate this variant so as to make sure that the proposed model is applicable to various conditions.
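
The evaluation criteria mentioned above can be computed as sketched below; the score values are placeholders, and in practice the predictions would come from a P.1204.3 implementation rather than being hard-coded.

    # Sketch: RMSE and Pearson correlation between model predictions and MOS
    # on the 5-point ACR scale.
    import numpy as np
    from scipy.stats import pearsonr

    def evaluate(predicted, subjective):
        predicted = np.asarray(predicted, dtype=float)
        subjective = np.asarray(subjective, dtype=float)
        rmse = np.sqrt(np.mean((predicted - subjective) ** 2))
        plcc, _ = pearsonr(predicted, subjective)
        return rmse, plcc

    # placeholder scores
    rmse, plcc = evaluate([3.8, 2.1, 4.5, 1.9], [4.0, 2.4, 4.3, 1.7])
    print(f"RMSE={rmse:.2f}, PLCC={plcc:.2f}")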



https://doi.org/10.1109/MMSP48831.2020.9287055
Fremerey, Stephan; Hofmeyer, Frank; Göring, Steve; Keller, Dominik; Raake, Alexander
Between the frames - evaluation of various motion interpolation algorithms to improve 360˚ video quality. - In: 2020 IEEE International Symposium on Multimedia, (2020), pp. 65-72

With the increasing availability of 360˚ video content, it becomes important to provide smoothly playing videos of high quality to end users. For this reason, we compare the influence of different Motion Interpolation (MI) algorithms on 360˚ video quality. After conducting a pre-test with 12 video experts in [3], we found that MI is a useful tool to increase the QoE (Quality of Experience) of omnidirectional videos. As a result of the pre-test, we selected three suitable MI algorithms, namely ffmpeg Motion Compensated Interpolation (MCI), Butterflow and Super-SloMo. Subsequently, we interpolated 15 entertaining and real-world omnidirectional videos with a duration of 20 seconds from 30 fps (original framerate) to 90 fps, which is the native refresh rate of the HMD used, the HTC Vive Pro. To assess QoE, we conducted two subjective tests with 24 and 27 participants. In the first test we used a Modified Paired Comparison (M-PC) method, and in the second test the Absolute Category Rating (ACR) approach. In the M-PC test, 45 stimuli were used and in the ACR test 60. Results show that for most of the 360˚ videos, the interpolated versions obtained significantly higher quality scores than the lower-framerate source videos, validating our hypothesis that motion interpolation can improve the overall video quality for 360˚ video. As expected, it was observed that the relative comparisons in the M-PC test result in larger differences in terms of quality. Generally, the ACR method led to similar results while reflecting a more realistic viewing situation. In addition, we compared the different MI algorithms and conclude that, with sufficient computing power available, Super-SloMo should be preferred for the interpolation of omnidirectional videos, while MCI also shows good performance.
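
Of the three algorithms, ffmpeg's motion-compensated interpolation is the easiest to reproduce. A possible invocation from Python is sketched below, with file names and encoder settings as assumptions; Butterflow and Super-SloMo ship as separate tools and are not covered here.

    # Sketch: interpolate a 30 fps source to 90 fps with ffmpeg's minterpolate
    # filter in motion-compensated interpolation (MCI) mode.
    import subprocess

    subprocess.run([
        "ffmpeg", "-i", "input_360_30fps.mp4",
        "-vf", "minterpolate=fps=90:mi_mode=mci",
        "-c:v", "libx264", "-crf", "18",
        "output_360_90fps.mp4",
    ], check=True)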



https://doi.org/10.1109/ISM.2020.00017
Raake, Alexander; Wierstorf, Hagen
Binaural evaluation of sound quality and quality of experience. - In: The technology of binaural understanding, (2020), pp. 393-434

The chapter outlines the concepts of Sound Quality and Quality of Experience (QoE). Building on these, it describes a conceptual model of sound quality perception and experience during active listening in a spatial-audio context. The presented model of sound quality perception considers both bottom-up (signal-driven) as well as top-down (hypothesis-driven) perceptual functional processes. Different studies by the authors and from the literature are discussed in light of their suitability to help develop implementations of the conceptual model. As a key prerequisite, the underlying perceptual ground-truth data required for model training and validation are discussed, as well as means for deriving these from respective listening tests. Both feature-based and more holistic modeling approaches are analyzed. Overall, open research questions are summarized, deriving trajectories for future work on spatial-audio Sound Quality and Quality of Experience modeling.