


Previous reports from the AVT lab

21st IEEE International Symposium on Multimedia (2019 IEEE ISM), Dec 9 - 11, 2019, San Diego, USA

Rakesh Rao Ramachandra Rao, Steve Göring, Werner Robitza, Bernhard Feiten, Alexander Raake

AVT-VQDB-UHD-1: A Large Scale Video Quality Database for UHD-1

4K television screens, or screens with even higher resolutions, are currently available on the market. Moreover, video streaming providers are able to stream videos in 4K resolution and beyond. Therefore, it becomes increasingly important to have a proper understanding of video quality, especially in the case of 4K videos. To this effect, in this paper we present a study of subjective and objective quality assessment of 4K ultra-high-definition videos of short duration, similar to DASH segment lengths.

As a first step, we conducted four subjective quality evaluation tests for compressed versions of the 4K videos. The videos were encoded using three different video codecs, namely H.264, HEVC, and VP9. The resolutions of the compressed videos ranged from 360p to 2160p, with frame rates varying from 15 fps to 60 fps. All source 4K contents were at 60 fps. We included low-quality conditions in terms of bitrate, resolution and frame rate to ensure that the tests cover a wide range of conditions, and that, e.g., possible models trained on this data are more general and applicable to a wider range of real-world applications. The results of the subjective quality evaluation are analyzed to assess the impact of different factors such as bitrate, resolution, frame rate, and content.

In the second step, different state-of-the-art objective quality models, e.g. Netflix's VMAF, were applied to all videos, and their performance was analyzed in comparison with the subjective ratings. The videos, subjective scores (both MOS and confidence interval per sequence) and objective scores are made public for use by the community for further research.
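The per-sequence MOS and confidence intervals published with the database can be computed from the individual subject ratings in the usual way; a minimal sketch (the ratings below are hypothetical, not taken from the database):

```python
import math
import statistics

def mos_and_ci(ratings, z=1.96):
    """Mean opinion score and confidence interval for one processed
    video sequence (z = 1.96 corresponds to a 95% interval)."""
    mos = statistics.mean(ratings)
    # Sample standard deviation over the panel of subjects
    sd = statistics.stdev(ratings)
    ci = z * sd / math.sqrt(len(ratings))
    return mos, ci

# Example: 5-point ACR ratings from a hypothetical panel of 8 subjects
ratings = [5, 4, 4, 5, 3, 4, 4, 5]
mos, ci = mos_and_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")  # MOS = 4.25 +/- 0.49
```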

Link to the videos:

21st IEEE International Workshop on Multimedia Signal Processing (MMSP), September 2019, Kuala Lumpur, Malaysia

A. Singla, W. Robitza and A. Raake

Comparison of Subjective Quality Test Methods for Omnidirectional Video Quality Evaluation

The test methods recommended by the International Telecommunication Union (ITU) for assessing 2D video quality are often used for evaluating omnidirectional / 360° videos. In this paper, we compare the performance of three different test methods, Absolute Category Rating (ACR), a modified version of ACR (M–ACR) with double presentation of the test stimulus, and DSIS (Double Stimulus Impairment Scale), based on the statistical reliability, assessment time and simulator sickness. Different settings were used for HEVC encoding of five 360° source videos of 10 s duration. Results indicate that DSIS is statistically more reliable with higher resolving power, followed by M–ACR and ACR. We found that simulator sickness increases with time, but can be reduced by taking breaks in between the test sessions. The results for simulator sickness are compared across test methods and with similar tests conducted under different contextual conditions. We also recorded and analyzed the exploration behaviour of the users. Apart from the methodological findings, the test results provide insights into video quality for different resolution and encoding settings (“bitrate ladders”). These may be useful for choosing appropriate representations in the context of HTTP-based adaptive streaming in case of full-frame streaming.

MOS with corresponding CIs for different test methods

Best Paper Award

Dominik Keller (AVT Group), Tamara Seybold (ARRI Munich), Janto Skowronek (former AVT Group) and Alexander Raake (AVT Group) received the Best Paper Award at the 11th International Conference on Quality of Multimedia Experience (QoMEX 2019) in Berlin.

The abstract of the paper can be found below.

The winners Dominik Keller and Anton Schubert with the chairman of the Förderverein, Prof. Seitz.

Prizes for Graduates of the AVT Group

For the second time, the Förderverein Elektrotechnik und Informationstechnik e. V. Ilmenau (Association for the Promotion of Electrical Engineering and Information Technology Ilmenau), in conjunction with the Department of Electrical Engineering and Information Technology of the TU Ilmenau, presented its award for outstanding theses. The three endowed prizes honor the achievements of the students and were presented during the exmatriculation ceremony at the end of June. We are pleased that two master's theses of the AVT group, both carried out with industrial partners, were honored as outstanding for their high degree of interdisciplinarity, their scientific character and their execution.

We congratulate the award winners Anton Schubert, who worked on the implementation of a compressed broadband audio codec for driver communication in motor sports, and Dominik Keller, who worked on the identification and analysis of texture dimensions in motion pictures using sensory evaluation techniques.

Dominik Keller, Tamara Seybold, Janto Skowronek, and Alexander Raake
Assessing Texture Dimensions and Video Quality in Motion Pictures using Sensory Evaluation Techniques

The paper, resulting from the cooperation of members of the Audiovisual Technology Group and Scientific and Engineering Academy Award winner ARRI (Arnold & Richter Cine Technik), received the Best Paper Award at this year's 11th International Conference on Quality of Multimedia Experience (QoMEX 2019).

The quality of images and videos is usually examined with well-established subjective tests or instrumental models. These often target content transmitted over the internet, such as streaming or videoconferencing, and address human preferential experience. In the area of high-quality motion pictures, however, other factors are relevant. These are mostly not error-related but concern creative image design, which has gained comparatively little attention in image and video quality research. To determine the perceptual dimensions underlying movie-type video quality, we combine sensory evaluation techniques extensively used in food assessment (Degree of Difference test and Free Choice Profiling) with more classical video quality tests. The main goal of this research is to analyze the suitability of sensory evaluation methods for high-quality video assessment. To understand which features in motion pictures are recognizable and critical to quality, we address the example of image texture properties, measuring human perception and preferences with a panel of image-quality experts. To this aim, different capture settings were simulated by applying sharpening filters as well as digital and analog noise to exemplary source sequences. The evaluation, involving Multidimensional Scaling, Generalized Procrustes Analysis as well as Internal and External Preference Mapping, identified two separate perceptual dimensions. We conclude that Free Choice Profiling combined with a quality test offers the highest level of insight relative to the needed effort. The combination enables a quantitative quality measurement including an analysis of the underlying perceptual reasons.

External Preference Mapping results: Best ratings for stimuli of low noise and medium-high sharpness (Landscape scene)

In a study presented at the QoMEX 2019 conference, we compared the impact of various motion interpolation (MI) algorithms on 360° video Quality of Experience (QoE). To do so, we conducted a subjective test with 12 video expert viewers using a pair comparison test method. We interpolated four different 20 s long 30 fps 360° source contents to the native 90 Hz refresh rate of popular Head-Mounted Displays using three different MI algorithms. Subsequently, we compared these 90 fps videos against each other to investigate the influence on QoE. Regarding the algorithms, we found that ffmpeg blend does not lead to a significant improvement of QoE, while MCI and butterflow do. Additionally, we concluded that for 360° videos containing fast and sudden movements, MCI should be preferred over butterflow, while butterflow is more suitable for slow- and medium-motion videos. Comparing the time needed for rendering the 90 fps interpolated videos, ffmpeg blend is the fastest, while MCI and butterflow need considerably more time.
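Frame-rate conversions of the kind compared in the study can be produced with ffmpeg's minterpolate filter; a sketch of two such invocations (the exact encoder settings and tool versions used in the study are not given here, and the input/output file names are placeholders):

```shell
# Motion-compensated interpolation (MCI) from 30 fps to 90 fps
ffmpeg -i input_30fps.mp4 -vf "minterpolate=fps=90:mi_mode=mci" out_mci.mp4

# Simple frame blending ("ffmpeg blend" in the text above)
ffmpeg -i input_30fps.mp4 -vf "minterpolate=fps=90:mi_mode=blend" out_blend.mp4
```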


Published in 26th IEEE Conference on Virtual Reality and 3D User Interfaces, March 2019, Osaka, Japan

A. Singla, R. R. R. Rao, S. Göring and A. Raake: Assessing Media QoE, Simulator Sickness and Presence for Omnidirectional Videos with Different Test Protocols

QoE for omnidirectional videos comprises additional components such as simulator sickness and presence. In this paper, a series of tests is presented comparing different test protocols to assess integral quality, simulator sickness and presence for omnidirectional videos in one test run, using the HTC Vive Pro as head-mounted display. For quality ratings, the five-point ACR scale was used. In addition, the well-established Simulator Sickness Questionnaire and Presence Questionnaire methods were used, once in a full version and once with only a single integral scale, to analyze how well presence and simulator sickness can be captured using only a single scale.


Ashutosh Singla while presenting his poster at the IEEE VR conference in Japan

Eleventh International Conference on Quality of Multimedia Experience (QoMEX 2019), Berlin, Germany, June 2019

Steve Göring, Rakesh Rao Ramachandra Rao, Alexander Raake

nofu - A Lightweight No-Reference Pixel Based Video Quality Model for Gaming Content

The popularity of streaming services for gaming videos, e.g. Twitch and YouTube Gaming, has increased tremendously over the last years. Compared to classical video streaming applications, gaming videos have additional requirements. For example, it is important that videos are streamed live with only a small delay. In addition, users expect little stalling, low waiting times and in general high video quality during streaming, e.g. using HTTP-based adaptive streaming. These requirements lead to different challenges for quality prediction in the case of streamed gaming videos. We describe newly developed features and a no-reference video quality machine learning model that uses only the recorded video to predict video quality scores. In different evaluation experiments, we compare our proposed model nofu with state-of-the-art reduced-reference and full-reference models and metrics. In addition, we trained a no-reference baseline model using BRISQUE and NIQE features. We show that our model has a similar or better performance than other models. Furthermore, nofu outperforms VMAF for subjective gaming QoE prediction, even though nofu does not require any reference video.


Scatter plot of MOS vs. nofu predictions: results for the gaming dataset and subjective score prediction


7th European Workshop on Visual Information Processing (EUVIP), Tampere, Finland, 26 - 28 November 2018

Steve Göring, Alexander Raake

deimeq – A Deep Neural Network Based Hybrid No-reference Image Quality Model

Current no-reference image quality assessment models are mostly based on hand-crafted features (signal-based, computer vision, ...) or deep neural networks. Using DNNs for image quality prediction leads to several problems: e.g., the input size is restricted, and higher resolutions increase processing time and memory consumption. Large inputs are handled by image patching and aggregating a quality score. In a pure patching approach, connections between the sub-images are lost.

Also, a huge dataset is required for training a DNN from scratch, though only small datasets with annotations are available. We provide a hybrid solution (deimeq) to predict image quality using DNN feature extraction combined with random forest models. Firstly, deimeq uses a pre-trained DNN for feature extraction in a hierarchical sub-image approach; this avoids the need for a huge training dataset. Further, our proposed sub-image approach circumvents pure patching because of the hierarchical connections between the sub-images. Secondly, deimeq can be extended using signal-based features from state-of-the-art models. To evaluate our approach, we chose a strict cross-dataset evaluation with the Live-2 and TID2013 datasets and several pre-trained DNNs. Finally, we show that deimeq and variants of it perform better than or similarly to other methods.
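The hierarchical sub-image idea can be illustrated with a small sketch. This is not the exact splitting scheme of deimeq, merely a quadtree-style hierarchy in which each level's patches are nested inside the regions of the level above, so neighboring patches remain connected through their shared parents:

```python
def hierarchical_subimages(width, height, levels):
    """Return (level, x, y, w, h) regions: the full image at level 0,
    then an n x n grid (n = 2**level) at each deeper level."""
    regions = []
    for level in range(levels + 1):
        n = 2 ** level                     # grid size at this level
        w, h = width // n, height // n     # region size at this level
        for i in range(n):
            for j in range(n):
                regions.append((level, i * w, j * h, w, h))
    return regions

# Example: a 1920x1080 image with two hierarchy levels
regions = hierarchical_subimages(1920, 1080, 2)
print(len(regions))  # 1 full image + 4 quadrants + 16 sub-quadrants = 21
```

Features extracted per region (e.g. from a pre-trained DNN) can then be concatenated level by level and fed to a random forest regressor.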

Picture: General approach for a classification in HD and UHD


Human Vision and Electronic Imaging 2019, Burlingame (California, USA), 13 - 17 January 2019

Steve Göring, Julian Zebelein, Simon Wedel, Dominik Keller, Alexander Raake

Analyze And Predict the Perceptibility of UHD Video Contents

720p, Full HD, 4K, 8K: display resolutions have been increasing heavily over the past years. However, many video streaming providers are currently streaming videos with a maximum of 4K/UHD-1 resolution. Considering that typical viewers enjoy their videos in normal living rooms, where viewing distances are quite large, the question arises whether more resolution is even recognizable. In the following paper, we analyze the problem of UHD perceptibility in comparison with lower resolutions. As a first step, we conducted a subjective video test that focuses on short uncompressed video sequences and compares two different testing methods for pairwise discrimination of two representations of the same source video in different resolutions.

We selected an extended stripe method and a temporal switching method. We found that temporal switching is more suitable for recognizing UHD video content. Furthermore, we developed features that can be used in a machine learning system to predict whether there is a benefit in showing a given video in UHD or not.

Evaluating different models based on these features for predicting perceivable differences shows good performance on the available test data. Our implemented system can be used to verify UHD source video material or to optimize streaming applications.

Award for P. Lebreton, S. Fremerey and A. Raake

As part of the "Grand Challenge on Salient 360!" at this year's IEEE International Conference on Multimedia and Expo (ICME) in San Diego, Dr. Pierre Lebreton (formerly TU Ilmenau, now Zhejiang University, China), Stephan Fremerey and Prof. Dr. Alexander Raake received the award for second place in the category "Prediction of Head Saliency for Images" and fourth place in the category "Prediction of Head Saliency for Videos".

Example for a heatmap showing the head rotation data captured for all participants for one sequence
Image Appeal Prediction Pipeline
YouTube quality according to ITU-T P.1203 dependent on download speed and other factors.

New work on VR, image appeal, and video streaming presented at IEEE QoMEX, ACM MMSys

At this year’s ACM MMSys conference in Amsterdam, Stephan Fremerey published a study and a related Open Source dataset and software. The study was done in collaboration with ARTE G.E.I.E.; the research goal was to gain insights into the exploration behavior of 48 participants watching 20 different 30 s long 360° videos in a task-free scenario. The dataset, containing the Simulator Sickness Questionnaire scores and the head rotation data, and the software to record and evaluate the respective data are published as Open Source.

Steve Göring’s work on image appeal was presented at IEEE QoMEX. The core idea of the paper is to train a model for predicting image liking based on a crawled dataset of 80k images from a photo sharing platform. The features used are based on the image itself, social network analysis, comment analysis and other meta-data provided for images on such platforms.

On the topic of video streaming, Werner Robitza has published a study at QoMEX. The research was conducted in collaboration with Deutsche Telekom, where the quality of YouTube streams was measured under different bandwidth conditions. The paper shows the influence of different measurement scenarios on the measured key performance indicators and quality according to ITU-T P.1203, a standard for HTTP Adaptive Streaming quality estimation.

Together with academic and industry partners from Ericsson, Deutsche Telekom, TU Berlin, NTT Corporation, and NETSCOUT Systems, an open dataset and software for the ITU-T P.1203 standard was published at the ACM MMSys conference. The software can be used freely for non-commercial research purposes. You can find out more about the group’s work on P.1203 on this page.


Prof. Dr. Alexander Raake in conversation with the Thuringian Prime Minister Bodo Ramelow (Photo: M. Döring)

IMT at the summer party of the Thuringian State Representation in Berlin

As in the previous year, the department was represented with demonstrations of its current research topics at the summer party of the Thuringian State Representation in Berlin. At the event in the afternoon and evening of June 26, 2018, interested visitors were able to find out about techniques of virtual reality and the mechanisms used for quality estimation at the booth of the TU Ilmenau. Also, after watching 360° videos, they could see which parts of the video they had actually explored.

Prominent visitor and interested guest was Thuringian Minister-President Bodo Ramelow. Prof. Raake explained the research goals in the field of VR technology.

Prof. Raake was supported during the event by Stephan Fremerey, Matthias Döring and Paul Krabbes from the Institute for Media Technology.

Prof. Schade (center) with the managing director Jürgen Burghardt and the chairman Siegfried Fößel of the FKTG

FKTG Honorary Membership for Prof. Schade

Prof. Dr.-Ing. Hans-Peter Schade, who was the head of the AVT laboratory from 2002 to 2015, was appointed honorary member of the Fernseh- und Kinotechnischen Gesellschaft (FKTG) - Television and Cinematographic Society - at the 28th FKTG Conference on June 4-6, 2018 in Nuremberg.

STEEM project completed

In May 2018, the STEEM project (Speech Transmission End-to-End Monitoring) was successfully completed. Based on a large number of conversational tests, the lab developed an improved model for predicting the perceived quality of IP telephony. In addition to existing models, the influence of background noise and terminal device characteristics was included in the model. The model was handed over to the project partner HEAD acoustics. The solution will find its way into the company's products, enabling network operators and component manufacturers to better predict the voice quality perceived by the customer and to optimize their products and services with regard to customer requirements.

Partial results of the conversational tests carried out in Ilmenau were presented at DAGA 2018. The following figure is taken from this publication.

Conversation test in the AVT test lab with different terminals and a system for defined background noise.


Award for P. Lebreton and A. Raake at ICME 2017

Dr. Pierre Lebreton (formerly TU Ilmenau, now Zhejiang University, China) and Prof. Dr. Alexander Raake received the "Award for Best Head Movement Prediction" at the IEEE International Conference on Multimedia and Expo (ICME) 2017 in Hong Kong.

P.NATS Phase 1 - TU Ilmenau part of all winning groups

Building blocks of the ITU-T P.1203 model

The International Telecommunication Union - Telecommunication Standardization Sector (ITU-T) has recently published the P.1203 series of recommendations. This standard is the first that allows measuring the quality of streaming services such as YouTube, Netflix, etc. The standardized models were developed in a competition within ITU-T Study Group 12 (SG12), previously referred to as P.NATS (Parametric Non-intrusive Assessment of TCP-based multimedia Streaming quality), with seven participating companies. The standardized models are available in four modes of operation, depending on the level of media-related information used for analysis, which in turn reflects the different types of media stream encryption. The models were developed to assess the quality of streaming services up to HD resolution. A modular model framework was used to develop the models; the modular architecture is shown in the figure.
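The modular principle can be sketched as follows. This is a purely illustrative sketch with made-up placeholder formulas, not the standardized P.1203 coefficients; in the actual standard, the audio (Pa), video (Pv) and integration (Pq) modules are specified in P.1203.2, P.1203.1 and P.1203.3, respectively:

```python
def audio_quality(audio_bitrate_kbps):
    """Placeholder Pa module: per-second audio quality on a 1-5 scale."""
    return min(5.0, 1.0 + 4.0 * audio_bitrate_kbps / 128.0)

def video_quality(video_bitrate_kbps, height):
    """Placeholder Pv module: per-second video quality on a 1-5 scale."""
    return min(5.0, 1.0 + 4.0 * video_bitrate_kbps / (height * 4.0))

def integrate(audio_scores, video_scores, stalling_events):
    """Placeholder Pq module: combine per-second audio/video scores and
    stalling events into one session score (illustrative logic only)."""
    base = min(sum(audio_scores) / len(audio_scores),
               sum(video_scores) / len(video_scores))
    # Each stalling event degrades the session score (made-up penalty)
    penalty = 0.5 * len(stalling_events)
    return max(1.0, base - penalty)

# Example: a 10 s session at 2 Mbit/s 1080p video, 128 kbit/s audio
a = [audio_quality(128)] * 10
v = [video_quality(2000, 1080)] * 10
print(integrate(a, v, stalling_events=[]))
```

The point of the sketch is the architecture: per-stream modules produce per-second scores independently, and a separate integration module maps them to one final quality estimate, which is what makes the framework modular across the four modes.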

TU Ilmenau, in collaboration with Deutsche Telekom, has been part of the winning groups for all four modes.


Raake, A., Garcia, M.N., Robitza, W., List, P., Göring, S. and Feiten, B. (2017, May). A bitstream-based, scalable video-quality model for HTTP adaptive streaming: ITU-T P.1203.1. In Ninth International Conference on Quality of Multimedia Experience (QoMEX) (pp. 1-6). IEEE.

Robitza, W., Garcia, M.N. and Raake, A. (2017, May). A modular HTTP adaptive streaming QoE model: Candidate for ITU-T P.1203 ("P.NATS"). In Ninth International Conference on Quality of Multimedia Experience (QoMEX) (pp. 1-6). IEEE.

Links to the standardization documents:
1) [Recommendation P.1203] -
2) [P.1203.1] -
3) [P.1203.2] -
4) [P.1203.3] -

Other links:

QoMEX 2017

In May 2017, the Audiovisual Technology Group of TU Ilmenau organized the 9th International Conference QoMEX 2017 in Erfurt. The conference brought together leading experts from academia and industry to present and discuss current research on multimedia quality, Quality of Experience (QoE) and user experience.

More information is available on the website:

Book "Quality of Experience"

The book "Quality of Experience" appeared in 2014. Alexander Raake is the co-editor, and many other former members of the AIPA team are co-authors of different chapters. The book:

  • develops the definition of "Quality of Experience" for telecommunication services and applies it to various fields of communication and media technology
  • provides examples and guidelines for many fields of application
  • was written by well-known experts in the field

Information at Springer-Verlag
