
At the 16th ACM Multimedia Systems Conference (MMSys'25), which took place from March 31 to April 4, 2025, in Stellenbosch, South Africa, the paper "AMIS: An Audiovisual Dataset for Multimodal XR Research" was presented in the Open Source Software and Datasets (ODS) track. The presentation focused on the practical implementation of the AMIS dataset, which was developed specifically for research on immersive media communication and social XR environments.
The dataset is based on a systematic recording strategy in which audiovisual content was captured synchronously in four modalities: talking-head videos, full-body videos, personalized animated avatars, and volumetric avatars. To create the 3D avatars, real people were recorded in a controlled studio environment using specialized camera and tracking systems. Movement and facial expression data were then extracted from these recordings and used to animate the avatars.
The recorded material includes both monologues and dyadic dialogues, covering a wide range of realistic interaction scenarios, from classic video conferences to immersive XR meetings with lifelike, personalized avatars. The aim of the project is to provide a modular, reusable, and publicly accessible database for research into future forms of communication in virtual and augmented reality.
Further details can be found in the publication: https://dl.acm.org/doi/abs/10.1145/3712676.3718344
Access to the dataset: https://github.com/Telecommunication-Telemedia-Assessment/AMIS