Investigating user embodiment of inverse-kinematic avatars in smartphone Augmented Reality. - In: IEEE Xplore digital library, ISSN 2473-2001, (2022), pp. 666-675
Smartphone Augmented Reality (AR) has already provided us with a plethora of social applications such as Pokemon Go or Harry Potter Wizards Unite. However, to enable smartphone AR for social applications similar to VRChat or AltspaceVR, proper user tracking is necessary to accurately animate the avatars. In Virtual Reality (VR), avatar tracking is rather easy due to the availability of hand tracking, controllers, and HMDs, whereas smartphone AR has only the back (and front) camera and IMUs available for this task. In this paper, we propose ARIKA, a tracking solution for avatars in smartphone AR. ARIKA uses tracking information from ARCore to track the user's hand position and to calculate a pose using Inverse Kinematics (IK). We compare the accuracy of our system against a commercial motion tracking system and compare both systems with respect to sense of agency, self-location, and body-ownership. For this, 20 participants observed their avatars in an augmented virtual mirror and executed a navigation and a pointing task. Our results show that participants felt a higher sense of agency and self-location when using the full-body tracked avatar as opposed to IK avatars. Interestingly and in favor of ARIKA, there were no significant differences in body-ownership between our solution and the full-body tracked avatars. Thus, ARIKA and its single-camera approach are a valid solution for smartphone AR applications where body-ownership is essential.
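The abstract does not describe ARIKA's IK solver, so the following is only a minimal sketch of the general idea of computing a pose from a tracked hand position. It assumes a planar two-segment arm with the shoulder at the origin; the function names `two_link_ik` and `forward`, the segment lengths, and the target are all hypothetical and not taken from the paper.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Analytic IK for a planar two-segment arm (assumed model, not ARIKA's solver).

    Returns (shoulder_angle, elbow_angle) in radians so that the hand
    reaches (x, y), or the closest point if the target is out of reach.
    """
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle from the shoulder-hand distance.
    c2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    c2 = max(-1.0, min(1.0, c2))  # clamp unreachable targets
    elbow = math.acos(c2)
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow

def forward(shoulder, elbow, l1, l2):
    """Forward kinematics: hand position for the given joint angles."""
    ex = l1 * math.cos(shoulder)
    ey = l1 * math.sin(shoulder)
    hx = ex + l2 * math.cos(shoulder + elbow)
    hy = ey + l2 * math.sin(shoulder + elbow)
    return hx, hy

# Hypothetical upper-arm and forearm lengths in meters.
L1, L2 = 0.30, 0.25
angles = two_link_ik(0.30, 0.20, L1, L2)
hand = forward(angles[0], angles[1], L1, L2)
```

Feeding the solved angles back through `forward` reproduces the target for reachable positions, which is a simple way to check a solver of this kind.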
Digital media in intergenerational communication: status quo and future scenarios for the grandparent-grandchild relationship. - In: Universal access in the information society, ISSN 1615-5297, vol. 0 (2022), 0, 16 pp. in total
Communication technologies play an important role in maintaining the grandparent-grandchild (GP-GC) relationship. Based on Media Richness Theory, this study investigates the frequency of use (RQ1) and perceived quality (RQ2) of established media as well as the potential use of selected innovative media (RQ3) in GP-GC relationships with a particular focus on digital media. A cross-sectional online survey and vignette experiment were conducted in February 2021 among N = 286 university students in Germany (mean age 23 years, 57% female) who reported on the direct and mediated communication with their grandparents. In addition to face-to-face interactions, non-digital and digital established media (such as telephone, texting, video conferencing) and innovative digital media, namely augmented reality (AR)-based and social robot-based communication technologies, were covered. Face-to-face and phone communication occurred most frequently in GP-GC relationships: 85% of participants reported them taking place at least a few times per year (RQ1). Non-digital established media were associated with higher perceived communication quality than digital established media (RQ2). Innovative digital media received less favorable quality evaluations than established media. Participants expressed doubts regarding the technology competence of their grandparents, but still met innovative media with high expectations regarding improved communication quality (RQ3). Richer media, such as video conferencing or AR, do not automatically lead to better perceived communication quality, while leaner media, such as letters or text messages, can provide rich communication experiences. More research is needed to fully understand and systematically improve the utility, usability, and joy of use of different digital communication technologies employed in GP-GC relationships.
Neural network adaption for depth sensor replication. - In: The visual computer, ISSN 1432-2315, vol. 38 (2022), 12, pp. 4071-4081
In recent years, various depth sensors that are small enough to be used with mobile hardware have been introduced. They provide important information for use cases like 3D reconstruction or in the context of augmented reality where tracking and camera data alone would be insufficient. However, depth sensors may not always be available due to hardware limitations or when simulating augmented reality applications for prototyping purposes. In these cases, different approaches like stereo matching or depth estimation using neural networks may provide a viable alternative. In this paper, we therefore explore the imitation of depth sensors using deep neural networks. For this, we use a state-of-the-art network for depth estimation and adapt it in order to mimic a Structure Sensor as well as an iPad LiDAR sensor. We evaluate the network, which was pre-trained on NYU V2, both directly and in several variations where transfer learning is applied to adapt the network to different depth sensors using various data preprocessing and augmentation techniques. We show that a transfer learning approach together with appropriate data processing can enable an accurate modeling of the respective depth sensors.
Can communication technologies reduce loneliness and social isolation in older people? : a scoping review of reviews. - In: International journal of environmental research and public health, ISSN 1660-4601, vol. 19 (2022), 18, 11310, pp. 1-20
Background: Loneliness and social isolation in older age are considered major public health concerns and research on technology-based solutions is growing rapidly. This scoping review of reviews aims to summarize the communication technologies (CTs) (review question RQ1), theoretical frameworks (RQ2), study designs (RQ3), and positive effects of technology use (RQ4) present in the research field. Methods: A comprehensive multi-disciplinary, multi-database literature search was conducted. Identified reviews were analyzed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) framework. A total of N = 28 research reviews that cover 248 primary studies spanning 50 years were included. Results: The majority of the included reviews addressed general internet and computer use (82% each) (RQ1). Of the 28 reviews, only one (4%) worked with a theoretical framework (RQ2) and 26 (93%) covered primary studies with quantitative-experimental designs (RQ3). The positive effects of technology use were shown in 55% of the outcome measures for loneliness and 44% of the outcome measures for social isolation (RQ4). Conclusion: While research reviews show that CTs can reduce loneliness and social isolation in older people, causal evidence is limited and insights on innovative technologies such as augmented reality systems are scarce.
AR in TV: design and evaluation of mid-air gestures for moderators to control augmented reality applications in TV. - In: 20th International Conference on Mobile and Ubiquitous Multimedia, (2022), pp. 137-147
Recent developments in augmented reality for TV productions encouraged broadcasters to enhance interaction with virtual content for moderators. However, traditional interaction methods are considered distracting and not intuitive. To overcome these issues, we performed a gesture elicitation study with a follow-up evaluation. For this, we considered TV moderators as primary users of the gestures as well as viewers as recipients. The elicited gesture set consists of five gestures for two types of camera shots (long shot and close shot). Findings of the evaluation study indicate that the derived set of gestures requires low physical and concentration effort from moderators. Also, both moderators and viewers found them appropriate to be used in TV with respect to understandability, distraction, likeability, and appropriateness. Using these gestures would allow moderators to control AR content in TV and tell stories in a modern and more expressive way.
Virtual and augmented reality (VR/AR) : foundations and methods of extended realities (XR). - Cham, Switzerland : Springer, 2022. - x, 429 pages ISBN 3-030-79061-4
Stereoscopic 3D dashboards : an investigation of performance, workload, and gaze behavior during take-overs in semi-autonomous driving. - In: Personal and ubiquitous computing, ISSN 1617-4917, vol. 26 (2022), 3, pp. 697-719
When operating a conditionally automated vehicle, humans occasionally have to take over control. If the driver is out of the loop, a certain amount of time is necessary to gain situation awareness. This work evaluates the potential of stereoscopic 3D (S3D) dashboards for presenting smart S3D take-over-requests (TORs) to support situation assessment. In a driving simulator study with a 4 × 2 between-within design, we presented 3 smart TORs showing the current traffic situation and a baseline TOR in 2D and S3D to 52 participants doing the n-back task. We further investigate whether non-standard locations affect the results. Take-over performance indicates that participants looked at and processed the TORs' visual information and could thereby perform safer take-overs. S3D warnings in general, as well as warnings appearing at the participants’ focus of attention and warnings at the instrument cluster, performed best. We conclude that visual warnings, presented on an S3D dashboard, can be a valid option to support take-over while not increasing workload. We further discuss participants’ gaze behavior in the context of visual warnings for automotive user interfaces.
OUTSIDE: multi-scale semantic segmentation of universal outdoor scenes. - In: IEEE 23rd International Workshop on Multimedia Signal Processing, (2021), 6 pp. in total
Semantic segmentation aims at providing a fine-grained image prediction by assigning each pixel to a specific semantic category. Convolutional neural networks offer significant benefits for solving this problem. However, the success of such networks is closely related to the availability of corresponding data sets. To facilitate semantic segmentation in a broader range of scenarios, such as augmented reality in outdoor environments or universal image-to-image translation, adequate training data sets are necessary. We present OUTSIDE15k, a large-scale data set for semantic segmentation of universal outdoor scenes. The data is labeled with 24 different semantic classes. The images contain multiple outdoor scenarios and cover a variety of different resolutions. Additionally, we present OUTSIDE-Net, an improved neural network architecture integrating multi-level pooling, feature fusion, and a spatial mask for semantic segmentation of universal outdoor scenes. It extracts spatial and semantic features from the input images to perform the segmentation. With the presented data set, we show the capability of our network which outperforms state-of-the-art approaches by achieving up to 91.5% pixel accuracy.
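The abstract reports pixel accuracy without stating the evaluation protocol. The sketch below assumes the common definition, the fraction of pixels whose predicted class equals the ground-truth class; the function name `pixel_accuracy` and the tiny label maps are hypothetical illustrations, not data from OUTSIDE15k.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the ground truth.

    `pred` and `target` are equally sized 2D lists of integer class ids.
    This is the common definition, assumed here; the paper's exact
    protocol (e.g. handling of unlabeled pixels) is not given.
    """
    correct = 0
    total = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            correct += int(p == t)
            total += 1
    return correct / total

# Hypothetical 2x2 label maps: three of four pixels agree.
pred = [[0, 1], [2, 2]]
target = [[0, 1], [1, 2]]
acc = pixel_accuracy(pred, target)  # 0.75
```

Pixel accuracy is dominated by large classes such as sky or road, which is why segmentation papers often report it alongside per-class metrics.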
Saying "Hi" to grandma in nine different ways : established and innovative communication media in the grandparent-grandchild relationship. - In: Technology, Mind, and Behavior, ISSN 2689-0208, (2021), 1 p. in total
Sensor simulation for monocular depth estimation using deep neural networks. - In: 2021 International Conference on Cyberworlds, (2021), pp. 9-16
Depth estimation is one of the basic building blocks for scene understanding. In the case of monocular depth estimation using neural networks, many approaches are highly hardware dependent because they result in a task- and environment-specific optimization problem. Most DNN methods use commonly available datasets, which leads to overfitting on particular sensor properties. Finding a generalized model that takes the different hardware properties of sensors and platforms into account is challenging, if not impossible. For this reason, it is desirable to adapt existing, well-trained models to a new domain so that they can simulate different depth sensors without the need for large datasets and time-consuming learning. Therefore, a small dataset has been created with the Structure Sensor for evaluating the transferable structural characteristic between neural networks. Finally, two input feature representations for the neural networks are considered to mimic the depth sensor, including its artifacts such as holes. The results show that a simple domain adaptation technique and a small dataset are adequate to simulate and adapt to a specific domain from a target domain. Therefore, the network is able to accurately predict depth maps as if they were created by a specific depth sensor. This also includes unique artifacts of the sensor, thereby allowing for a plausible simulation of specific depth sensing hardware, which is beneficial for areas like prototyping in the context of Augmented Reality.