Trash or treasure? Machine-learning based PCB layout anomaly detection with AnoPCB. - In: SMACD/PRIME 2021, (2021), pp. 48-51
FisheyeDistanceNet++: self-supervised fisheye distance estimation with self-attention, robust loss function and camera view generalization. - In: Electronic imaging, ISSN 2470-1173, vol. 2021 (2021), pp. 181-1-181-10
FisheyeDistanceNet proposed a self-supervised monocular depth estimation method for fisheye cameras with a large field of view (> 180˚). To achieve scale-invariant depth estimation, FisheyeDistanceNet supervises depth map predictions over multiple scales during training, which makes training slow and costly. To overcome this bottleneck, we incorporate self-attention layers and a robust loss function into FisheyeDistanceNet. A general adaptive robust loss function helps obtain sharp depth maps without the need to train over multiple scales and allows us to learn the loss function's hyperparameters, aiding optimization in terms of both convergence speed and accuracy. We also ablate the importance of Instance Normalization over Batch Normalization in the network architecture. Finally, we generalize the network to be invariant to camera views by training on multiple perspectives using front, rear, and side cameras. The proposed algorithmic improvements, FisheyeDistanceNet++, result in a 30% relative improvement in RMSE while reducing training time by 25% on the WoodScape dataset. We also obtain state-of-the-art results on the KITTI dataset in comparison to other self-supervised monocular methods.
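The general adaptive robust loss referenced above appears to be Barron's (2019) formulation, in which a shape parameter alpha and a scale c are learned jointly with the network. A minimal NumPy sketch of that loss, under that assumption:

```python
import numpy as np

def general_robust_loss(x, alpha, c):
    """Sketch of the general adaptive robust loss (Barron, 2019).
    x: residuals; alpha: shape parameter; c > 0: scale.
    alpha = 2 recovers L2, alpha -> 0 the Cauchy/Lorentzian loss."""
    eps = 1e-6
    x_sq = (x / c) ** 2
    if abs(alpha) < eps:          # alpha -> 0 limit
        return np.log1p(0.5 * x_sq)
    if abs(alpha - 2.0) < eps:    # alpha -> 2 limit (L2)
        return 0.5 * x_sq
    b = abs(alpha - 2.0)
    return (b / alpha) * ((x_sq / b + 1.0) ** (alpha / 2.0) - 1.0)
```

In the adaptive setting the network effectively tunes alpha per task during training, which is what removes the manual hyperparameter search mentioned in the abstract.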
StickyPillars: robust and efficient feature matching on point clouds using graph neural networks. - In: IEEE Xplore digital library, ISSN 2473-2001, (2021), pp. 313-323
Robust point cloud registration in real time is an important prerequisite for many mapping and localization algorithms. Traditional methods like ICP tend to fail without good initialization, with insufficient overlap, or in the presence of dynamic objects. Modern deep-learning-based registration approaches deliver much better results but suffer from heavy runtimes. We overcome these drawbacks by introducing StickyPillars, a fast, accurate, and extremely robust deep middle-end 3D feature matching method for point clouds. It uses graph neural networks and performs context aggregation on sparse 3D key points with the aid of transformer-based multi-head self- and cross-attention. The network output serves as the cost of an optimal transport problem whose solution yields the final matching probabilities. The system relies on neither hand-crafted feature descriptors nor heuristic matching strategies. We present state-of-the-art accuracy on the registration problem, demonstrated on the KITTI dataset, while being four times faster than leading deep methods. Furthermore, we integrate our matching system into a LiDAR odometry pipeline, yielding the most accurate results on the KITTI odometry dataset. Finally, we demonstrate robustness: our method remains stable in accuracy where state-of-the-art procedures fail under frame drops and at higher speeds.
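The optimal transport step described above is commonly solved with entropy-regularized Sinkhorn iterations. A minimal sketch under that assumption, with uniform marginals and without the dustbin rows/columns a full matcher would need to absorb unmatched key points:

```python
import numpy as np

def sinkhorn_matching(scores, n_iters=50, eps=0.1):
    """Turn a key-point matching score matrix into (soft) match
    probabilities via entropy-regularized optimal transport.
    scores: (m, n) pairwise matching scores from the network."""
    K = np.exp((scores - scores.max()) / eps)     # stabilized Gibbs kernel
    m, n = K.shape
    r = np.full(m, 1.0 / m)                       # uniform row marginals
    c = np.full(n, 1.0 / n)                       # uniform column marginals
    v = np.ones(n)
    for _ in range(n_iters):                      # alternating normalization
        u = r / (K @ v)
        v = c / (K.T @ u)
    return np.diag(u) @ K @ np.diag(v)            # doubly-stochastic plan
```

The resulting transport plan can be thresholded or row-wise argmaxed to obtain hard correspondences for the downstream registration.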
BEVDetNet: Bird's Eye View LiDAR point cloud based real-time 3D object detection for autonomous driving. - In: IEEE Xplore digital library, ISSN 2473-2001, (2021), pp. 2809-2815
3D object detection based on LiDAR point clouds is a crucial module in autonomous driving, particularly for long-range sensing. Most research focuses on achieving higher accuracy, and the resulting models are not optimized for deployment on embedded systems in terms of latency and power efficiency. For high-speed driving scenarios, low latency is crucial, as it leaves more time to react to dangerous situations. Typically, a voxel- or point-cloud-based 3D convolution approach is utilized for this module. Such approaches are, firstly, inefficient on embedded platforms because they are not suitable for efficient parallelization, and secondly, they have a variable runtime depending on the level of sparsity of the scene, which conflicts with the determinism needed in a safety system. In this work, we aim to develop a very low latency algorithm with a fixed runtime. We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box prediction, and orientation prediction using binned classification in a simpler Bird's Eye View (BEV) 2D representation. The proposed architecture can be trivially extended to include semantic segmentation classes such as road without any additional computation. The proposed model has a latency of 4 ms on the embedded Nvidia Xavier platform. The model is 5X faster than other top-accuracy models, with a minimal accuracy degradation of 2% in Average Precision at IoU = 0.5 on the KITTI dataset.
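Orientation prediction via binned classification, as named in the abstract, discretizes the continuous yaw angle into a fixed set of classes so that the head can be a plain classifier. A minimal sketch with a hypothetical bin count (the paper's exact binning may differ):

```python
import numpy as np

BINS = 16  # illustrative bin count, not necessarily the paper's choice

def encode_yaw(yaw):
    """Map a yaw angle in [-pi, pi) to a class index in [0, BINS)."""
    yaw = (yaw + np.pi) % (2 * np.pi)     # shift to [0, 2*pi)
    return int(yaw // (2 * np.pi / BINS))

def decode_yaw(bin_idx):
    """Recover the bin-center angle back in [-pi, pi)."""
    return (bin_idx + 0.5) * (2 * np.pi / BINS) - np.pi
```

Classification over bins avoids the wrap-around discontinuity that makes direct angle regression awkward; finer orientation can then be recovered with a small per-bin residual if needed.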
Development methodologies for safety critical machine learning applications in the automotive domain: a survey. - In: IEEE Xplore digital library, ISSN 2473-2001, (2021), pp. 129-141
Enabled by recent advances in the field of machine learning, the automotive industry is pushing towards automated driving. The development of traditional safety-critical automotive software is subject to rigorous processes, ensuring its dependability while decreasing the probability of failures. However, the development and training of machine learning applications differ substantially from traditional software development. The traditionally prescribed processes and methodologies are unfit to account for specifics such as the importance of datasets during development. We perform a systematic mapping study surveying methodologies proposed for the development of machine learning applications in the automotive domain. We map the identified primary publications to a general machine-learning-based development process and preliminarily assess their maturity. The review's goal is to provide a holistic view of current and previous research contributing to ML-aware development processes and to identify challenges that need more attention. Additionally, we list methods, network architectures, and datasets used within these publications. Our meta-study identifies that model training and model V&V have received by far the most research attention, accompanied by the most mature evaluations. The remaining development phases, concerning domain specification, data management, and model integration, appear underrepresented and in need of more thorough research. Additionally, we identify and aggregate methods typically applied when developing automated driving applications, such as models, datasets, and simulators, showing the state of practice in this field.
Crowd-sourced plant occurrence data provide a reliable description of macroecological gradients. - In: Ecography, ISSN 1600-0587, vol. 44 (2021), no. 8, pp. 1131-1142
Deep learning algorithms classify plant species with high accuracy, and smartphone applications leverage this technology to enable users to identify plant species in the field. The question we address here is whether such crowd-sourced data contain substantial macroecological information. In particular, we aim to understand whether we can detect known environmental gradients shaping plant co-occurrences. In this study we analysed 1 million data points collected through the mobile app Flora Incognita between 2018 and 2019 in Germany and compared them with Florkart, which contains plant occurrence data collected by more than 5000 floristic experts over a 70-year period. The direct comparison of the two data sets reveals that the crowd-sourced data particularly undersample areas of low population density. However, using nonlinear dimensionality reduction we were able to uncover macroecological patterns in both data sets that correspond well to each other. Mean annual temperature, temperature seasonality and wind dynamics, as well as soil water content and soil texture, represent the most important gradients shaping species composition in both data collections. Our analysis describes one way in which automated species identification could soon enable near real-time monitoring of macroecological patterns and their changes, but it also discusses biases that must be carefully considered before crowd-sourced biodiversity data can effectively guide conservation measures.
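To illustrate the kind of analysis described, nonlinear dimensionality reduction applied to a grid-cell-by-species occurrence matrix, here is a sketch using t-SNE from scikit-learn; the study's actual method and data layout may differ, and the matrix below is random placeholder data:

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical presence/absence matrix: 500 grid cells x 120 species.
rng = np.random.default_rng(0)
occurrence = (rng.random((500, 120)) < 0.1).astype(float)

# Embed grid cells into 2D; gradients in the embedding can then be
# correlated with environmental variables (temperature, soil, wind).
embedding = TSNE(n_components=2, init="pca",
                 perplexity=30).fit_transform(occurrence)
```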
SmartPIV: flow velocity estimates by smartphones for education and field studies. - In: Experiments in fluids, ISSN 1432-1114, vol. 62 (2021), no. 8, article 172, pp. 1-13
In this paper, a smartphone application is presented that was developed to lower the barrier to introducing particle image velocimetry (PIV) in lab courses. The first benefit is that a PIV system using smartphones and a continuous-wave (cw) laser is much cheaper than a conventional system and thus far more affordable for universities. The second benefit is that the design of the menus follows that of modern camera apps, which are used intuitively. Thus, the system is much less complex and costly than typical systems, and our experience showed that students have far fewer reservations about working with the system and trying different parameters. Last but not least, the app can be applied in the field. The relative uncertainty was shown to be less than 8%, which is reasonable for quick velocity estimates. An analysis of the computational time necessary for the data evaluation showed that, with the current implementation, the app is capable of providing a smooth live display of the flow's vector fields. This might further increase the use of modern measurement techniques in industry and education.
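At the core of any PIV evaluation, a smartphone implementation included, is the cross-correlation of interrogation windows from two consecutive frames: the correlation peak gives the mean particle displacement in that window. A minimal sketch of the principle, omitting the sub-pixel peak fitting, outlier validation, and calibration to physical units a real app needs:

```python
import numpy as np

def piv_displacement(win_a, win_b):
    """Estimate the pixel displacement between two interrogation
    windows via FFT-based cross-correlation (peak = displacement)."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    corr = np.fft.irfft2(np.conj(np.fft.rfft2(a)) * np.fft.rfft2(b),
                         s=a.shape)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap indices so shifts beyond half the window map to negatives.
    dy = peak[0] if peak[0] <= a.shape[0] // 2 else peak[0] - a.shape[0]
    dx = peak[1] if peak[1] <= a.shape[1] // 2 else peak[1] - a.shape[1]
    return dx, dy
```

Dividing the displacement by the inter-frame time and the image magnification then yields the velocity estimate.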
Reactive auto-completion of modeling activities. - In: IEEE transactions on software engineering, ISSN 1939-3520, vol. 47 (2021), no. 7, pp. 1431-1451
Assisting and automating software engineering tasks is a state-of-the-art way to support stakeholders of development projects. A common assistance function of IDEs is the auto-completion of source code. Such assistance functions are almost entirely missing in modeling tools, even though auto-completion continues to gain importance in software development. We analyze a user's performed editing operations in order to anticipate modeling activities and to recommend appropriate auto-completions for them. Editing operations are captured as events, and modeling activities are defined as complex event patterns, facilitating the matching by complex event processing. The approach provides adapted auto-completions reactively upon each editing operation of the user. We implemented the RapMOD prototype as an add-in for the modeling tool Sparx Enterprise Architect. A controlled user experiment with 37 participants performing modeling tasks demonstrated the approach's potential to reduce modeling effort significantly. Users having auto-completions available for a modeling scenario performed the task 27 percent faster, needed to perform 56 percent fewer actions, and perceived the task as 29 percent less difficult.
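To illustrate the idea of treating editing operations as an event stream and modeling activities as complex event patterns, here is a deliberately simplified sketch; the event names, pattern, and proposal text are hypothetical and not RapMOD's actual API:

```python
from collections import deque

# Hypothetical activity pattern: two classes followed by an association
# suggests the user is scaffolding a class diagram.
PATTERN = ("create_class", "create_class", "create_association")

class ActivityMatcher:
    """Minimal complex-event-pattern matcher over editing operations."""
    def __init__(self, pattern):
        self.pattern = pattern
        self.window = deque(maxlen=len(pattern))  # sliding event window

    def on_event(self, event_type):
        """Feed one editing operation; return a completion proposal
        when the most recent events match the activity pattern."""
        self.window.append(event_type)
        if tuple(self.window) == self.pattern:
            return "propose: auto-complete class diagram scaffold"
        return None
```

A production engine would additionally handle interleaved events, timing constraints, and partial matches, which is what dedicated complex-event-processing frameworks provide.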
The Flora Incognita app - interactive plant species identification. - In: Methods in ecology and evolution, ISSN 2041-210X, vol. 12 (2021), no. 7, pp. 1335-1342
Being able to identify plant species is an important factor for understanding biodiversity and its change due to natural and anthropogenic drivers. We discuss the freely available Flora Incognita app for Android, iOS and Harmony OS devices, which allows users to interactively identify plant species and capture their observations. Specifically developed deep learning algorithms, trained on an extensive repository of plant observations, classify plant images with unprecedented accuracy. By using this technology in a context-adaptive and interactive identification process, users are now able to reliably identify plants regardless of their botanical knowledge level. Users benefit from an intuitive interface and supplementary educational materials. The captured observations in combination with their metadata provide a rich resource for researching, monitoring and understanding plant diversity. Mobile applications such as Flora Incognita stimulate the successful interplay of citizen science, conservation and education.
Defocus particle tracking: a comparison of methods based on model functions, cross-correlation, and neural networks. - In: Measurement science and technology, ISSN 1361-6501, vol. 32 (2021), no. 9, article 094011, 14 pp.
Defocus particle tracking (DPT) has gained increasing importance for its ability to determine particle trajectories in all three dimensions with a single-camera system, as is typical for a standard microscope, the workhorse of today's ongoing biomedical revolution. DPT methods derive the depth coordinates of particle images from the different defocusing patterns that they show when observed in a volume much larger than the respective depth of field. It has therefore become common for state-of-the-art methods to apply image recognition techniques. Two of the most widely used DPT approaches are the application of (astigmatism) particle image model functions (MF methods) and the normalized cross-correlation between measured particle images and reference templates (CC methods). Though still young in the field, the use of neural networks (NN methods) is expected to play a significant role in future, more complex defocus tracking applications. To assess the different strengths of these defocus tracking approaches, we present in this work a general and objective assessment of their performance when applied to synthetic and experimental images of different degrees of astigmatism, noise levels, and particle image overlap. We show that MF methods work very well in low-concentration cases, while CC methods are more robust and perform better in cases of larger particle concentration and thus stronger particle image overlap. The tested NN methods generally showed the lowest performance; however, in comparison to the MF and CC methods, they are still at an early stage and have great potential to develop within the field of DPT.
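The CC methods compared here reduce, in essence, to finding the calibration template whose normalized cross-correlation with the measured particle image is maximal, and reading off that template's depth. A minimal sketch of that lookup:

```python
import numpy as np

def ncc(img, tpl):
    """Zero-mean normalized cross-correlation between a measured
    particle image and one reference template (same shape)."""
    a = img - img.mean()
    b = tpl - tpl.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def depth_from_templates(img, templates, z_positions):
    """CC-method sketch: the depth estimate is the z of the template
    that best matches the observed defocus pattern."""
    scores = [ncc(img, t) for t in templates]
    return z_positions[int(np.argmax(scores))]
```

Practical implementations interpolate between neighboring template scores to obtain sub-step depth resolution.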
SynDistNet: self-supervised monocular fisheye camera distance estimation synergized with semantic segmentation for autonomous driving. - In: 2021 IEEE Winter Conference on Applications of Computer Vision, (2021), pp. 61-71
State-of-the-art self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity and do not generalize well when applied to distance estimation for complex projection models such as fisheye and omnidirectional cameras. This paper introduces a novel multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhole camera images. Our contribution is threefold: Firstly, we introduce a novel distance estimation network architecture using a self-attention-based encoder coupled with robust semantic feature guidance to the decoder that can be trained in a one-stage fashion. Secondly, we integrate a generalized robust loss function, which improves performance significantly while removing the need for hyperparameter tuning with the reprojection loss. Finally, we reduce the artifacts caused by dynamic objects violating the static-world assumption using a semantic masking strategy. We significantly improve upon previous work on fisheye cameras, reducing RMSE by 25%. As there is little work on fisheye cameras, we also evaluated the proposed method on KITTI using a pinhole model and achieved state-of-the-art performance among self-supervised methods without requiring an external scale estimation.
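The semantic masking strategy mentioned above amounts to excluding pixels predicted as dynamic classes from the reprojection loss, since moving objects violate the static-world assumption that self-supervised view synthesis relies on. A sketch under that reading, with assumed class ids:

```python
import numpy as np

DYNAMIC_CLASSES = {11, 12, 13}  # assumed ids, e.g. person/car/bicycle

def masked_photometric_loss(photo_err, seg_labels):
    """Average the per-pixel photometric (reprojection) error over
    static pixels only; dynamic-class pixels are masked out."""
    static = ~np.isin(seg_labels, list(DYNAMIC_CLASSES))
    return float(photo_err[static].mean()) if static.any() else 0.0
```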
OmniDet: surround view cameras based multi-task visual perception network for autonomous driving. - In: IEEE Robotics and automation letters, ISSN 2377-3766, vol. 6 (2021), no. 2, pp. 2830-2837
Surround-view fisheye cameras are commonly deployed in automated driving for 360˚ near-field sensing around the vehicle. This work presents a multi-task visual perception network on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. We demonstrate that the jointly trained model performs better than the respective single-task versions. Our multi-task model has a shared encoder, providing a significant computational advantage, and synergized decoders where tasks support each other. We propose a novel camera-geometry-based adaptation mechanism to encode the fisheye distortion model both at training and inference. This was crucial to enable training on the WoodScape dataset, which comprises data from different parts of the world collected by 12 different cameras mounted on three different cars with different intrinsics and viewpoints. Given that bounding boxes are not a good representation for distorted fisheye images, we also extend object detection to use a polygon with non-uniformly sampled vertices. We additionally evaluate our model on standard automotive datasets, namely KITTI and Cityscapes. We obtain state-of-the-art results on KITTI for the depth estimation and pose estimation tasks and competitive performance on the other tasks. We perform extensive ablation studies on various architecture choices and task-weighting methodologies. A short video at https://youtu.be/xbSjZ5OfPes provides qualitative results.
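A structural sketch of the shared-encoder/multiple-decoder layout described above, written in PyTorch with placeholder layer sizes and only two of the six task heads shown; the actual OmniDet architecture differs in depth and detail:

```python
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Illustrative shared-encoder multi-task network: one encoder
    computes features once, and each task has its own light decoder."""
    def __init__(self, n_seg_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(              # shared across tasks
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.distance_head = nn.Conv2d(64, 1, 1)   # per-task decoders
        self.seg_head = nn.Conv2d(64, n_seg_classes, 1)

    def forward(self, x):
        feats = self.encoder(x)                    # computed once
        return {"distance": self.distance_head(feats),
                "segmentation": self.seg_head(feats)}
```

Sharing the encoder is where the computational advantage comes from: the backbone cost is paid once regardless of the number of tasks.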
Synaptic scaling - an artificial neural network regularization inspired by nature. - In: IEEE transactions on neural networks and learning systems, ISSN 2162-237X, (2021), pp. 1-15
Multi-view classification with convolutional neural networks. - In: PLOS ONE, ISSN 1932-6203, vol. 16 (2021), no. 1, e0245230, 17 pp.
Pollen analysis using multispectral imaging flow cytometry and deep learning. - In: The new phytologist, ISSN 1469-8137, vol. 229 (2021), no. 1, pp. 593-606
Pollen identification and quantification are crucial but challenging tasks, not only for addressing a variety of evolutionary and ecological questions (pollination, paleobotany) but also for other fields of research (e.g. allergology, honey analysis or forensics). Researchers are exploring alternative methods to automate these tasks but, for several reasons, manual microscopy is still the gold standard. In this study, we present a new method for pollen analysis using multispectral imaging flow cytometry in combination with deep learning. We demonstrate that our method allows fast measurement while delivering highly accurate pollen identification. A dataset of 426,876 images depicting pollen from 35 plant species was used to train a convolutional neural network classifier. We found the best-performing classifier to yield a species-averaged accuracy of 96%. Even species that are difficult to differentiate using microscopy could be clearly separated. Our approach also allows a detailed determination of morphological pollen traits, such as size, symmetry or structure. Our phylogenetic analyses suggest phylogenetic conservatism in some of these traits. Given a comprehensive pollen reference database, we provide a powerful tool to be used in any pollen study with a need for rapid and accurate species identification, pollen grain quantification and trait extraction of recent pollen.
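Species-averaged accuracy, as reported above, is the per-class accuracy averaged over species rather than over images, so rare species count as much as common ones. A minimal sketch of the metric:

```python
import numpy as np

def species_averaged_accuracy(y_true, y_pred):
    """Macro-averaged accuracy: compute per-species accuracy (recall),
    then average over species. y_true/y_pred: integer class arrays."""
    classes = np.unique(y_true)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(per_class))
```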