Estimating food ingredient compositions based on mandatory product labeling. - In: Journal of food composition and analysis, ISSN 0889-1575, vol. 110 (2022), 104508, pp. 1-9
Knowing the actual ingredient composition of a product makes it possible to calculate additional nutritional information, such as its content of fatty and amino acids, minerals and vitamins, and to determine its environmental impact. Unfortunately, producers rarely state how much of each ingredient a product contains. Food manufacturers are, however, required to declare their products with a label comprising an ingredient list (in descending order) and Big7 nutrient values. In this paper, we propose an automated approach for estimating ingredient contents in food products. First, we parse product labels to extract the declared ingredients. Next, we formulate the problem mathematically under the assumption that the weighted sum of the ingredients' Big7 values, as available from food composition tables, should resemble the product’s declared overall Big7 composition. We apply mathematical optimization techniques to find the best-fitting ingredient composition estimate. We apply the proposed method to a dataset of 1804 food products spanning 11 product categories. We find that 76% of these products could be analyzed by our approach, and a composition within the prescribed nutrient tolerances could be calculated, using on average 20% of the allowed tolerance per Big7 nutrient. The remaining 24% of the food products could still be estimated when relaxing one or more nutrient tolerances. A study with known ingredient compositions shows that estimates are within a 0.9% difference of the products’ actual recipes. Hence, the automated approach presented here allows large numbers of products to be analyzed and opens up possibilities for more intensive nutritional and ecological evaluations of food.
https://doi.org/10.1016/j.jfca.2022.104508
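The fitting idea described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract only says that mathematical optimization is used, whereas this example swaps in a plain grid search over whole-percent shares. The ingredient names, the three nutrient columns, and all numeric values are hypothetical placeholders; real values would come from a food composition table.

```python
from itertools import product

# Hypothetical per-100 g nutrient values (fat, carbohydrate, protein) for
# three ingredients, listed in label order. These numbers are assumptions.
INGREDIENTS = {
    "wheat flour": (1.0, 72.0, 10.0),
    "sugar": (0.0, 100.0, 0.0),
    "butter": (82.0, 1.0, 1.0),
}


def mix(shares, table):
    """Weighted sum of ingredient nutrients; shares are whole percentages."""
    n_nutrients = len(next(iter(table.values())))
    return tuple(
        sum(s * vals[k] for s, vals in zip(shares, table.values())) / 100.0
        for k in range(n_nutrients)
    )


def estimate(declared, table, step=1):
    """Grid-search the shares (in percent) that best match the declared label.

    Enforces two label constraints: shares sum to 100 and follow the
    descending order of the ingredient list.
    """
    names = list(table)
    best, best_err = None, float("inf")
    for head in product(range(0, 101, step), repeat=len(names) - 1):
        last = 100 - sum(head)
        shares = head + (last,)
        # Skip infeasible candidates (negative share or wrong ordering).
        if last < 0 or any(a < b for a, b in zip(shares, shares[1:])):
            continue
        # Squared deviation between mixed and declared nutrient values.
        err = sum((m - d) ** 2 for m, d in zip(mix(shares, table), declared))
        if err < best_err:
            best, best_err = shares, err
    return dict(zip(names, best)), best_err
```

As a sanity check, a label computed with `mix((60, 30, 10), INGREDIENTS)` and passed to `estimate` should recover those shares. Exhaustive search is only workable for a handful of ingredients; longer ingredient lists and nutrient tolerances would call for a proper constrained solver.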
SEOSS-Queries - a software engineering dataset for text-to-SQL and question answering tasks. - In: Data in Brief, ISSN 2352-3409, vol. 42 (2022), 108211, pp. 1-11
https://doi.org/10.1016/j.dib.2022.108211
Deep learning in plant phenological research: a systematic literature review. - In: Frontiers in plant science, ISSN 1664-462X, vol. 13 (2022), 805738, pp. 1-18
Climate change represents one of the most critical threats to biodiversity, with far-reaching consequences for species interactions, the functioning of ecosystems, and the assembly of biotic communities. Plant phenology research has gained increasing attention because the timing of periodic events in plants is strongly affected by seasonal and interannual climate variation. Recent technological developments have allowed us to gather invaluable data at a variety of spatial and ecological scales. The feasibility of phenological monitoring today and in the future depends heavily on developing tools capable of efficiently analyzing these enormous amounts of data. Deep neural networks learn representations from data with impressive accuracy and have led to significant breakthroughs in, e.g., image processing. This article is the first systematic literature review aiming to thoroughly analyze all primary studies on deep learning approaches in plant phenology research. In a multi-stage process, we selected 24 peer-reviewed studies published in the last five years (2016-2021). After carefully analyzing these studies, we describe the applied methods, categorized according to the studied phenological stages, vegetation type, spatial scale, and data acquisition and deep learning methods. Furthermore, we identify and discuss research trends and highlight promising future directions. We present a systematic overview of methods previously applied to different tasks, which can guide this emerging and complex research field.
https://doi.org/10.3389/fpls.2022.805738
Direct data-driven forecast of local turbulent heat flux in Rayleigh-Bénard convection. - In: Physics of fluids, ISSN 1089-7666, vol. 34 (2022), 4, pp. 045106-1-045106-14
A combined convolutional autoencoder-recurrent neural network machine learning model is presented to directly analyze and forecast the dynamics and low-order statistics of the local convective heat flux field in a two-dimensional turbulent Rayleigh-Bénard convection flow at Prandtl number Pr=7 and Rayleigh number Ra=10^7. Two recurrent neural networks are applied for the temporal advancement of turbulent heat transfer data in the reduced latent data space: an echo state network and a gated recurrent unit. Thereby, our work exploits the modular combination of three different machine learning algorithms to build a fully data-driven and reduced model for the dynamics of the turbulent heat transfer in a complex thermally driven flow. The convolutional autoencoder with 12 hidden layers is able to reduce the dimensionality of the turbulence data to about 0.2% of their original size. Our results indicate fairly good accuracy in the first- and second-order statistics of the convective heat flux. The algorithm is also able to reproduce, with some deviations, the intermittent plume-mixing dynamics at the upper edges of the thermal boundary layers. The same holds for the probability density function of the local convective heat flux, with differences in the far tails. Furthermore, we demonstrate the noise resilience of the framework. This suggests that the present model might be applicable as a reduced dynamical model that delivers transport fluxes and their variations to coarse grids of larger-scale computational models, such as global circulation models for atmosphere and ocean.
https://doi.org/10.1063/5.0087977
Graph based mining of code change patterns from version control commits. - In: IEEE transactions on software engineering, ISSN 1939-3520, vol. 48 (2022), 3, pp. 848-863
Detailed knowledge of frequently recurring code changes can be beneficial for a variety of software engineering activities. For example, it is a key step in understanding the process of software evolution, and it is also necessary when developing more sophisticated code completion features that predict likely changes. Previous attempts at automatically finding such code change patterns were mainly based on frequent itemset mining, which essentially finds sets of edits occurring in close proximity. However, these approaches do not analyze the interplay among code elements, e.g., two code objects being named similarly, and thereby neglect great potential for identifying a number of meaningful patterns. We present a novel method for the automated mining of code change patterns from Git repositories that captures these context relations between individual edits. Our approach relies on a transformation of source code into a graph representation that preserves the relevant relations. We then apply graph mining techniques to extract frequent subgraphs, which can be used for further analysis of development projects. We suggest multiple usage scenarios for the resulting pattern type. Additionally, we propose a transformation into complex event processing (CEP) rules, which allows for easier application, especially for event-based auto-completion recommenders or similar tools. For evaluation, we mined seven open-source code repositories. We present 25 frequent change patterns occurring across these projects. We found these patterns to be meaningful, easy to interpret and mostly persistent across project borders. On average, a pattern from our set appeared in 45 percent of the analyzed code changes.
https://doi.org/10.1109/TSE.2020.3004892
StickyLocalization: robust end-to-end relocalization on point clouds using graph neural networks. - In: IEEE Xplore digital library, ISSN 2473-2001, (2022), pp. 307-316
Relocalization inside pre-built maps provides a major benefit for today’s autonomous driving tasks, where the map can be considered an additional sensor for refining the estimated current pose of the vehicle. Due to potentially large drifts in the initial pose guess, as well as maps containing unfiltered dynamic and temporarily static objects (e.g., parked cars), traditional methods like ICP tend to fail and show high computation times. We propose a novel and fast relocalization method for accurate pose estimation inside a pre-built map based on 3D point clouds. The method is robust against inaccurate initialization caused by low-performance GPS systems and tolerates the presence of unfiltered objects by specifically learning to extract significant features from current scans and adjacent map sections. More specifically, we introduce a novel distance-based matching loss that enables us to simultaneously extract important information from raw point clouds and aggregate inner- and inter-cloud context by utilizing self- and cross-attention inside a Graph Neural Network. We evaluate StickyLocalization’s (SL) performance through an extensive series of experiments on two benchmark datasets: relocalization on NuScenes and loop closing on KITTI’s Odometry dataset. We found that SL outperforms state-of-the-art point cloud registration and relocalization methods in terms of transformation errors and runtime.
https://doi.org/10.1109/WACV51458.2022.00038
PRECODE - a generic model extension to prevent deep gradient leakage. - In: IEEE Xplore digital library, ISSN 2473-2001, (2022), pp. 3605-3614
Collaborative training of neural networks leverages distributed data by exchanging gradient information between different clients. Although the training data resides entirely with the clients, recent work shows that training data can be reconstructed from such exchanged gradient information. To enhance privacy, gradient perturbation techniques have been proposed. However, they come at the cost of reduced model performance, increased convergence time, or increased data demand. In this paper, we introduce PRECODE, a PRivacy EnhanCing mODulE that can be used as a generic extension for arbitrary model architectures. We propose a simple yet effective realization of PRECODE using variational modeling. The stochastic sampling induced by variational modeling effectively prevents privacy leakage from gradients and in turn preserves the privacy of data owners. We evaluate PRECODE using state-of-the-art gradient inversion attacks on two different model architectures trained on three datasets. In contrast to commonly used defense mechanisms, we find that our proposed modification consistently reduces the attack success rate to 0% while having almost no negative impact on model training and final performance. As a result, PRECODE reveals a promising path towards privacy-enhancing model extensions.
https://doi.org/10.1109/WACV51458.2022.00366
Image-based automated recognition of 31 Poaceae species: the most relevant perspectives. - In: Frontiers in plant science, ISSN 1664-462X, vol. 12 (2022), 804140, pp. 1-12
Poaceae represent one of the largest plant families in the world. Many species are of great economic importance as food and forage plants, while others are important agricultural weeds. Although a large number of studies currently address the question of how plants can best be recognized in images, there is a lack of studies evaluating specific approaches for uniform species groups that are considered difficult to identify because they lack obvious visual characteristics. Poaceae are an example of such a species group, especially when they are non-flowering. Here we present the results of an experiment to automatically identify Poaceae species based on images depicting six well-defined perspectives. One perspective shows the inflorescence, while the others show vegetative parts of the plant such as the collar region with the ligule, the adaxial and abaxial sides of the leaf, and culm nodes. For each species we collected 80 observations, each representing a series of six images taken with a smartphone camera. We extract feature representations from the images using five different convolutional neural networks (CNNs) trained on objects from different domains and classify them using four state-of-the-art classification algorithms. We combine these perspectives via score-level fusion. To evaluate the potential of identifying non-flowering Poaceae, we separately compared perspective combinations with and without inflorescences. We find that for a fusion of all six perspectives, using the best combination of feature extraction CNN and classifier, an accuracy of 96.1% can be achieved. Without the inflorescence, the overall accuracy is still as high as 90.3%. In all but one case, the perspective conveying the most information about the species (excluding the inflorescence) is the ligule in frontal view.
Our results show that even species considered very difficult to identify can be identified automatically with high accuracy, as long as images depicting suitable perspectives are available. We suggest that our approach could be transferred to other difficult-to-distinguish species groups in order to identify the most relevant perspectives.
https://doi.org/10.3389/fpls.2021.804140
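Score-level fusion, as mentioned in this abstract, can be sketched in a few lines. The abstract does not say which fusion rule was used, so the mean (sum) rule below is an assumption, and the species labels and score values are hypothetical placeholders rather than data from the study.

```python
# Hypothetical per-perspective class scores for one observation; in practice
# these would be classifier outputs for each image perspective.
PERSPECTIVE_SCORES = [
    {"species A": 0.6, "species B": 0.4},  # e.g. ligule, frontal view
    {"species A": 0.3, "species B": 0.7},  # e.g. leaf, adaxial side
    {"species A": 0.8, "species B": 0.2},  # e.g. inflorescence
]


def fuse_scores(per_perspective_scores):
    """Average class scores over perspectives (mean rule, an assumption).

    Returns the class with the highest fused score and the fused scores.
    """
    n = len(per_perspective_scores)
    classes = per_perspective_scores[0].keys()
    fused = {
        c: sum(scores[c] for scores in per_perspective_scores) / n
        for c in classes
    }
    return max(fused, key=fused.get), fused


label, fused = fuse_scores(PERSPECTIVE_SCORES)
```

Dropping a perspective from the input list, for example the inflorescence, gives the kind of with/without comparison the abstract describes.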
Propagating frugal user feedback through closeness of code dependencies to improve IR-based traceability recovery. - In: Empirical software engineering, ISSN 1573-7616, vol. 27 (2022), 2, 41, 53 pp. in total
Traceability recovery captures trace links among different software artifacts (e.g., requirements and code) when two artifacts cover the same part of the system's functionality. These trace links provide important support for developers in software maintenance and evolution tasks. Information Retrieval (IR) is now the mainstream technique for semi-automatic approaches that recover candidate trace links based on textual similarities among artifacts. The performance of IR-based traceability recovery is evaluated by the ranking of relevant traces in the generated lists of candidate links. Unfortunately, this performance is greatly hindered by the vocabulary mismatch problem between different software artifacts. To address this issue, a growing body of enhancing strategies based on user feedback has been proposed to adjust the calculated IR values of candidate links after the user verifies part of these links. However, the improvement brought by strategies of this kind requires a large amount of user feedback, which could be infeasible in practice. In this paper, we propose to improve IR-based traceability recovery by propagating a small amount of user feedback through a closeness analysis of call and data dependencies in the code. Specifically, our approach first iteratively asks users to verify a small set of candidate links. The collected frugal feedback is then combined with the quantified functional similarity for each code dependency (called closeness) and the generated IR values to improve the ranking of unverified links. An empirical evaluation based on nine real-world systems with three mainstream IR models shows that our approach can outperform five baseline approaches using only a small amount of user feedback.
https://doi.org/10.1007/s10664-021-10091-5
Deep security analysis of program code : a systematic literature review. - In: Empirical software engineering, ISSN 1573-7616, vol. 27 (2022), 1, 2, 39 pp. in total
Due to the continuous digitalization of our society, distributed and web-based applications are becoming omnipresent, and making them more secure is gaining paramount relevance. Deep learning (DL) and its representation learning approach are increasingly being proposed for program code analysis, potentially providing a powerful means of making software systems less vulnerable. This systematic literature review (SLR) aims at a thorough analysis and comparison of 32 primary studies on DL-based vulnerability analysis of program code. We found a rich variety of proposed analysis approaches, code embeddings and network topologies. We discuss these techniques and alternatives in detail. By compiling commonalities and differences in the approaches, we identify the current state of research in this area and discuss future directions. We also provide an overview of publicly available datasets in order to foster stronger benchmarking of approaches. This SLR provides an overview and starting point for researchers interested in deep vulnerability analysis of program code.
https://doi.org/10.1007/s10664-021-10029-x
Multi-task near-field perception for autonomous driving using surround-view fisheye cameras. - Ilmenau : Universitätsbibliothek, 2021. - 1 online resource (xxv, 219 pages)
Technische Universität Ilmenau, dissertation, 2021
Bibliography: pages 183-219
The formation of eyes led to the big bang of evolution. The dynamics changed from a primitive organism waiting for food to come into contact with it to an organism that sought food using visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has flaws. Over millions of years, humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships. Automating these capabilities for computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. Near-field visual perception in the context of self-driving cars covers the environment within a range of 0-10 meters and 360° around the vehicle. It is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have produced mature visual perception solutions. So far, the focus has been on far-field perception. Another significant issue is the limited processing power available for developing real-time applications. Because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. We concentrate on the following topics to address these issues: 1) Developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks, such as geometric and semantic tasks, using convolutional neural networks.
2) Using multi-task learning to overcome computational bottlenecks by sharing initial convolutional layers between tasks and developing optimization strategies that balance the tasks.
https://doi.org/10.22032/dbt.50751