Please note that the Hochschulbibliographie reflects the data as of 31 July 2024.
All newer entries can be found in the Universitätsbibliographie of Technische Universität Ilmenau (TUUniBib).

Number of hits: 331
Generated: Sat, 14 Sep 2024 12:06:20 +0200 in 0.0906 sec


Lasch, Robert;
Heterogeneous memory technologies in database management systems. - Ilmenau : Universitätsbibliothek, 2024. - 1 online resource (xii, 168 pages)
Technische Universität Ilmenau, Dissertation 2024

Database systems are a fundamental building block of the modern IT landscape. Their operation often contributes significantly to the total cost of IT systems. Consequently, there is strong economic pressure to improve the cost efficiency of database systems. At the same time, a large share of the operating cost of database systems is attributable to their main memory consumption. Improvements in memory efficiency therefore translate directly into higher cost efficiency. The most widely used underlying memory technology today is DRAM. Over time, however, several heterogeneous technological alternatives have been developed. In this thesis, we present two approaches to improving the memory efficiency of database systems. First, we investigate the usability of heterogeneous memory technologies as a more cost-efficient alternative to current systems that rely exclusively on expensive DRAM-based memory. To this end, we develop a cost model for data placement in systems with hybrid memory. We find that an in-memory database system can place a large part of its base data in slower, cheaper memory with only minor performance degradation. Second, the competing memory demands of base data and temporary data in typical database systems often lead to overprovisioning of memory capacity. We thus find, somewhat surprisingly, that substantial gains in memory efficiency are still possible even without heterogeneous memory technologies. To this end, we present a cooperative approach to handling conflicting memory demands. Using a prototype, we compare the cooperative approach with the approach traditionally used in existing systems. Our evaluation shows that cooperative memory management can considerably reduce the overall memory requirement of a database system, especially for mixed workloads.
Ultimately, the two paths outlined in this thesis can serve as a solid foundation for significantly increasing the cost efficiency of database systems.



https://doi.org/10.22032/dbt.62299
Bedini, Francesco; Räth, Timo; Maschotta, Ralph; Sattler, Kai-Uwe; Zimmermann, Armin
Automated transformation of a domain-specific language for system modeling to Stochastic Colored Petri Nets. - In: IEEE Xplore digital library, ISSN 2473-2001, (2024), 8 pp. in total

Petri Net models are widely recognized for their ability to analyze concurrent, stochastic processes based on a solid mathematical foundation. However, one drawback of Petri Nets is their low-level abstraction: they offer only a few basic elements like places and transitions to represent all system components. While this limitation may not be an issue when working with small models, it becomes challenging when attempting to model larger processes or systems. As the complexity increases, the number of elements in the Petri Net also grows, making it difficult to distinguish and maintain them effectively. Furthermore, Petri Nets require verification to ensure that they accurately represent the behavior of the system they are intended to model. This verification process must be repeated whenever a model is created or modified. To address these challenges, this paper describes a Stochastic Colored Petri Net semantics of a domain-specific language that allows modeling time-based hardware and software systems. We have developed a custom Eclipse-based framework that allows for both graphical and textual modeling, providing editors with useful features such as real-time validation of model constraints, which is not feasible at the low-level Petri Net abstraction due to the lack of contextual information. The DSL also offers the advantage of easy conversion from other modeling languages thanks to an intermediate language. From the model, valid Stochastic Colored Petri Nets (SCPNs) can be generated, which can automatically simulate certain system properties consistently. This approach aims to enhance modeling capabilities and alleviate some of the limitations associated with traditional Petri Nets.



https://doi.org/10.1109/SysCon61195.2024.10553543
Baumstark, Alexander; Paradies, Marcus; Sattler, Kai-Uwe; Kläbe, Steffen; Baumann, Stephan
So far and yet so near - accelerating distributed joins with CXL. - In: 20th International Workshop on Data Management on New Hardware (DaMoN 2024), (2024), 7, 9 pp. in total

Distributed partitioned joins are one of the most expensive operators in distributed DBMSs where a major part of the execution is attributed to network transfer costs. Although high-speed network technologies, such as RDMA, can lower this cost, they still come with significantly higher latency than local DRAM access. The emerging CXL interconnect protocol promises to provide direct and cache-coherent access to remote memory while offering byte-addressable memory access without CPU intervention. For short-distance communication in distributed DBMSs, CXL represents an interesting alternative for low-latency requirements. In this work, we explore how CXL can be leveraged for engine-internal communication and data exchange. We discuss and apply communication strategies to distributed joins. We emulate various CXL characteristics based on optimistic and pessimistic assumptions on the real performance of upcoming CXL devices and evaluate their impact on the execution of distributed joins. Our results show that CXL has the potential to improve distributed join performance.



https://doi.org/10.1145/3662010.3663449
Steinmetz, Nadine;
Entity linking for KGQA using AMR graphs. - In: The semantic web, (2023), pp. 122-138

Entity linking is an essential part of analytical systems for question answering on knowledge graphs (KGQA). The mentioned entity has to be spotted in the text and linked to the correct resource in the knowledge graph (KG). With this paper, we present our approach to entity linking using the abstract meaning representation (AMR) of the question to spot the surface forms of entities. We re-trained AMR models with automatically generated training data. Based on these models, we extract surface forms and map them to an entity dictionary of the desired KG. For the disambiguation process, we evaluated different options and configurations on QALD-9 and LC-QuaD 2.0. The results of the best-performing configurations outperform existing entity linking approaches.



https://doi.org/10.1007/978-3-031-33455-9_8
Jibril, Muhammad Attahir; Al-Sayeh, Hani; Baumstark, Alexander; Sattler, Kai-Uwe
Fast and efficient update handling for graph H2TAP. - In: Proceedings 26th International Conference on Extending Database Technology (EDBT 2023), (2023), pp. 723-736

http://dx.doi.org/10.48786/edbt.2023.60
Kläbe, Steffen; Sattler, Kai-Uwe
Patched multi-key partitioning for robust query performance. - In: Proceedings 26th International Conference on Extending Database Technology (EDBT 2023), (2023), pp. 324-336

http://dx.doi.org/10.48786/edbt.2023.26
Kläbe, Steffen; Hagedorn, Stefan; Sattler, Kai-Uwe
Exploration of approaches for in-database ML. - In: Proceedings 26th International Conference on Extending Database Technology (EDBT 2023), (2023), pp. 311-322

http://dx.doi.org/10.48786/edbt.2023.25
Baumstark, Alexander; Jibril, Muhammad Attahir; Sattler, Kai-Uwe
Temporal graph processing in modern memory hierarchies. - In: Advances in databases and information systems, (2023), pp. 103-116

Updates in graph DBMS lead to structural changes in the graph over time with different intermediate states. These intermediate states in a DBMS and the time when the actions to the actual data take place can be processed using temporal DBMSs. Most DBMSs build their temporal features based on their non-temporal processing and storage without considering the memory hierarchy of the underlying system. This leads to slower temporal processing and poor storage utilization. In this paper, we propose a storage and processing strategy for (bi-)temporal graphs using temporal materialized views (TMV) while exploiting the memory hierarchy of a modern system. Further, we show a solution to the query containment problem for certain types of temporal graph queries. Finally, we evaluate the overhead and performance of the presented approach. The results show that using TMV reduces the runtime of temporal graph queries while using less memory.



https://doi.org/10.1007/978-3-031-42914-9_8
Schlegel, Marius; Sattler, Kai-Uwe
Extracting provenance of machine learning experiment pipeline artifacts. - In: Advances in databases and information systems, (2023), pp. 238-251

Experiment management systems (EMSs), such as MLflow, are increasingly used to streamline the collection and management of machine learning (ML) artifacts in iterative and exploratory ML experiment workflows. However, EMSs typically suffer from limited provenance capabilities, rendering it hard to analyze the provenance of ML artifacts and gain knowledge for improving experiment pipelines. In this paper, we propose a comprehensive provenance model compliant with the W3C PROV standard, which captures the provenance of ML experiment pipelines and their artifacts related to Git and MLflow activities. Moreover, we present the tool MLflow2PROV that extracts provenance graphs according to our model from existing projects, enabling collected pipeline provenance information to be queried, analyzed, and further processed.



https://doi.org/10.1007/978-3-031-42914-9_17
Räth, Timo; Onah, Ngozichukwuka; Sattler, Kai-Uwe
Interactive data cleaning for real-time streaming applications. - In: HILDA '23, (2023), 13, 3 pp. in total

The importance of data cleaning systems has continuously grown in recent years. Especially for real-time streaming applications, it is crucial to identify and possibly remove anomalies in the data on the fly before further processing. The main challenge, however, lies in the construction of an appropriate data cleaning pipeline, which is complicated by the dynamic nature of streaming applications. To simplify this process and help data scientists to explore and understand the incoming data, we propose an interactive data cleaning system for streaming applications. In this paper, we list requirements for such a system and present our implementation to overcome the stated issues. Our demonstration shows how a data cleaning pipeline can be interactively created, executed, and monitored at runtime. We also present several different tools, such as the automated advisor and the adaptive visualizer, that engage the user in the data cleaning process and help them understand the behavior of the pipeline.



https://doi.org/10.1145/3597465.3605229