Bachelor's and Master's Theses at the DBIS Group
We continuously offer topics for bachelor's and master's theses as well as for Studienarbeit and Diplom theses. The assignments are usually embedded in the research work of the group, but they also stem from collaborations with companies such as IBM and SAP as well as Thuringian IT companies.
We expect:
- very good prior knowledge in the database area (for example, through successful participation in the group's advanced courses),
- good English skills (the thesis itself can of course be written in German, but the technical literature is entirely in English),
- the willingness to show above-average commitment while working on the thesis,
- the ability to work in a team.
We offer:
- interesting and challenging topics that frequently lead to scientific publications at international venues,
- intensive and competent supervision: in addition to individual consultation, we offer a weekly research seminar (Oberseminar) that gives all students the opportunity to present and discuss their problems and results.
Procedure
Are you interested in a topic?
- Contact the prospective supervisor for further information.
- Write a short exposé consisting of a brief problem description, a solution sketch, possible evaluation scenarios and, very importantly, a work plan.
- Based on this exposé, we decide on the assignment and registration of the topic.
- Work on the thesis begins with a short introductory talk in our research seminar.
Current Topics
The following topics are currently available and can be worked on in either German or English.
| Topic | Supervisor |
Indexing Temporal Data The time dimension plays an important role for discovering knowledge in large data sets. For example, YAGO2 allows exploring linked open data (LOD) sources that contain information tagged with timestamps. To support search operations on these data sets, efficient indexing schemes for different representations of time data, like dates or intervals, have to be implemented. In the scope of a bachelor or master thesis, different temporal index structures shall be analyzed and compared according to their applicability in large LOD datasets. Requirements: basic knowledge of database management systems / indexing techniques, C++/Java skills | Felix Beier |
Implementation and Evaluation of a Compression Algorithm in an IBM MMDBS The IBM DB2 Analytics Accelerator is a newly developed, cluster-based, high-performance main memory database system designed to improve the performance of analytical warehouse workloads. Within this context, data compression algorithms are crucial techniques for handling the huge data volumes that are typical for data warehousing workloads. The task for a research project, internship, or bachelor/master thesis is to implement and evaluate a newly designed compression algorithm that allows one-pass data compression as well as query evaluation directly on compressed attributes. Requirements: basic knowledge of database management / data warehousing techniques, C++ skills | Felix Beier |
Analysis and Development of a Decision Model in an IBM MMDBS The IBM DB2 Analytics Accelerator is an optional extension to existing mainframe databases that improves the query performance of scan-intensive workloads with the help of a workload-optimized system. The primary database (DB2 for z/OS) requests scan services from this accelerator, which then returns the filtered results for a specific query. The underlying technology and cluster architecture could support additional use cases such as model creation for analytics, but these use cases are not yet available through the DB2 for z/OS interface. The thesis is about developing a DB2 stored procedure that communicates with the accelerator extension, which then uses existing analytical functionality to build a decision tree model as a result of data analysis. This model then has to be returned to the DB2 environment. From a customer perspective, the invocation and the results look like DB2 operations and tables, but underneath, the existing accelerator technology is used to execute the model creation on optimized hardware. The thesis therefore requires implementing a prototype stored procedure and extending the existing accelerator code. It also requires evaluating the potential cost reductions of offloading the model creation to the accelerator (incl. costs for data transfer between DB2 and the accelerator) vs. native model creation within DB2 to determine whether the offload approach is feasible. Requirements: C++ and object-oriented programming, database skills (possibly also mainframe databases), Linux as operating system and development environment | Felix Beier |
Development and Evaluation of a Storage Manager for a Rendering Engine To display virtual 3-D scenes on computer screens, the models have to be transformed into 2-D images. Since 3-D scenes that are typically used, e.g., in computer games or CAD engineering, contain millions or billions of triangles, this process, called rendering, is a time-consuming task. This poses challenges to the underlying data structures and algorithms for producing images in real time. For efficient data access and object filtering, index structures are mandatory. They can speed up complex rendering tasks, such as calculating the path of light rays through the scene, by translating them into simple spatial point and intersection queries. Within this context, the task of a research project is to isolate the data access layer of an open source rendering system so that other data storage and indexing techniques can be plugged in. Requirements: basic knowledge of database management techniques, C++ skills, basic knowledge from computer graphics lectures | Felix Beier |
Efficient In-Memory Processing of SPARQL Queries on Large Datasets The task of this project is to design and implement operators for executing SPARQL queries on compressed in-memory data structures. The goal is to build a query execution engine capable of handling very large linked data sets such as DBpedia. The work will focus on operators like scan, filter, and join as well as their efficient implementation exploiting characteristics of modern CPUs. A detailed experimental evaluation using realistic datasets is required. Requirements: knowledge of data structures and query processing, C++ programming skills | Prof. Sattler |
Update-friendly Compression Techniques for Linked Data In this project you will investigate techniques for compressing huge volumes of linked data represented as RDF triples. The goal of this work is to select and evaluate techniques such as dictionary and run-length encoding for achieving high compression rates for in-memory data structures while allowing efficient updates as well as processing basic query operators without decompressing the data first. Requirements: knowledge of data structures and query processing, C++ programming skills | Prof. Sattler |
Spatial Indexing of Linked Data The task of this project is to select, evaluate, and implement data structures and techniques (e.g., R-trees) for indexing spatial components of RDF data. The goal is to support the analysis and processing of linked data sets such as DBpedia in terms of spatial properties and relationships. Requirements: knowledge of data structures and query processing, C++ programming skills | Prof. Sattler |
Query Rewriting for SPARQL Queries Given is a cloud infrastructure whose nodes are connected according to the Chord approach. Every node is responsible for a chunk of data. If a complex SPARQL query is sent to the system, the receiving node becomes the coordinator and has to rewrite the query into basic SPARQL queries (SELECT, one WHERE clause, and optional FILTER statements) that can be answered by single nodes. The coordinator collects the partial results and combines them into one result. The main goal of the thesis is to develop an efficient method for rewriting SPARQL queries. The rewriting should make use of index structures and the Chord ring. TOM (http://tom.loria.fr) should be used as the rewriting tool. At the end of the thesis, a Java program should perform the rewriting automatically. The developed methods also have to be evaluated. Requirements: skills in Java, distributed computing, and software development | |
| Assembling and Maintaining a Distributed Dictionary inside a Cloud Infrastructure Given is a cloud infrastructure whose nodes are connected according to the Chord approach. Every node is responsible for a chunk of data. If a complex SPARQL query is sent to the system, the receiving node becomes the coordinator and has to rewrite the query into basic SPARQL queries (SELECT, one WHERE clause, and optional FILTER statements) that can be answered by single nodes. For processing simple queries, a (distributed) dictionary is needed, because every row of the requested table contains only keys. The dictionary holds the mapping back to strings or the actual data. At the end, the coordinator collects the partial results and combines them into one result. The main goal of the thesis is to develop a method for assembling and maintaining a distributed dictionary. In addition, the dictionary should take index structures, compression techniques, and the Chord ring into account. It should also handle data types and SPARQL queries with regular expressions, which allows for extended SPARQL queries. At the end of the thesis, a Java program should perform the bootstrapping and maintenance automatically. The developed methods also have to be evaluated. Requirements: skills in Java, distributed computing, and software development | Francis Gropengießer |
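The dictionary idea in the last topic (table rows hold only keys, and the dictionary maps the keys back to strings) can be illustrated with a minimal single-node sketch. It is written in Python for brevity, although the topic itself calls for Java. The class name `TermDictionary`, the helper functions, and the `ex:` example terms are hypothetical, and the distributed aspects (Chord ring, maintenance, compression) are deliberately left out.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


class TermDictionary:
    """Hypothetical single-node dictionary: term string <-> integer key."""

    def __init__(self) -> None:
        self._key_of: Dict[str, int] = {}   # term -> key
        self._term_of: List[str] = []       # key -> term (key is the list index)

    def encode(self, term: str) -> int:
        """Return the key for a term, assigning the next free key if it is new."""
        key = self._key_of.get(term)
        if key is None:
            key = len(self._term_of)
            self._key_of[term] = key
            self._term_of.append(term)
        return key

    def decode(self, key: int) -> str:
        """Map a key back to its original term."""
        return self._term_of[key]


def encode_triples(d: TermDictionary, triples: List[Triple]) -> List[EncodedTriple]:
    """Replace every subject, predicate, and object by its dictionary key."""
    return [(d.encode(s), d.encode(p), d.encode(o)) for s, p, o in triples]


def decode_triples(d: TermDictionary, rows: List[EncodedTriple]) -> List[Triple]:
    """Translate key-only rows back into readable triples."""
    return [(d.decode(s), d.decode(p), d.decode(o)) for s, p, o in rows]


if __name__ == "__main__":
    d = TermDictionary()
    rows = encode_triples(d, [
        ("ex:Ilmenau", "ex:locatedIn", "ex:Thuringia"),
        ("ex:Erfurt", "ex:locatedIn", "ex:Thuringia"),
    ])
    print(rows)                     # key-only rows as stored on a node
    print(decode_triples(d, rows))  # mapped back via the dictionary
```

In a distributed setting, the two mappings would presumably have to be partitioned across nodes and kept consistent under updates, which is the actual subject of the topic above.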
Ongoing and Completed Theses
On this page you will find a list of the ongoing and completed theses of recent years.
LaTeX Templates
Here you can obtain templates for bachelor's and master's theses that are intended to make it easier to get started with document preparation in LaTeX. These templates are not mandatory; rather, they are meant as a guideline for students who are inexperienced with LaTeX.


