Final Thesis: ETL Data Pipelines Configurations in Spark

Abstract: The JValue Open Data Service (ODS) is an ETL data pipeline that extracts data from different source systems (Extract), transforms the extracted data (Transform), and loads the data into a target database (Load). Various stream processing engines exist to cope with data of high volume, variety, and velocity. Existing ETLs cannot be applied across different streaming services, and the use of various frameworks and programming languages adds complexity. Among the available streaming services, Apache Spark offers accelerated, reusable, and scalable ETLs. This thesis suggests an approach to compile and configure a data pipeline so that it can run on Apache Spark.

Keywords: ETL pipeline, stream processing

PDF: Bachelor Thesis

Reference: Gizem Batmaci. ETL Data Pipelines Configurations in Spark. Bachelor Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2022.

Final Thesis: Hierarchical Open Data Source Import for the JValue ODS

Abstract: Open Data has become more popular in the last few years due to its value to society. Governments, institutions, companies, or individuals can use Open Data to contribute to economic growth or to extract new knowledge from publicly available data. The Open Data Service (ODS) is software developed by the Professorship of Open Source that aims to simplify the consumption of Open Data and make it more reliable.

The goal of this thesis is to extend the ODS to support hierarchically structured data sources, in particular File Transfer Protocol (FTP) based data sources. Because FTP is simple and reliable, it is an appropriate solution for providing Open Data. This thesis aims to enable the user to explore and configure FTP data sources by developing a new microservice with a proof-of-concept user interface. As a result, consuming Open Data from FTP data sources is simplified and becomes more flexible.

Keywords: Open data, FTP, JValue ODS, microservices

PDF: Master Thesis

Reference: Benjamin Fischer. Hierarchical Open Data Source Import for the JValue ODS. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.

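Final Thesis: Fehlertoleranzanalyse von Microservice basierten Softwarearchitekturen – Konzept und Anwendung am JValue ODS (Fault Tolerance Analysis of Microservice-Based Software Architectures: Concept and Application to the JValue ODS)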

Abstract: Microservice-based software architectures play an essential role in building large, scalable cloud systems. The main advantage of microservices over traditional software monoliths is that the individual microservices can be developed, deployed, and scaled independently, which allows faster innovation. Because microservice-based architectures are distributed systems, complexity is shifted from the code to the network and communication layer. As a result, additional failures such as service outages or loss of network connectivity arise, which must be tolerated to keep the system healthy and running. Within this thesis, a reusable concept is developed to analyse the fault tolerance of microservice-based software architectures. It reveals weaknesses in the architecture that negatively affect the system's reliability and resilience. For frequent problems, solution proposals are provided. The concept's applicability and effectiveness are evaluated by applying it to the JValue Open Data Service (ODS). The analysis revealed several issues regarding the ODS's fault tolerance, which could be fixed with the provided solutions.

Keywords: Microservices, fault tolerance, dependency graph, transactional outbox pattern

PDF: Master Thesis

Reference: Jonas Schüll. Fehlertoleranzanalyse von Microservice basierten Softwarearchitekturen – Konzept und Anwendung am JValue ODS. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.