Final Thesis: Giving Structure to Open Data in the JValue ODS

Abstract: The internet now provides a large amount of open data for public use. This data comes in various formats and covers many subjects. The main problem is the absence of a standard: every provider decides independently how its data is structured.

The JValue project addresses this problem and aims to be the central point where open data is gathered and optimized. Currently, the JValue Open Data Service (ODS) supports the extraction, transformation, and retrieval of open data over numerous protocols and data formats.

However, so far there is only a very generic interface for retrieving this data, since the system currently ignores any data structure. In addition, providers are not bound to any restrictions, so they can alter their data structure at any time and publish the changed data. This can severely restrict or even break the data gathering process.

To counteract this, a process shall be introduced that allows the ODS to give structure to open data. Furthermore, a schema recommendation for the data should be generated, which then serves as the foundation of the remaining data gathering process.

As a consequence of the introduced data schema, it becomes possible to derive fitting database tables from the schema. These tables should be created and filled dynamically and provide the user with a complete and easily accessible interface. With persistent structured data, the earlier mentioned problem of frequently changing data structures can be solved: the schema can be used to validate the imported and transformed data. By also adding a corresponding visual state to the data configurations, the user will be able to react to changed data structures.
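To make the three ideas in this abstract concrete (schema recommendation, table derivation, and validation), the following is a minimal Python sketch. It assumes flat JSON-like records and a hypothetical mapping from Python types to SQL column types; the function names and the mapping are illustrative and not taken from the thesis or the ODS code base.

```python
from typing import Any

# Hypothetical mapping from Python value types to SQL column types.
SQL_TYPES = {int: "BIGINT", float: "DOUBLE PRECISION", str: "TEXT", bool: "BOOLEAN"}


def recommend_schema(records: list) -> dict:
    """Recommend a flat schema: each field maps to the type of its first value."""
    schema: dict = {}
    for record in records:
        for field, value in record.items():
            schema.setdefault(field, type(value))
    return schema


def create_table_sql(table: str, schema: dict) -> str:
    """Derive a CREATE TABLE statement from the schema (unknown types become TEXT)."""
    columns = ", ".join(
        f'"{name}" {SQL_TYPES.get(value_type, "TEXT")}'
        for name, value_type in schema.items()
    )
    return f'CREATE TABLE "{table}" ({columns});'


def validate(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif type(record[field]) is not expected:
            problems.append(f"wrong type for field: {field}")
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems
```

A non-empty result from `validate` is the kind of signal that could drive the visual state mentioned above, for example after a provider renames a field.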

Keywords: data engineering, schema recommendation, open data

PDF: Master Thesis

Reference: Alexander Mahler. Giving Structure to Open Data in the JValue ODS. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.

Final Thesis: Implementing an Open Data ETL Processing Engine with Kafka

Abstract: The JValue project group is developing a modeling ecosystem for Extract, Transform, Load (ETL) processes. Part of this ecosystem is a description model for such processes. This thesis proposes a conversion process from the description model into an Apache Kafka runtime, described in a cloud-native format such as Docker Compose. The conversion is implemented as a library and performed in a multi-phase approach, as known from classical compilers. In the first step, the description language is converted into a runtime-independent intermediate description, and afterward into a description of a concrete runtime, in this case Kafka. The multi-phase approach minimizes the implementation work for additional runtimes and allows runtime-independent optimization and analysis. The goal for the generated runtime is to use existing Kafka components, which is only partially possible due to the complexity of the description model.
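The multi-phase idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shape of the input model, the `Step` class, and the output layout are invented here and do not reflect the actual JValue description model, the thesis library, or any real Kafka or Docker Compose schema.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One node of the runtime-independent intermediate description."""
    kind: str      # "extract", "transform" or "load"
    name: str
    options: dict


def to_intermediate(model: dict) -> list:
    """Phase 1: lower the (hypothetical) description model to generic steps."""
    source, sink = model["source"], model["sink"]
    steps = [Step("extract", source["name"], {"url": source["url"]})]
    for transform in model.get("transforms", []):
        steps.append(Step("transform", transform["name"],
                          {"expression": transform["expression"]}))
    steps.append(Step("load", sink["name"], {"table": sink["table"]}))
    return steps


# Hypothetical mapping of generic step kinds to roles in a Kafka-based runtime.
KAFKA_ROLES = {
    "extract": "source-connector",
    "transform": "stream-processor",
    "load": "sink-connector",
}


def to_kafka_description(steps: list) -> dict:
    """Phase 2: map generic steps to a Kafka-flavoured description.

    Only this phase knows about Kafka; another runtime would get its own
    phase 2 and reuse phase 1 unchanged.
    """
    # One topic connects each pair of neighbouring steps.
    topics = [f"{a.name}-to-{b.name}" for a, b in zip(steps, steps[1:])]
    components = []
    for index, step in enumerate(steps):
        components.append({
            "name": step.name,
            "role": KAFKA_ROLES[step.kind],
            "reads": topics[index - 1] if index > 0 else None,
            "writes": topics[index] if index < len(topics) else None,
            "options": step.options,
        })
    return {"topics": topics, "components": components}
```

Because the intermediate steps carry no Kafka-specific detail, analysis or optimization passes can operate on the list returned by `to_intermediate` before any runtime is chosen.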

Keywords: open data, compiler, Apache Kafka

PDF: Master Thesis

Reference: Fabian Arnold. Implementing an Open Data ETL Processing Engine with Kafka. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2022.

Final Thesis: Elasticity Concept for a Microservice-based System

Abstract: Software Elasticity is the concept of adapting available resources to the current or expected workload. This concept fits modern and stateless microservice architectures, which are scalable by design. Their scalability is closely related to Software Resilience and places new demands on cloud architectures. The JValue Open Data Service (JValue ODS) is an open data platform with a focus on Extract, Transform, Load (ETL) pipelines and aims to make the usage of open data easy, reliable and safe. For the success of the ODS, an Elastic and therefore Resilient hosting is mandatory. This thesis deploys the ODS to an on-premise Kubernetes cluster to improve the uptime guarantee, discusses different deployment strategies, elaborates horizontal microservice scaling techniques and operates the necessary infrastructure. This thesis uses Peffers’ Design Research Process to build a concept for Elasticity in microservice-based architectures. The concept is demonstrated and evaluated in the context of the JValue ODS.
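As a small illustration of horizontal scaling, the sketch below implements the replica calculation that the Kubernetes documentation gives for its Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The abstract does not say which scaling mechanism the thesis uses, so this is an assumption; the function name, the replica bounds, and the default values are illustrative.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Compute a replica count from the ratio of observed to target load."""
    ratio = current_metric / target_metric
    # Skip scaling when the load is close enough to the target, to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, three replicas at 90% average CPU with a 60% target yield five replicas, while four replicas at 30% with the same target shrink to two.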

Keywords: Microservices, elasticity, scalability, Kubernetes, DevOps

PDF: Master Thesis

Reference: Aron Metzig. Elasticity Concept for a Microservice-based System. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2022.

Final Thesis: Testing Microservice Integration with Consumer-Driven Contract Tests

Abstract: Microservice systems consist of independent, distributed services that communicate with each other over network connections. Testing service integrations in such systems can be challenging because several services must run at the same time and there are many potential sources of false-negative test results.

Consumer-Driven Contract Testing (CDCT) is an approach that can be used to test both sides of a service integration independently of each other. This is achieved by decoupling the two sides of the integration by means of a contract, which acts as an intermediary. The contract is specified by the service that consumes another service's interface and expresses the consumer's expectations of that interface.

This thesis investigates to what extent CDCT can contribute to the testing of microservice integrations by collecting the advantages, disadvantages, challenges, and guidelines associated with CDCT for microservice systems. For the initial theory building, a structured literature review was conducted first. Subsequently, action research was carried out, in which consumer-driven contract tests were developed for an open-source microservice system. Finally, after the concluding evaluation of the action research, the findings from the structured literature review were compared with the experiences from the action research.
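The decoupling through a contract can be sketched in plain Python as follows. This is a simplified illustration and not the API of Pact or of any other CDCT tool; the contract layout, the endpoint `/datasources/1`, and all function names are hypothetical.

```python
# Contract written by the consumer: the request it sends and the
# response fields (with types) it relies on.
CONTRACT = {
    "request": {"method": "GET", "path": "/datasources/1"},
    "response": {"status": 200, "body_fields": {"id": int, "name": str}},
}


def provider_handler(method: str, path: str):
    """Stand-in for the real provider service (hypothetical)."""
    if method == "GET" and path == "/datasources/1":
        # Extra fields such as "owner" are fine; the consumer ignores them.
        return 200, {"id": 1, "name": "water levels", "owner": "demo"}
    return 404, {}


def verify_provider(contract: dict, handler) -> list:
    """Provider side: replay the contract's request, check the response."""
    request, expected = contract["request"], contract["response"]
    status, body = handler(request["method"], request["path"])
    errors = []
    if status != expected["status"]:
        errors.append(f"expected status {expected['status']}, got {status}")
    for field, field_type in expected["body_fields"].items():
        if not isinstance(body.get(field), field_type):
            errors.append(f"field {field!r} missing or not {field_type.__name__}")
    return errors


def consumer_parse(body: dict) -> str:
    """Consumer code under test: reads only the fields named in the contract."""
    return f"{body['id']}: {body['name']}"


def verify_consumer(contract: dict) -> str:
    """Consumer side: run the consumer against a stub built from the contract."""
    fields = contract["response"]["body_fields"]
    stub_body = {field: field_type() for field, field_type in fields.items()}
    return consumer_parse(stub_body)
```

Neither check needs the other service to be running: the provider is verified against the recorded expectations, and the consumer is exercised against a stub derived from the same contract.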

Keywords: Microservices, integration testing, consumer-driven contract tests, pact

PDF: Master Thesis

Reference: Felix Quast. Testing Microservice Integration with Consumer-Driven Contract Tests. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2022.

Final Thesis: Konzept und Implementierung zur Observability für microservicebasierte Anwendungen (Concept and Implementation of Observability for Microservice-Based Applications)

Abstract: Microservices are a well-known and popular architectural pattern today. Many world-renowned tech companies have adopted it. Decoupling tasks and splitting them into smaller services brings challenges along with the resulting benefits. A central drawback in developing these services is that troubleshooting becomes harder and that it is difficult to keep an overview of the application as a whole.

This thesis presents software tools for monitoring and for aggregating log information. In addition, a combination of programs is selected in order to develop a concept and to present an exemplary integration of these tools into an already running open-source project.

Keywords: Microservices, observability, monitoring

PDF: Bachelor Thesis

Reference: Daniel Fabrikantow. Konzept und Implementierung zur Observability für microservicebasierte Anwendungen. Bachelor Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.

Job / Final Thesis: Model Compilation to Streaming Backends

We are looking for someone skilled in compiler construction who would like to take on an important special topic, namely making open data usable. A task description for a final thesis follows, but we offer this in any form: student job, final thesis, or doctorate / research associate position:

Model Compilation to Streaming Backends

The goal of the thesis is to develop a compiler that turns an ETL pipeline model (the “program”) into a configuration for an event streaming framework (the “target architecture”). To start simple, we want to compile SQL to Kafka Streams in such a way that an SQL schema definition configures a Kafka Streams instance so that the Kafka instance can load a CSV file and save it into a PostgreSQL database. If this works well, we will increase complexity: not just a source schema (the “E” in ETL) but also transformation rules (the “T”) and a target schema (the “L”); not just Kafka Streams as a target architecture, but also Spark, Flink, and others.

This thesis is part of the JValue project. The mission of the JValue project is to make open data easy, reliable, and safe to use.

Thesis Description – Model Compilation to Streaming Backend

If you are interested, feel free to contact me directly.

Dirk Riehle

Final Thesis: Value Types in TypeScript for JValue

Abstract: Over the past years, TypeScript has increasingly been gaining popularity due to its nature of providing functionalities to ease the development of scalable and robust applications whilst syntactically being a superset of JavaScript. With the growing complexity of data-driven environments, it is essential for programming languages to cope with value types beyond their primitive data types to capture the semantics of intangible data, such as systems of measurement, thus increasing readability and solidity across the codebase. By creating a test-driven framework in TypeScript, this thesis lays out different methods to efficiently implement value types, discusses their benefits as well as drawbacks, and ensures the reliability of the framework by integrating it into an existing data-driven service.
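The idea of a value type for a system of measurement can be illustrated with a short sketch. Note the assumptions: the thesis works in TypeScript, while this example uses Python for illustration only, and the `Length` class and its methods are hypothetical and not part of the thesis framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Length:
    """Immutable value type for a length, stored in metres."""
    metres: float

    def __post_init__(self):
        # A value type guards its own invariants at construction time.
        if self.metres < 0:
            raise ValueError("length must not be negative")

    @classmethod
    def from_kilometres(cls, kilometres: float) -> "Length":
        return cls(kilometres * 1000)

    def __add__(self, other: "Length") -> "Length":
        # Adding a bare number is rejected, so units cannot be mixed up silently.
        if not isinstance(other, Length):
            return NotImplemented
        return Length(self.metres + other.metres)
```

Two `Length` values compare equal when their content is equal, regardless of how they were constructed, which is the defining property of a value type as opposed to an entity with identity.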

Keywords: Value types, JValue, TypeScript

PDF: Bachelor Thesis

Reference: Mert Baran. Value Types in TypeScript for JValue. Bachelor Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.

Final Thesis: Hierarchical Open Data Source Import for the JValue ODS

Abstract: Open Data has become more popular in the last few years due to its value to society. Governments, institutions, companies or individuals can make use of Open Data and add to economic growth or extract new knowledge from publicly available data. The Open Data Service (ODS) is software developed by the Professorship of Open Source that aims to simplify the consumption of Open Data and make it more reliable.

The goal of this thesis is to extend the functionality of the ODS with support for hierarchically structured data sources, in particular File Transfer Protocol (FTP) based data sources. Due to its simplicity and reliability, FTP is an appropriate solution for providing Open Data. This thesis aims to enable the user to explore and configure FTP data sources by developing a new microservice with a proof-of-concept user interface. As a result, consuming Open Data from FTP data sources is simplified and becomes more flexible.
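Exploring a hierarchical FTP source can be sketched with Python's standard `ftplib`. This is an illustration under assumptions and not the thesis microservice: it relies on the server supporting the MLSD command (which `FTP.mlsd` uses), and the `explore` function and its result layout are hypothetical.

```python
from ftplib import FTP  # explore() expects an object with FTP's mlsd() method


def explore(ftp: FTP, path: str = "/") -> dict:
    """Return a nested dict: directories map to dicts, files map to None."""
    tree = {}
    for name, facts in ftp.mlsd(path):
        kind = facts.get("type")
        if kind == "dir":
            # Recurse into subdirectories to capture the full hierarchy.
            tree[name] = explore(ftp, f"{path.rstrip('/')}/{name}")
        elif kind == "file":
            tree[name] = None
        # Entries of type "cdir" and "pdir" (current and parent) are skipped.
    return tree
```

A caller would connect and log in first, for example with `FTP(host)` followed by `login()`, and could then present the resulting tree to the user for selecting which files to import.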

Keywords: Open data, FTP, JValue ODS, microservices

PDF: Master Thesis

Reference: Benjamin Fischer. Hierarchical Open Data Source Import for the JValue ODS. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.

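Final Thesis: Fehlertoleranzanalyse von Microservice basierten Softwarearchitekturen – Konzept und Anwendung am JValue ODS (Fault Tolerance Analysis of Microservice-Based Software Architectures: Concept and Application to the JValue ODS)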

Abstract: Microservice-based software architectures play an essential role in building large, scalable cloud systems. The main advantage of microservices compared to traditional software monoliths is the independent development, deployment, and scaling of the individual microservices, which allows innovation at a higher speed. Because microservice-based architectures are distributed systems, complexity is shifted from code to the network and communication layer. Therefore, additional failures such as service outages or loss of network connectivity arise, which must be tolerated to keep the system healthy and running. Within this thesis, a reusable concept is developed to analyse the fault tolerance of microservice-based software architectures. This allows for revealing weaknesses in the architecture that negatively affect the system’s reliability and resilience. For frequent problems, solution proposals are provided. The concept’s applicability and effectiveness are evaluated by applying it to the JValue Open Data Service (ODS). The analysis revealed several issues regarding the ODS’s fault tolerance, which could be fixed with the provided solutions.

Keywords: Microservices, fault tolerance, dependency graph, transactional outbox pattern

PDF: Master Thesis

Reference: Jonas Schüll. Fehlertoleranzanalyse von Microservice basierten Softwarearchitekturen – Konzept und Anwendung am JValue ODS. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2021.

Final Thesis: Design and Implementation of Parameterizable Data Import for the JValue ODS

Abstract: Governments have recognized that the publication of open data is of great economic and social value. Collecting and using this data is challenging because it is not always available in an easy-to-process format. Minimizing these challenges is the task of the JValue Open Data Service (ODS), a system that makes data consumption easy. Yet the location of a resource and the time of a data import are statically defined.

This thesis presents a concept for how the ODS can be extended with parameterizable data sources and how the data import can be triggered manually. This addresses the challenge of rapidly changing data on the Internet and adapts the ODS to deal with the emerging problems. With parameterizable data sources, it becomes possible to describe the location of resources dynamically. The possibility of manual data imports ensures that data is only retrieved when it is really needed. The design decisions and the implementation of these functionalities for the ODS are covered in this thesis.
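A parameterizable location can be sketched as a URL template whose placeholders are filled from default parameters, optionally overridden when an import is triggered manually. The placeholder syntax, the function name, and the example URL below are assumptions for illustration; the abstract does not specify the syntax the ODS uses.

```python
from string import Template
from typing import Optional
from urllib.parse import quote


def resolve_location(template: str, defaults: dict,
                     overrides: Optional[dict] = None) -> str:
    """Fill ${name} placeholders; overrides win over defaults."""
    parameters = {**defaults, **(overrides or {})}
    # Percent-encode values so they cannot change the URL structure.
    encoded = {key: quote(str(value), safe="") for key, value in parameters.items()}
    # substitute() raises KeyError if a placeholder has no value.
    return Template(template).substitute(encoded)
```

A scheduled import would call `resolve_location` with the defaults only, while a manual trigger could pass overrides such as a different station or date for a single run.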

Keywords: Open data, ETL, JValue ODS, RESTful APIs

PDF: Bachelor Thesis

Reference: Jens Wächtler. Design and Implementation of Parameterizable Data Import for the JValue ODS. Bachelor Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2020.