Dissertations & Thesis Topics

Are you interested in writing a doctoral dissertation or thesis with us?

1) Doctoral Dissertation

Please contact us. A message to Prof. Toralf Kirsten is sufficient and will be answered as soon as possible.

What can you expect?

We analyse healthcare data on specific clinical issues in collaboration with departments and clinics of Leipzig University Medical Center. You are welcome to contribute your own research questions and contacts. 

The questions we investigate may relate to structured data, such as data arising from clinical documentation or generated by measuring devices (e.g. laboratory data). We also work with imaging data (CT, MRI, etc.) and text documents such as diagnostic reports. All data are available electronically (no manual transcription required) and are prepared and processed using various methods. If additional data collection is required as part of an observational study, we also provide organisational and technical consulting.

We are particularly interested in questions that contribute to early diagnostics and can ideally be integrated into the AMPEL system. Further areas of focus include image processing in close collaboration with radiology and other departments, as well as the automatic generation of diagnostic reports using novel approaches (e.g. large language models such as ChatGPT).

We are happy to support you with our expertise.

What do we expect from you?

The success of your doctoral thesis depends largely on your own commitment. A genuine interest in the research question and the motivation to pursue and answer it are therefore essential. We do not expect prior knowledge of natural sciences or engineering. Familiarity with data structures, algorithms, or advanced methods in statistics and artificial intelligence is not required, though it is certainly an asset. Basic statistical knowledge is helpful, and you should be comfortable working with computers.

We will provide training in all relevant methods, including an introduction to common data science tools and the programming languages R and Python, and will support you throughout the research process.


2) Thesis Topics

The Medical Data Science group currently offers the thesis topics listed below. Please feel free to email us if you have an idea of your own or any questions.

Medical Data Analytics

Data Analytics Methods for Medical Data

Description: Medical data analytics aims to extract meaningful insights from healthcare data to support clinical decision-making and medical research. However, medical data are often heterogeneous, incomplete, and subject to strict privacy regulations, which makes their analysis challenging. Modern analytical techniques such as statistical modeling and machine learning can help discover patterns and build predictive models from such data.

Goal: The goal of this topic area is to investigate and apply data analytics methods to medical datasets, and to evaluate how different analytical techniques can support prediction, classification, or pattern discovery in healthcare data.

Topics:

  • Predicting the resources required for clinical care based on initial admission data from the emergency department
  • Anomaly detection in ICU data: When is the alarm relevant?
  • Detection of breast cancer in MRI: Overcoming the density differences

Large Language Models

Applications of Large Language Models in the Medical Domain

Description: Healthcare systems generate large amounts of textual information, including clinical notes, discharge summaries, medical reports, and scientific literature. Large language models (LLMs) can support the analysis and interpretation of such data by enabling tasks such as information extraction, summarization, question answering, or clinical decision support. However, challenges remain regarding model reliability, domain adaptation, and privacy considerations in medical applications.

Goal: The goal of this topic is to investigate the use of LLMs for specific medical tasks and to evaluate their effectiveness in healthcare settings.

Topics:
  • AI agents for information extraction on unstructured patient files
  • Matching patients to ongoing research studies: Aligning unstructured inclusion and exclusion criteria with unstructured patient files
  • Theory and practice in medical care: Medical guideline adherence and the gap between standard medical care and actual practice
  • Automatic triage support based on vital signs and symptoms in the emergency department
  • A chatbot for automated triage decisions
  • Transcribing and summarizing audio streams into meaningful medical notes (BA)

Synthetic Data Generation

Synthetic Data Generation in the Medical Domain using Generative Models

Description: Synthetic data has gained increasing attention in the medical domain due to challenges related to data accessibility and privacy protection. In cases such as rare diseases, only small datasets are available, which are often too limited for building and evaluating robust analytical models. Synthetic data generation aims to address these challenges by creating artificial datasets that preserve the statistical properties of real medical data while avoiding direct links to individual patients.

Goal: The goal of this topic is to investigate and implement methods for generating synthetic medical data, for example using deep learning generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs), or non-deep-learning approaches such as adversarial random forests (ARFs). The aim is to evaluate how well synthetic data can reflect the characteristics of real data and how they can support data analysis and machine learning tasks while preserving patient privacy.

Topics:
  • Replacing the usual care (default) arm of a clinical trial with synthetically generated data

Infrastructure

Infrastructure for Medical Data Processing and Analysis

Description: Data-driven medical research requires reliable infrastructure for data storage, processing, and analysis. Efficient data pipelines and scalable computing environments are necessary to handle large datasets and enable reproducible research workflows while ensuring compliance with data protection requirements.

Goal: The goal is to design or evaluate infrastructure solutions for medical data processing, such as data pipelines, computing environments, or platforms for machine learning workflows, and to investigate how such systems can support efficient and reproducible data analysis in the medical domain.

Topics:
  • FHIR-based patient triage system: Concept & implementation
  • Transcribing and summarizing audio streams into meaningful medical notes (BA)

Liebigstraße 18, Haus B
04103 Leipzig
Phone: +49 341 - 97 10283