GeMTeX
German Medical Text Corpus
Contact
Project partners
- Charité – University Hospital Berlin
- ID GmbH & Co. KGaA
- Technical University of Darmstadt
- Dresden University of Technology
- University Hospital Erlangen
- University Hospital Essen
- Averbis GmbH
- Heidelberg University Hospital
- German National Library of Medicine (ZB MED)
- Leipzig University
- University of Leipzig Medical Center
- Ludwig Maximilian University of Munich
- Technical University of Munich
- University of Münster
- Hasso Plattner Institute for Digital Engineering gGmbH
- Tübingen University Hospital
- Medical University of Graz (Associated Partner)
Funding
The GeMTeX project is funded by the Federal Ministry of Education and Research within the national "Medical Informatics Initiative" with approx. 6.8 million euros, of which approx. 200,000 euros have been made available to the MHH (funding reference: 01ZZ2314J).
Summary
In everyday clinical practice, numerous texts are produced, such as doctors' letters and reports, which contain valuable information about the development, course, and treatment of a disease. These texts could be used by natural language processing (NLP) tools to assist doctors and researchers in their work. However, the full potential of clinical documents cannot be realised due to a lack of standardisation. The GeMTeX (German Medical Text Corpus) methodology platform aims to fill this gap and make medical texts from patient care available for research projects. The goal is to create the largest medical text corpus in the German language.
Hannover Medical School is focussing on the processing of molecular-pathological findings, which contain many specialised technical terms, bioinformatic relationships and special terminologies.
Duration
2023-2026