Chair for Technologies and Management of Digital Transformation

Univ.-Prof. Dr.-Ing. Tobias Meisen

Industrial Natural Language Processing and Information Extraction

Today, companies generate huge amounts of heterogeneous structured, semi-structured and unstructured data. Structured and semi-structured data can be analyzed and processed well because of their given structure, whereas unstructured data continues to pose new challenges for companies. Yet unstructured data makes up about 80% of all company-wide data. Among other things, this data includes

  • Text documents
  • Images
  • Videos
  • Emails
  • Speech data (audio recordings, chats, etc.)
  • PDFs

With our research focus "Industrial Natural Language Processing & Information Extraction", we aim to make this data findable and usable and, subsequently, to analyze it. In addition to applying a variety of machine learning and artificial intelligence algorithms and methods to extract and analyze existing data, we focus on developing and enhancing current methods in the following two areas.

In the context of information extraction, our research focuses especially on developing methods for the targeted analysis of text and image documents. This enables us to recognize and extract tables, diagrams and other elements from PDF documents.

In the context of Industrial Language Processing, we focus on the development and design of effective microservice-based NLP architectures for the efficient analysis of large amounts of data, and on the anonymization and pseudonymization of personal information within text documents.
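
To illustrate the difference between anonymization and pseudonymization mentioned above, here is a minimal sketch in Python. It is not the method used in our research; it assumes that the only personal information in the text is email addresses and that a simple regular expression is good enough to find them. The names `EMAIL_PATTERN`, `anonymize` and `pseudonymize` are hypothetical and chosen for this example only. Anonymization replaces every address with the same tag, so the original cannot be recovered. Pseudonymization gives each distinct address a stable placeholder and keeps a separate mapping, so the text stays analyzable and can be re-identified by whoever holds the mapping.

```python
import re

# Assumption: a deliberately simple pattern for email addresses.
# Real personal data (names, addresses, IDs) needs far more than a regex.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize(text: str) -> str:
    """Replace every email address with the same fixed tag (not reversible)."""
    return EMAIL_PATTERN.sub("<EMAIL>", text)


def pseudonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email address with a stable numbered placeholder.

    Returns the rewritten text and a mapping placeholder -> original address.
    The mapping has to be stored separately from the text and protected.
    """
    mapping: dict[str, str] = {}   # placeholder -> original address
    assigned: dict[str, str] = {}  # original address -> placeholder

    def replace(match: re.Match) -> str:
        original = match.group(0)
        if original not in assigned:
            placeholder = f"<EMAIL_{len(assigned) + 1}>"
            assigned[original] = placeholder
            mapping[placeholder] = original
        return assigned[original]

    return EMAIL_PATTERN.sub(replace, text), mapping


if __name__ == "__main__":
    sample = "Contact alice@example.com or bob@example.org; alice@example.com replies fastest."
    print(anonymize(sample))
    pseudonymized, key = pseudonymize(sample)
    print(pseudonymized)
    print(key)
```

In the pseudonymized output, repeated mentions of the same address share one placeholder, which preserves who-said-what relations that plain anonymization discards.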



André Pomp, M.Sc.

Interested in this research?

Would you like to write a thesis in this research area? Then look here for open topics or contact us at pomp{at}

Would you like to delve deeper into this field? Then join our team as a research assistant! Further information here.

Selected relevant publications

Matthias Hansen; André Pomp; Kemal Erki; Tobias Meisen
Data-Driven Recognition and Extraction of PDF Document Elements
Technologies, 7(3):65
ISSN: 2227-7080

Elma Kerz; Andreas Burgdorf; Daniel Wiechmann; Stefan Meeger; Yu Qiao; Christian Kohlschein; Tobias Meisen
Understanding Vocabulary Growth Through An Adaptive Language Learning System
Proceedings of the 8th Workshop on NLP for Computer Assisted Language Learning, pages 65-78.
Publisher: LiU Electronic Press, Turku, Finland

Robert Jungnickel; André Pomp; Andreas Kirmse; Xiang LI; Vladimir Samsonov; Tobias Meisen
Evaluation and Comparison of Cross-lingual Text Processing Pipelines
2019 IEEE Symposium Series on Computational Intelligence (SSCI), pages 417-425.