PIs
Prof. Dr. Alexander Mehler
Alexander Mehler is Professor of Computational Humanities/Text Technology at Goethe University Frankfurt, where he heads the Text Technology Lab (TTLab). He has served on the executive committee of the German Society for Computational Linguistics and Language Technology, chairing its research group on Quantitative Corpus Linguistics. He also led the research group on Computational Semiotics of the German Society of Semiotics and served on the executive committee of the LOEWE Priority Program Digital Humanities. In addition, he was a member of the executive committee of the Center for Digital Research in the Humanities, Social Sciences and Education Sciences (CEDIFOR). He is a founding member of the German Society for Network Research (DGNet) and a Programme Committee member of the DFG-funded SPP New Data Spaces for the Social Sciences. His research interests include the quantitative analysis, simulative synthesis, and formal modeling of textual units in spoken and written communication. This work encompasses the study of linguistic networks in contemporary and historical languages, informed by models of language evolution. A current focus of his research concerns 4D text technologies involving Virtual Reality (VR) and multimodal computing. Alexander Mehler’s role in the CRC 1629 NegLaB involves modeling negation phenomena using AI methods based on text, social network, and behavioral data from XR applications.
Contact
Fachbereich für Informatik und Mathematik
Goethe-Universität Frankfurt am Main
Robert-Mayer-Straße 10
Raum 403
D-60325 Frankfurt am Main
- +49 69-798-28921
- amehler(AT)em.uni-frankfurt.de
Publications
2026
Hammerla, Leon; Mehler, Alexander
Gutenberg+: A More Temporally Faithful Corpus for Diachronic NLP Proceedings Article
In: Proceedings Workshop on Structured Linguistic Data and Evaluation (SLiDE 2026), co-located with the Language Resources and Evaluation Conference (LREC 2026), Palma de Mallorca (Spain), 2026, (accepted).
@inproceedings{Hammerla:Mehler:2026:a,
title = {Gutenberg+: A More Temporally Faithful Corpus for Diachronic NLP},
author = {Leon Hammerla and Alexander Mehler},
year = {2026},
date = {2026-01-01},
booktitle = {Proceedings Workshop on Structured Linguistic Data and Evaluation
(SLiDE 2026), co-located with the Language Resources and Evaluation
Conference (LREC 2026)},
address = {Palma de Mallorca (Spain)},
note = {accepted},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Abusaleh, Ali; Hammerla, Leon; Mehler, Alexander
Learning to Detect Cross-Modal Negation: An Analysis of Latent Representations and an Attention-Based Solution Proceedings Article
In: 2026 8th International Conference on Natural Language Processing (ICNLP), Xi'an, China, 2026, (accepted).
@inproceedings{Abusaleh:et:al:2026,
title = {Learning to Detect Cross-Modal Negation: An Analysis of Latent
Representations and an Attention-Based Solution},
author = {Ali Abusaleh and Leon Hammerla and Alexander Mehler},
year = {2026},
date = {2026-01-01},
booktitle = {2026 8th International Conference on Natural Language Processing (ICNLP)},
address = {Xi'an, China},
abstract = {Detecting high-level semantic concepts like negation across modalities
remains a challenge for current multimodal systems. We analyze
this as a fundamental representation learning problem, providing
the first evidence that negation does not form a linearly or non-linearly
separable class in the latent spaces of standard vision-language
models (VLMs). We demonstrate that pretrained embeddings primarily
encode modality-specific features, lacking a generalizable negation
signal. To overcome this, we propose a novel cross-modal attention
architecture that explicitly models inter-modal dependencies,
achieving performance gains of up to +7.03% F1 over unimodal baselines.
Our analysis reveals a key asymmetry: while textual negation often
appears independently, visual negation is semantically dependent
on linguistic context, a finding validated through our statistical
analysis of 3,222 political video-text pairs automatically annotated
via Qwen2.5-VL. By combining this analysis with self-supervised
video representations (JEPA2), we advance the modeling of temporal
negation. This work provides new methods and insights for learning
robust, semantically-aligned representations in multimodal systems.},
note = {accepted},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Borkowski, Cedric; Abrami, Giuseppe; Terefe, Dawit; Baumartz, Daniel; Mehler, Alexander
DUUIgateway: A Web Service for Platform-independent, Ubiquitous Big Data NLP Journal Article
In: SoftwareX, vol. 34, pp. 102549, 2026, ISSN: 2352-7110.
@article{Borkowski:et:al:2026,
title = {DUUIgateway: A Web Service for Platform-independent, Ubiquitous Big Data NLP},
author = {Cedric Borkowski and Giuseppe Abrami and Dawit Terefe and Daniel Baumartz and Alexander Mehler},
url = {https://www.sciencedirect.com/science/article/pii/S2352711026000439},
doi = {10.1016/j.softx.2026.102549},
issn = {2352-7110},
year = {2026},
date = {2026-01-01},
journal = {SoftwareX},
volume = {34},
pages = {102549},
abstract = {Distributed processing of unstructured text data is a challenge
in the rapidly changing and evolving natural language processing
(NLP) landscape. This landscape is characterized by heterogeneous
systems, models, and formats, and especially by the increasing
influence of AI systems. While many of these systems handle text
data, there are also unified systems that process multiple input
and output formats, while allowing for distributed corpus processing.
However, there are hardly any user-friendly interfaces that allow
existing NLP frameworks to be used flexibly and extended in a
user-controlled manner. Due to this gap and the increasing importance
of NLP for various scientific disciplines, there has been a demand
for a web and API based flexible software solution for deploying,
managing and monitoring NLP systems. Such a solution is provided
by Docker Unified UIMA-gateway. We introduce DUUIgateway and evaluate
its API and user-driven approach to encapsulation. We also describe
how these features improve the usability and accessibility of
the NLP framework DUUI. We illustrate DUUIgateway in the field
of process modeling in higher education and show how it closes
the latter gap in NLP by making a variety of systems for processing
text and multimodal data accessible to non-experts.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Lücking, Andy; Hammerla, Leon; Mehler, Alexander
Not every quantifier can be negated Proceedings Article Forthcoming
In: Proceedings of Sinn und Bedeutung, Special Session "Philosophical and Linguistic Approaches to Negation (PhilLingNeg)", Frankfurt am Main, Forthcoming, (accepted).
@inproceedings{Luecking:Hammerla:Mehler:2026,
title = {Not every quantifier can be negated},
author = {Andy Lücking and Leon Hammerla and Alexander Mehler},
year = {2026},
date = {2026-01-01},
booktitle = {Proceedings of \textit{Sinn und Bedeutung}, Special Session ``Philosophical
and Linguistic Approaches to Negation (PhilLingNeg)''},
address = {Frankfurt am Main},
series = {SuB'30},
note = {accepted},
keywords = {},
pubstate = {forthcoming},
tppubtype = {inproceedings}
}
Hammerla, Leon; Mehler, Alexander
Negation in Reasoning Traces: Interpretable Signals of Correctness and Provenance Proceedings Article
In: Proceedings of the 6th Workshop on Natural Logic Meets Machine Learning (NALOMA), Prague (Czech Republic), 2026, (accepted).
@inproceedings{Hammerla:Mehler:2026:b,
title = {Negation in Reasoning Traces: Interpretable Signals of Correctness
and Provenance},
author = {Leon Hammerla and Alexander Mehler},
year = {2026},
date = {2026-01-01},
booktitle = {Proceedings of the 6th Workshop on Natural Logic Meets Machine Learning (NALOMA)},
address = {Prague (Czech Republic)},
note = {accepted},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2025
Hammerla, Leon; Lücking, Andy; Reinert, Carolin; Mehler, Alexander
D-Neg: Syntax-Aware Graph Reasoning for Negation Detection Proceedings Article
In: Inui, Kentaro; Sakti, Sakriani; Wang, Haofen; Wong, Derek F.; Bhattacharyya, Pushpak; Banerjee, Biplab; Ekbal, Asif; Chakraborty, Tanmoy; Singh, Dhirendra Pratap (Ed.): Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pp. 1432–1454, The Asian Federation of Natural Language Processing and The Association for Computational Linguistics, Mumbai, India, 2025, ISBN: 979-8-89176-303-6.
@inproceedings{Hammerla:et:al:2025b,
title = {D-Neg: Syntax-Aware Graph Reasoning for Negation Detection},
author = {Leon Hammerla and Andy Lücking and Carolin Reinert and Alexander Mehler},
editor = {Kentaro Inui and Sakriani Sakti and Haofen Wang and Derek F. Wong and Pushpak Bhattacharyya and Biplab Banerjee and Asif Ekbal and Tanmoy Chakraborty and Dhirendra Pratap Singh},
url = {https://aclanthology.org/2025.findings-ijcnlp.89/},
isbn = {979-8-89176-303-6},
year = {2025},
date = {2025-12-01},
booktitle = {Proceedings of the 14th International Joint Conference on Natural
Language Processing and the 4th Conference of the Asia-Pacific
Chapter of the Association for Computational Linguistics},
pages = {1432–1454},
publisher = {The Asian Federation of Natural Language Processing and The Association for Computational Linguistics},
address = {Mumbai, India},
abstract = {Despite the communicative importance of negation, its detection
remains challenging. Previous approaches perform poorly in out-of-domain
scenarios, and progress outside of English has been slow due to
a lack of resources and robust models. To address this gap, we
present D-Neg: a syntax-aware graph reasoning model based on a
transformer that incorporates syntactic embeddings by attention-gating.
D-Neg uses graph attention to represent syntactic structures,
emulating the effectiveness of rule-based dependency approaches
for negation detection. We train D-Neg using 7 English resources
and their translations into 10 languages, all aligned at the annotation
level. We conduct an evaluation of all these datasets in in-domain
and out-of-domain settings. Our work represents a significant
advance in negation detection, enabling more effective cross-lingual
research.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Hammerla, Leon; Mehler, Alexander; Abrami, Giuseppe
Standardizing Heterogeneous Corpora with DUUR: A Dual Data- and Process-Oriented Approach to Enhancing NLP Pipeline Integration Proceedings Article
In: Inui, Kentaro; Sakti, Sakriani; Wang, Haofen; Wong, Derek F.; Bhattacharyya, Pushpak; Banerjee, Biplab; Ekbal, Asif; Chakraborty, Tanmoy; Singh, Dhirendra Pratap (Ed.): Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pp. 1410–1425, The Asian Federation of Natural Language Processing and The Association for Computational Linguistics, Mumbai, India, 2025, ISBN: 979-8-89176-303-6.
@inproceedings{Hammerla:et:al:2025a,
title = {Standardizing Heterogeneous Corpora with DUUR: A Dual Data-
and Process-Oriented Approach to Enhancing NLP Pipeline Integration},
author = {Leon Hammerla and Alexander Mehler and Giuseppe Abrami},
editor = {Kentaro Inui and Sakriani Sakti and Haofen Wang and Derek F. Wong and Pushpak Bhattacharyya and Biplab Banerjee and Asif Ekbal and Tanmoy Chakraborty and Dhirendra Pratap Singh},
url = {https://aclanthology.org/2025.findings-ijcnlp.87/},
isbn = {979-8-89176-303-6},
year = {2025},
date = {2025-12-01},
booktitle = {Proceedings of the 14th International Joint Conference on Natural
Language Processing and the 4th Conference of the Asia-Pacific
Chapter of the Association for Computational Linguistics},
pages = {1410–1425},
publisher = {The Asian Federation of Natural Language Processing and The Association for Computational Linguistics},
address = {Mumbai, India},
abstract = {Despite their success, LLMs are too computationally expensive
to replace task- or domain-specific NLP systems. However, the
variety of corpus formats makes reusing these systems difficult.
This underscores the importance of maintaining an interoperable
NLP landscape. We address this challenge by pursuing two objectives:
standardizing corpus formats and enabling massively parallel corpus
processing. We present a unified conversion framework embedded
in a massively parallel, microservice-based, programming language-independent
NLP architecture designed for modularity and extensibility. It
allows for the integration of external NLP conversion tools and
supports the addition of new components that meet basic compatibility
requirements. To evaluate our dual data- and process-oriented
approach to standardization, we (1) benchmark its efficiency in
terms of processing speed and memory usage, (2) demonstrate the
benefits of standardized corpus formats for NLP downstream tasks,
and (3) illustrate the advantages of incorporating custom formats
into a corpus format ecosystem.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Bundan, Daniel; Abrami, Giuseppe; Mehler, Alexander
Multimodal Docker Unified UIMA Interface: New Horizons for Distributed Microservice-Oriented Processing of Corpora using UIMA Proceedings Article
In: Wartena, Christian; Heid, Ulrich (Ed.): Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Long and Short Papers, pp. 257–268, HsH Applied Academics, Hildesheim, Germany, 2025.
@inproceedings{Bundan:Abrami:Mehler:2025,
title = {Multimodal Docker Unified UIMA Interface: New Horizons for Distributed
Microservice-Oriented Processing of Corpora using UIMA},
author = {Daniel Bundan and Giuseppe Abrami and Alexander Mehler},
editor = {Christian Wartena and Ulrich Heid},
url = {https://aclanthology.org/2025.konvens-1.22/},
year = {2025},
date = {2025-01-01},
booktitle = {Proceedings of the 21st Conference on Natural Language Processing
(KONVENS 2025): Long and Short Papers},
pages = {257–268},
publisher = {HsH Applied Academics},
address = {Hildesheim, Germany},
series = {KONVENS '25},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Abrami, Giuseppe; Genios, Markos; Fitzermann, Filip; Baumartz, Daniel; Mehler, Alexander
Docker Unified UIMA Interface: New perspectives for NLP on big data Journal Article
In: SoftwareX, vol. 29, pp. 102033, 2025, ISSN: 2352-7110.
@article{Abrami:et:al:2025:a,
title = {Docker Unified UIMA Interface: New perspectives for NLP on big data},
author = {Giuseppe Abrami and Markos Genios and Filip Fitzermann and Daniel Baumartz and Alexander Mehler},
url = {https://www.sciencedirect.com/science/article/pii/S2352711024004047},
doi = {10.1016/j.softx.2024.102033},
issn = {2352-7110},
year = {2025},
date = {2025-01-01},
journal = {SoftwareX},
volume = {29},
pages = {102033},
abstract = {Processing large amounts of natural language text using machine
learning-based models is becoming important in many disciplines.
This demand is being met by a variety of approaches, resulting
in the heterogeneous deployment of separate, partly incompatible,
not natively scalable applications. To overcome the technological
bottleneck involved, we have developed Docker Unified UIMA Interface,
a system for the standardized, parallel, platform-independent,
distributed and microservices-based solution for processing large
and extensive text corpora with any NLP method. We present DUUI
as a framework that enables automated orchestration of GPU-based
NLP processes beyond the existing Docker Swarm cluster variant,
and in addition to the adaptation to new runtime environments
such as Kubernetes. Therefore, a new driver for DUUI is introduced,
which enables the lightweight orchestration of DUUI processes
within a Kubernetes environment in a scalable setup. In this way,
the paper opens up novel text-technological perspectives for existing
practices in disciplines that deal with the scientific analysis
of large amounts of data based on NLP.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Bahmanian, Nasimeh; Bruera, Mercedes Martinez; Lücking, Andy; Hammerla, Leon; Abrami, Giuseppe; Sailer, Manfred; Mehler, Alexander; Lago, Sol
Data management protocol for CRC 1629 Technical Report
CRC 1629 NegLaB - INF no. 1, 2025.
@techreport{Bahmanian:et:al:2025,
title = {Data management protocol for CRC 1629},
author = {Nasimeh Bahmanian and Mercedes Martinez Bruera and Andy Lücking and Leon Hammerla and Giuseppe Abrami and Manfred Sailer and Alexander Mehler and Sol Lago},
url = {https://next.hessenbox.de/index.php/s/zQYBAfeXTJSDaib},
year = {2025},
date = {2025-01-01},
number = {1},
institution = {CRC 1629 NegLaB - INF},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}