Trustworthy Human Language Technologies (TrustHLT) is a research group associated with the Professorship of Fairness and Transparency at Ruhr University Bochum and led by Ivan Habernal. TrustHLT started in 2021 as an independent research group at the Technical University of Darmstadt, where some group members remain affiliated.
I hold a W3 Professorship of Fairness and Transparency at Ruhr University Bochum, Germany, jointly affiliated with the Research Center Trustworthy Data Science and Security. My current research areas include privacy-preserving NLP, legal NLP, and explainable and trustworthy models. My research track record spans argument mining and computational argumentation, crowdsourcing, large-scale corpora, serious games, sentiment and sarcasm on social media, and the semantic web.
Lena is currently exploring the research area of computational argumentation in the legal domain.
Sebastian's research areas include privacy-preserving NLP, with a focus on text rewriting with provable guarantees.
Mahammad explores interpretability in NLP applications, focusing on the legal domain.
Erion's research is on privacy-preserving NLP and robust anonymization.
Timour's research areas included privacy-preserving NLP, differential privacy in graph neural networks, and privacy-preserving semantic representations of language.
Jesper's thesis aimed to detect prompt injection attacks on large language models using saliency methods.
Qiankun's research areas included multimodal learning and fact-checking.
Martin's thesis compared privacy-preserving inference methods, applying them to NLP tasks and developing software to connect PyTorch with techniques like homomorphic encryption and garbled circuits.
Marius investigated question answering in the German legal domain. His thesis explored how well existing models can help laypeople obtain initial legal aid, based on a newly created dataset pairing questions in lay language with answers in legalese. → EACL'24 paper
Chris's thesis focused on best practices for adapting differential privacy to NLP settings, putting the needs of end-users first and accounting for perceptual biases to make differential privacy more accessible. → LREC-COLING'24 paper
Lijie is a second-year PhD student in Computer Science at King Abdullah University of Science and Technology. Her research interests cover machine learning algorithms for explainable AI (XAI), differential privacy, and differentially private natural language models. She is also interested in machine unlearning and other data-security issues. → EACL'24 paper
Sudarshan is an undergraduate student in Computer Science from India. His primary research interest is in creating language processing tools that are socially and ethically responsible. He is working on a research project related to differentially private synthetic data generation.
Nina wrote her thesis on privacy-preserving techniques for crowdsourcing sensitive text data. → Linguistic Annotation Workshop (ACL'23) paper
Johanna studied computer science at TU Darmstadt. In her bachelor thesis, she compiled an easily accessible legal benchmark dataset to enable evaluating models on a variety of legal NLP tasks.
Lars, student of information systems technologies, cooperated with political scientists to identify indoctrination in German history textbooks through entity emotion analysis.
Ying explored privacy-preserving transformer models in the legal domain. Her thesis combined large-scale pre-training with differential privacy and evaluated the trade-off between privacy preservation and downstream performance. → Legal NLP workshop (EMNLP'22) paper
Sarah explored ethical argumentation in scientific literature. Her thesis focused on controversial technologies and automatic mining of absent, shifting, and evolving ethical arguments.
Manuel was a bachelor's student at TU Darmstadt focusing on machine learning. He wrote his thesis on the effectiveness of differential privacy in NLP and its impact on accuracy. → EMNLP'22 paper
Lena studied computer science at TU Darmstadt. In her thesis she dealt with differentially private language representation learning.
Daniel explored legal argument mining in court decisions, focusing on ECHR decisions and their argumentation patterns in relation to their importance level. → AI & Law journal paper
Fabian's research area included legal argument mining, expert annotations, and low-resource and few-shot transfer learning for annotation recommendations.
TrustHLT currently has the following open positions:
PhD student (m/f/x) on Trust and Privacy in Large Language Models, 3 years, full-time, TV-L E13. Read the full job posting.
The Artificial Intelligence & Law journal, a leading outlet for interdisciplinary work in the area of legal NLP, just accepted a paper by Lena Held and Ivan Habernal entitled "LaCour!: Enabling Research on Argumentation in Hearings of the European Court of Human Rights". The preprint is available on arXiv and the datasets can be easily downloaded from Hugging Face.
EMNLP 2024, the top-tier conference on NLP, accepted two papers on privacy-related topics co-authored by TrustHLT: "Granularity is crucial when applying differential privacy to text" by Doan Nam Long Vu, Timour Igamberdiev, and Ivan Habernal, and "Private Language Models via Truncated Laplacian Mechanism" by Tianhao Huang, Tao Yang, Ivan Habernal, Lijie Hu, and Di Wang.
I have been appointed as full professor (W3-Forschungsprofessor, research professorship) at the Faculty of Computer Science, Ruhr-Universität Bochum and the Research Center Trustworthy Data Science and Security.
LREC-COLING 2024, the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, accepts one long paper by Chris Weiss, Frauke Kreuter (LMU München) and Ivan Habernal on user perception of privacy guarantees in NLP datasets.
The 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024) accepts one long paper by Marius Büttner and Ivan Habernal (main conference) entitled "Answering legal questions from laymen in German civil law system", one demo-paper by Timour Igamberdiev, Doan Nam Long Vu, Felix Künnecke, Zhuo Yu, Jannik Holmer, and Ivan Habernal entitled "DP-NMT: Scalable Differentially-Private Machine Translation", and one EACL Findings paper by Lijie Hu et al., co-authored by Ivan Habernal, entitled "Differentially Private Natural Language Models: Recent Advances and Future Directions".
I'm giving an invited talk at the 2023 Annual Conference of efl – the Data Science Institute in Frankfurt am Main.
I'm starting a new position as W2 Professor of Natural Language Processing at Paderborn university.
The Text, Speech and Dialogue (TSD 2023) conference invited me to the beautiful city of Pilsen, Czech Republic, to give a keynote talk on privacy in NLP.
Timour Igamberdiev successfully defended his dissertation on Differential Privacy in NLP with the grade magna cum laude. Timour is the first PhD student graduating from the TrustHLT group!
TrustHLT has two papers on privacy-preserving NLP accepted to the Findings of the Association for Computational Linguistics: ACL 2023, co-authored by Timour Igamberdiev, Cleo Matzken, and Steffen Eger.
We held a tutorial on Privacy-Preserving Natural Language Processing at the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) in Dubrovnik, Croatia. The slides are available at GitHub.
It was my pleasure to give an invited talk about Privacy-Preserving Natural Language Processing at the Aalto University in Helsinki. The video recording should soon become available.
The 17th Conference of the European Chapter of the Association for Computational Linguistics will host our tutorial on Privacy-Preserving Natural Language Processing in Dubrovnik, in May 2023.
Our new paper "One size does not fit all: Investigating strategies for differentially-private learning across NLP tasks" by Manuel Senge, Timour Igamberdiev, and myself will be presented at the 2022 Conference on Empirical Methods in Natural Language Processing in Abu Dhabi in December this year.
In this winter term, I'm holding a W2 interim professorship at the Center for Information and Language Processing at the Ludwig-Maximilians-Universität München.
Our new paper "DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting" by Timour Igamberdiev (TrustHLT), Thomas Arnold (UKP), and myself will be presented at the 29th International Conference on Computational Linguistics in Korea in October this year.
I'm now a member of hessian.AI — The Hessian Center for Artificial Intelligence. Its mission is to drive research excellence, education, practice and leadership in AI to foster economic growth and improve the human condition.
Our paper on protecting the privacy of models trained on graph data using differential privacy has been accepted at the International Conference on Language Resources and Evaluation (LREC) to be held in Marseille, France in June.
Our paper analyzing trickiness of differentially-private text representation learning will be presented at the 60th Annual Meeting of the Association for Computational Linguistics, the world's top conference for natural language processing.
I'm giving an invited lecture at the School of Computing and Information Science, University of Maine, with the slightly provocative title "If all you have is a hammer, everything looks like a nail: SGD-DP in privacy-preserving NLP" (download slides).
Our paper on the pitfalls of differential privacy in NLP will be presented at the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), one of the world's leading conferences for natural language processing.
I'll be giving a guest lecture at the International Summer School on "AI and Criminal Justice" in Rome on July 12th. This summer school is a great opportunity to acquire interdisciplinary and in-depth knowledge in the cutting-edge area of AI and criminal justice.
I'm happy to volunteer as a mentor for early career researchers at this year's Conference of the European Chapter of the Association for Computational Linguistics (EACL). One of the topics on the agenda is "How to survive grad school", I'm very much looking forward to some fresh perspectives!
Thanks to Yang Gao for inviting me over to Royal Holloway, University of London, to give an invited talk on privacy-preserving NLP, joint work with Timour Igamberdiev. Slides available here.
Happy to join the Area Chairs for sentiment analysis and argument mining at this year's Conference on Empirical Methods in Natural Language Processing (EMNLP).
I happily accepted an invitation to join the standing reviewer board of Computational Linguistics, the "longest-running publication devoted exclusively to the computational and mathematical properties of language".
Together with Isabelle Augenstein and the tutorial chairs for NAACL, EMNLP, and ACL-IJCNLP, we are preparing next year's selection of tutorials to be presented either virtually or in person.
In this interdisciplinary collaboration, we look into argumentation in the verdicts of the European Court of Human Rights. What makes a verdict of high importance? Is it the facts? Is it the argumentation pattern? Is it the judges? Or is it something left between the lines?
We combine legal expertise with state-of-the-art NLP.
We collaborate with the expert legal researcher Prof. Dr. Christoph Burchard from Goethe University Frankfurt.
Chair for German, European and International Criminal Law and Procedure, Comparative Law and Legal Theory
What does it mean for machine translation models to protect privacy? What personal information do neural machine translation systems leak? Can we protect users during inference?
In this research project, supported by the Hessisches Ministerium des Innern und für Sport, we tackle privacy-preserving natural language processing in the context of machine translation, including differential privacy and cryptographic tools.
The goal of this project is to explore Natural Language Processing methods that can dynamically identify and obfuscate sensitive information in texts, with a focus on implicit attributes of individuals, for example their ethnic background, income range, or personality traits. These methods will help preserve the privacy of all individuals, both authors and other persons mentioned in the text. Further, we go beyond specific text sources, such as social media, and aim to develop robust and highly adaptable methods that can generalize across domains and registers.
We collaborate with the UKP Lab led by Prof. Dr. Iryna Gurevych.
Director of the Ubiquitous Knowledge Processing (UKP) Lab
Slides are freely available at GitHub under open licences.
I like transparency, so here are the official evaluation sheets of my courses (names of third parties were redacted).
Recorded lectures from RUB (winter term 2024/25) are in this YouTube playlist.
Previous recorded lectures from Paderborn (winter term 2023/24) are in this YouTube playlist.
Send me an e-mail
Prof. Dr. Ivan Habernal
Research Center Trustworthy Data Science and Security and
Faculty of Computer Science, Ruhr-Universität Bochum
Office MC 5OG 135, Universitätsstraße 140, D-44799 Bochum, Germany
ivan (dot) habernal (at) ruhr-uni-bochum.de