Background
Large Language Models (LLMs) such as ChatGPT, Claude, and Gemini are increasingly used in both professional and everyday contexts. Their answers often appear fluent and confident, but they are not always correct. In fact, LLMs can produce hallucinations: plausible-sounding but factually wrong outputs. This raises a critical challenge for deploying AI systems in high-stakes or decision-support scenarios: How can we assess the reliability of what LLMs generate?
One promising direction is uncertainty quantification. By estimating how “sure” a model is about its own answer, we can detect hallucinations early and improve the calibration between model confidence and correctness. Beyond the technical side, communicating uncertainty effectively is equally important: end-users need to understand when they can rely on an answer and when caution is warranted. Ultimately, the goal is to enable users to make better-informed decisions.
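To make this concrete, here is a minimal sketch of one very simple uncertainty signal: asking the model the same question several times and treating the agreement among sampled answers as a rough confidence score. Everything in the sketch (the `agreement_confidence` function, the `generate` callable, the stubbed `fake_llm`) is an illustrative assumption rather than part of this project; a real setup would wrap an actual LLM call sampled with temperature > 0.

```python
from collections import Counter
import random

def agreement_confidence(generate, prompt, n_samples=10):
    """Crude uncertainty proxy: sample the same prompt several times and use
    the relative frequency of the most common answer as a confidence score.

    generate: any callable prompt -> answer string, e.g. a wrapper around an
              LLM API called with temperature > 0 so that samples can differ.
    """
    answers = [generate(prompt).strip().lower() for _ in range(n_samples)]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    return top_answer, top_count / n_samples  # 1.0 = full agreement, low = unstable answer

# Demo with a stubbed "LLM"; a real setup would call an actual model instead.
def fake_llm(prompt):
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])

answer, confidence = agreement_confidence(fake_llm, "What is the capital of France?")
print(answer, confidence)  # low agreement would flag a possible hallucination
```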
This thesis offers a unique opportunity to explore uncertainty quantification at different levels. Possible directions range from technical advances (e.g., improving uncertainty estimation algorithms or calibration methods) to applications in reasoning chains (detecting uncertainty across multi-step answers) and human-centered perspectives (designing interfaces that communicate uncertainty and studying its impact on user trust and decision-making).
In this thesis you will:
- Investigate state-of-the-art approaches for uncertainty quantification in AI and LLMs
- Experiment with methods for hallucination detection and model calibration (see the calibration sketch after this list)
- Explore applications such as reasoning chains or uncertainty communication in user interfaces
- Evaluate your approaches through experiments with LLMs and/or user studies
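For the calibration part in particular, one common diagnostic is the Expected Calibration Error (ECE), which bins answers by confidence and compares each bin's average confidence with its empirical accuracy. The sketch below is a generic illustration and assumes per-answer confidence scores and correctness labels are already available; it is not a prescribed method for the thesis.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected Calibration Error (ECE): bin predictions by confidence and
    compare each bin's average confidence with its empirical accuracy.

    confidences: confidence scores in [0, 1], one per answered question
    correct:     1/0 indicators of whether the corresponding answer was right
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        avg_confidence = confidences[in_bin].mean()
        accuracy = correct[in_bin].mean()
        ece += in_bin.mean() * abs(avg_confidence - accuracy)  # weight by bin size
    return ece

# Toy example: overconfident wrong answers widen the gap between confidence and accuracy.
print(expected_calibration_error([0.9, 0.8, 0.95, 0.6], [1, 0, 1, 1]))
```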
Research impact
Your findings will contribute directly to safer and more transparent AI systems:
- Hallucination detection: Identify when LLMs generate unreliable outputs
- Calibration improvement: Align model confidence with actual correctness
- User empowerment: Provide users with interpretable signals of uncertainty to support decision-making
- Trustworthy AI: Advance methods that make AI more reliable in real-world applications
We are looking for candidates with:
- A passion for trustworthy and responsible AI
- A technical foundation: Experience with Python and machine learning
- Self-motivation: The ability to work independently while contributing creative ideas
- Strong communication skills: Excellent English for writing and presenting your research
Details
Start: October 2025 (or later)
Duration: 6 months
Language: English
How to Apply
We offer you a cutting-edge research topic at the intersection of AI safety, model calibration, and human-AI interaction, close mentorship from experienced researchers, and access to modern LLMs and computational resources. You’ll gain both theoretical insights and practical skills that are highly valued in today’s AI-driven job market.
Ready to advance the future of trustworthy AI?
Please send your current transcript of records, a short CV, and a brief motivation (3–4 sentences) to: