Orchestration and Verification of AI Agents: Designing Human Handoffs in Agentic AI Systems

    Motivation

    Recent advances in large language models and agentic AI systems are transforming knowledge work. In many settings, humans may no longer primarily read, extract, and process information themselves, but instead orchestrate AI agents, verify generated outputs, and intervene when necessary. Prior work suggests that this shift represents a new mode of work in which complexity is not eliminated, but redistributed from direct task execution toward supervision, coordination, and verification of AI-generated results. 

    At the same time, this development raises important open questions: How should tasks be distributed between humans and AI agents? Under which conditions should an AI system continue autonomously, and when should it delegate a task back to a human? How can humans effectively verify AI outputs and remain meaningfully in control? These questions are becoming increasingly relevant as organizations begin to experiment with agentic AI systems in knowledge-intensive and high-stakes work contexts.

    Background

    Emerging research on AI-supported knowledge work indicates that human work is shifting from a read-and-extract mode toward an orchestrate-and-verify mode. Humans may increasingly coordinate AI-based systems, assess their outputs, and selectively intervene in cases of uncertainty, ambiguity, or risk. 

    However, this shift should not be understood purely as progress. It may also create new challenges, such as shallow oversight, overreliance on AI-generated outputs, deskilling, unclear accountability, and new forms of cognitive burden. Therefore, the topic requires both constructive and critical investigation: how should such systems be designed, and what are their limits?

    Goal

    The goal of this thesis is to investigate how human handoffs in agentic AI systems can be understood, designed, and evaluated. The thesis should examine how humans orchestrate and verify AI agents and under which conditions AI systems should delegate tasks back to humans.

    The exact focus can be adapted depending on the student’s interests. In particular, the thesis can follow one of the following directions:

     

    Possible Thesis Options

    1. Design Science Research: Tool Support for Orchestration and Verification: This option investigates how a tool or interface could support humans in orchestrating and verifying AI agents. Possible steps include expert interviews, derivation of design requirements, prototype development, and evaluation. Example questions include: What information should a system provide to support human verification? How can handoffs, uncertainty, and overrides be communicated in a useful way?
    2. Delegation Logic: When Should AI Hand Tasks Back to Humans? This option focuses on the decision logic of human handoffs. The thesis could study how AI agents should decide when to continue autonomously and when to escalate or delegate a task back to a human. This may be explored in a benchmark setting, through conceptual modeling, or from a decision-theoretic or game-theoretic perspective. Example questions include: Should delegation depend only on uncertainty, or also on task criticality, explainability, or human workload?
    3. Structured Literature Review: AI Delegation, Orchestration, and Verification: This option systematically reviews the literature on AI task delegation, AI orchestration, verification of AI outputs, and human oversight. The goal could be to derive a taxonomy, framework, or research agenda on human handoffs in agentic AI systems.
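    The trade-offs raised in option 2 can be made concrete with a minimal scoring rule. The sketch below is purely illustrative: the factor names (uncertainty, criticality, human workload), the weights, and the threshold adjustment are assumptions chosen for the example, not a prescribed design; a thesis might instead derive such a rule from decision-theoretic principles or calibrate it empirically.

    ```python
    from dataclasses import dataclass

    @dataclass
    class TaskContext:
        uncertainty: float     # agent's self-estimated uncertainty, in [0, 1]
        criticality: float     # stakes of the task, in [0, 1]
        human_workload: float  # current load of the human supervisor, in [0, 1]

    def should_hand_off(ctx: TaskContext, threshold: float = 0.5) -> bool:
        """Return True if the agent should delegate the task back to a human."""
        # Weighted risk score: high uncertainty on a critical task pushes
        # toward escalation (weights 0.6 / 0.4 are arbitrary assumptions).
        risk = 0.6 * ctx.uncertainty + 0.4 * ctx.criticality
        # A heavily loaded human raises the escalation bar slightly, so the
        # agent interrupts less often when supervision capacity is scarce.
        effective_threshold = threshold + 0.2 * ctx.human_workload
        return risk > effective_threshold

    # An uncertain agent on a high-stakes task escalates;
    # a confident agent on a routine task continues autonomously.
    print(should_hand_off(TaskContext(uncertainty=0.9, criticality=0.8, human_workload=0.1)))
    print(should_hand_off(TaskContext(uncertainty=0.1, criticality=0.2, human_workload=0.5)))
    ```

    Even this toy rule illustrates why delegation cannot depend on uncertainty alone: the same uncertainty level can warrant escalation on a critical task but not on a routine one, and the human's capacity to absorb handoffs shifts the decision boundary.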

     

    Expected Contribution

    Depending on the chosen direction, the thesis is expected to contribute by developing:

    • a prototype for human orchestration and verification of AI agents,
    • a conceptual model or formal perspective on AI-to-human delegation,
    • a taxonomy or framework for human handoffs in agentic AI systems, or
    • design requirements and research gaps for future human–AI collaboration.

    Critical Perspective

    A strong thesis should not only examine how orchestration and verification can be enabled, but also critically assess their implications. In particular, the thesis may investigate whether this new work mode genuinely supports humans or whether it risks shifting complexity, responsibility, and decision pressure in problematic ways.

    Methodological Approaches

    Possible methods include:

    • qualitative interviews,
    • case study research,
    • design science research,
    • prototype development and evaluation,
    • benchmark-based experimentation,
    • conceptual or game-theoretic modeling,
    • structured or systematic literature review.

    Requirements / Recommended Profile

    The topic is suitable for students with an interest in one or more of the following areas:

    • human–AI collaboration,
    • information systems and digital work,
    • AI agents and multi-agent systems,
    • human oversight and verification,
    • design science research,
    • conceptual modeling or formal analysis.

    Prior experience with qualitative research, prototyping, or AI-related topics is helpful, but not strictly required, depending on the selected thesis option.

    Contact

    If you are interested, please send a current transcript of records, a short CV, and a brief motivation (2–3 sentences) to moritz.diener@kit.edu.

    Students interested in this topic are welcome to reach out to discuss possible focus areas and methodological fit.

    Literature

    • Tomašev, Franklin, & Osindero (2026), Intelligent AI Delegation, https://arxiv.org/abs/2602.11865
    • Amershi et al. (2019), Guidelines for Human-AI Interaction, https://dl.acm.org/doi/10.1145/3290605.3300233
    • Shneiderman (2020), Human-Centered Artificial Intelligence: Three Fresh Ideas, https://doi.org/10.17705/1thci.00131
    • Kudina & van de Poel (2024), A Sociotechnical System Perspective on AI, https://link.springer.com/article/10.1007/s11023-024-09680-2
    • Krakowski (2025), Human-AI Agency in the Age of Generative AI, https://www.sciencedirect.com/science/article/pii/S1471772725000065
    • Rabanser et al. (2026), Towards a Science of AI Agent Reliability, https://arxiv.org/abs/2602.16666
    • Zhang et al. (2026), Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution, https://arxiv.org/abs/2603.11445