Designing Organizational Harnesses for LLM Agents: Integrating Company Knowledge, Rules, and Norms
- Type: Bachelor's thesis / Master's thesis
- Date: Immediately
- Supervisor:
Motivation
Large language models are increasingly used as agents that not only generate text but also retrieve information, use tools, and act within organizational processes. In practice, however, these systems rarely work well on the basis of the foundation model alone. Their behavior depends heavily on the surrounding harness: the runtime layer that provides relevant company information, enforces boundaries, structures memory, and governs how the model interacts with tools and data. Recent work and product developments increasingly describe this surrounding layer as a central factor in making agents useful and reliable in real-world settings. OpenAI, for example, now explicitly describes its updated Agents SDK as providing a model-native harness with configurable memory, sandbox-aware orchestration, and support for files, tools, and custom instruction structures such as AGENTS.md.
This development is especially relevant in enterprise settings. Companies want LLM-based systems to act in line with internal policies, domain-specific knowledge, compliance requirements, and organizational norms. Yet many current approaches rely on relatively simple prompt files, markdown instructions, or manually maintained knowledge snippets. Such approaches can be useful, but they raise important questions: How do companies actually design these harnesses in practice? What kinds of organizational knowledge, rules, and norms do they encode? Where are the limitations of static instruction files? And how can more structured technical harnesses help models stay within desired boundaries while still remaining useful and flexible?
Background
A growing body of recent work suggests that LLM agents are increasingly built through externalization: instead of changing model weights, developers move important capabilities into surrounding infrastructure such as memory, skills, protocols, and harness logic. In parallel, new protocols such as the Model Context Protocol (MCP) aim to standardize how AI systems connect to external tools and data sources, and recent agent platforms emphasize traces, tools, context management, and controlled execution environments as first-class building blocks. Together, these developments suggest that the future of enterprise AI may depend less on prompts alone and more on how organizations engineer the runtime around the model.
From an information systems perspective, this is an important and underexplored topic. The harness is not merely technical glue code. It is the place where organizational knowledge, control, governance, and work design become operationalized. This makes the topic relevant both for empirical research on how companies currently design agent systems and for design-oriented work on how such systems should be built.
Goal
The goal of this thesis is to investigate how organizations can design model harnesses that provide LLM-based systems with relevant company information while also encoding rules, norms, and boundaries for appropriate behavior.
The exact focus can be adapted depending on the student’s interests. In particular, the thesis can follow one of the following directions:
Possible Thesis Options
1. Interview Study: How Companies Design Organizational Harnesses for LLMs
This option investigates how companies currently design the runtime around LLM-based systems to include organizational knowledge, rules, and norms. The thesis could examine questions such as: What kinds of information are included in a model harness? How are business rules, compliance constraints, or organizational norms represented? How do practitioners balance flexibility and control? What are common problems and trade-offs in current approaches?
Possible steps include:
- literature review on LLM agents, harness engineering, and enterprise AI governance,
- identification of relevant organizations or practitioners,
- development of an interview guide,
- qualitative interviews and coding,
- derivation of design patterns, challenges, or a taxonomy of organizational harness practices.
A possible outcome could be a structured framework showing how companies currently design organizational harnesses, what elements they encode, and where current approaches fall short.
2. Design Science / Prototype Development: A Structured Model Harness Beyond Markdown Files
This option focuses on the design and implementation of a more structured technical harness that goes beyond static markdown files or ad hoc prompt instructions. The goal is to develop and evaluate an artifact that gives an LLM access to relevant company information while also communicating rules, norms, and behavioral boundaries in a more explicit and maintainable way.
Possible directions include:
- designing a representation for company rules, norms, and contextual information,
- creating a structured retrieval or constraint layer for LLM-based systems,
- implementing a prototype harness that combines context provision with boundary-setting,
- evaluating the approach in benchmark tasks or selected enterprise-like scenarios.
Example questions include: How can company rules be represented in a machine-usable but flexible format? How should organizational norms be distinguished from hard constraints? How can the harness decide what the model is allowed to see, say, or do? And how can such a harness improve controllability compared to simple markdown-based instructions?
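To make the distinction between hard constraints and soft norms concrete, the following minimal sketch shows one possible machine-usable rule representation in Python. All names and fields here are hypothetical illustrations, not part of any existing framework: a harness-side gate evaluates a proposed agent action against a rule set, blocking on hard-rule violations and merely flagging soft-norm violations for logging or escalation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class RuleKind(Enum):
    HARD = "hard"  # a violation blocks the action outright
    SOFT = "soft"  # a violation is flagged for review, not blocked

@dataclass
class Rule:
    rule_id: str
    kind: RuleKind
    description: str
    applies: Callable[[dict], bool]   # does this rule cover the action?
    violated: Callable[[dict], bool]  # does the action violate the rule?

def evaluate(action: dict, rules: list[Rule]) -> tuple[bool, list[str]]:
    """Return (allowed, flagged_rule_ids) for a proposed agent action."""
    flagged: list[str] = []
    for rule in rules:
        if rule.applies(action) and rule.violated(action):
            if rule.kind is RuleKind.HARD:
                return False, [rule.rule_id]  # hard rules stop evaluation
            flagged.append(rule.rule_id)      # soft norms only warn
    return True, flagged

# Hypothetical example rules: one hard compliance constraint, one soft norm.
rules = [
    Rule("no-external-pii", RuleKind.HARD,
         "Never send personal data to external tools",
         applies=lambda a: a.get("target") == "external",
         violated=lambda a: a.get("contains_pii", False)),
    Rule("prefer-internal-wiki", RuleKind.SOFT,
         "Prefer internal documentation over web search",
         applies=lambda a: a.get("tool") == "web_search",
         violated=lambda a: True),
]

allowed, notes = evaluate({"tool": "web_search", "target": "external"}, rules)
```

Even this toy version surfaces the design questions raised above: whether rules are predicates, declarative policies, or retrieved documents; how escalation replaces blocking for soft norms; and who maintains the rule set as policies change.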
A possible outcome could be a prototype and evaluation showing how structured harness design can improve reliability, consistency, and organizational fit of LLM systems.
Expected Contribution
Depending on the chosen direction, the thesis is expected to contribute by developing:
- an empirical understanding of how companies currently design model harnesses for enterprise AI,
- a taxonomy of organizational harness elements such as company information, hard rules, soft norms, permissions, and escalation logic,
- design requirements for enterprise-ready harnesses,
- a prototype for a structured harness beyond static markdown or prompt files, or
- insights into how organizational control and flexibility can be balanced in LLM-based systems.
A strong thesis should not only investigate how such harnesses can be built, but also critically reflect on their limitations. For example, the thesis may assess whether organizational harnesses genuinely improve reliability and governance, or whether they simply shift complexity into brittle external structures that are difficult to maintain.
Methodological Approaches
Possible methods include:
- qualitative interviews,
- multiple-case study research,
- thematic coding / qualitative content analysis,
- design science research,
- prototype development and evaluation,
- benchmark-based experimentation,
- conceptual modeling,
- comparative evaluation of different harness designs.
Requirements / Recommended Profile
The topic is suitable for students with an interest in one or more of the following areas:
- LLM agents and agentic AI,
- enterprise AI and information systems,
- human-AI collaboration and governance,
- knowledge management and organizational rules,
- qualitative empirical research,
- design science research and prototyping.
For the interview-based direction, prior experience with qualitative methods is helpful. For the technical direction, prior experience with Python, LLM application development, retrieval systems, or agent frameworks is beneficial. Depending on the selected thesis option, either a stronger empirical or a stronger technical profile may be a good fit.
Contact
If you are interested, please send a current transcript of records, a short CV, and a brief motivation (2–3 sentences) to moritz.diener@kit.edu.
Students interested in this topic are welcome to reach out to discuss possible focus areas and methodological fit.
Literature
Suggested starting points include:
- Zhou et al. (2026), Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
- OpenAI (2026), The Next Evolution of the Agents SDK
- OpenAI, Agents SDK Documentation
- Anthropic (2025), Donating the Model Context Protocol and establishing the Agentic AI Foundation
- Lou et al. (2026), AutoHarness: Improving LLM Agents by Automatically Synthesizing a Code Harness
- Hevner et al. (2004), Design Science in Information Systems Research
