Foundational Paper

Designing Agent Behavior Before Runtime

The missing discipline between institutional intent and agent implementation

Arash Nourkeyhani · June 2026 · Version 1.0

How can enterprises stand behind their agentic workflows with confidence? How can they demonstrate that an agent's behavior was deliberate, defensible, and appropriate to the consequence it carried?

These questions have been put to me repeatedly in recent months by senior technology and design leaders—not as thought experiments, but as unresolved operational problems. Their teams can orchestrate agents, monitor them, log them, evaluate them, and investigate them after something goes wrong. What they often cannot produce is a durable record of how those agents were meant to behave.

Somewhere in an organization, an agent is about to take an action no one explicitly decided it should take. Not because anyone was careless, but because the behavior was never deliberately designed. It emerged from a system prompt written under deadline and subsequently edited a dozen times, with no durable record of the intent behind it.

Ask after an incident who decided that the agent was permitted to act that way, and the honest answer in many enterprises today is: no one, exactly.

After fifteen years of designing product experiences for complex enterprise environments, including AI-enabled systems in consequential workflows, I have become convinced that this is a distinct design problem.

It is not primarily a question of interface, automation, or intelligence. It is the problem of how an autonomous system should behave when its actions carry institutional consequence.

It exposes a blind spot in enterprise AI and points to a principle that I believe will define the next phase of the discipline:

Intent should never be discovered in production.

The layer between policy and implementation

Much of the industry's attention is concentrated on runtime and post-runtime: orchestration, observability, technical guardrails, audit logs, evaluations, monitoring, and incident review.

All of these are necessary.

None of them, by themselves, establishes what an agent was deliberately intended and permitted to do.

Monitoring an agent after it acts is not the same as deciding how it should act. Evaluating observed behavior is not the same as authoring behavioral intent. An audit log can reconstruct what happened, but it cannot establish whether the action was consistent with a decision the institution actually made beforehand.

Frontier AI laboratories have begun specifying behavior before runtime through model specifications, constitutions, and policy hierarchies. That work matters, but it governs the model in general.

It cannot determine what a contract-review agent, claims agent, tax agent, procurement agent, or financial-services agent is permitted to do inside a particular workflow, under a particular institution's obligations, in front of its clients.

The model provider establishes a behavioral floor. The enterprise must define the workflow-level intent and operating boundaries within the legal, regulatory, professional, and contractual constraints it bears.

Figure 1 — The scope of behavioral authorship

The figure distinguishes model-level behavior, established through model specifications and general policy hierarchies, from workflow-level behavior, which defines what an agent is expected and permitted to do inside a particular institutional workflow.

Governing statement: The model provider establishes the behavioral floor. The enterprise defines the workflow-level intent.

This work sits alongside model policies, agent configurations, technical control specifications, evaluation frameworks, governance structures, monitoring systems, and audit mechanisms.

These mechanisms are complementary, but they perform different functions.

A model specification defines general model behavior. An agent configuration describes how an agent and its workflow are assembled. A technical control specification identifies where and how a policy will be enforced. An evaluation tests observed behavior against a requirement. Monitoring and audit reveal what occurred in operation.

The missing function is upstream of all of them.

It is the deliberate authorship of the workflow objective, consequential decisions, behavioral directives, authority boundaries, escalation logic, refusal conditions, conduct, and accountability allocation that those mechanisms are expected to realize.

The claim is not that no related work exists.

The claim is that workflow-level institutional intent still lacks a consistently owned, cross-functional, implementation-ready artifact in many enterprises.

That is the layer between policy and implementation.

Preparation, execution, and review

Every mature consequential discipline distinguishes preparation from execution and review.

Aviation does not begin with the flight. It begins with the flight plan, fuel calculation, abnormal-procedure review, and crew briefing. The flight is execution under uncertainty. The debrief is where the institution examines what occurred and determines what must change.

Each phase has a different purpose, a different artifact, and a different standard of rigor.

In many agentic AI programs, these phases remain collapsed. Teams write the prompt, connect the tools, deploy the workflow, and rely on runtime to reveal what the behavior actually is.

Figure 2 — Three phases of consequential discipline

The figure compares aviation and agentic AI across preparation, execution, and review. Preparation corresponds to pre-runtime, where intended behavior is deliberately designed before engineering implementation. Execution corresponds to runtime, where the system operates under uncertainty through its models, tools, orchestration, and controls. Review corresponds to post-runtime, where monitoring, evaluation, audit, and incident review determine what occurred and what should change.

Three phases of consequential discipline:

  • Preparation (pre-runtime): flight plan, fuel calculation, abnormal-procedure review, crew briefing. Owned by Anchored Agency.
  • Execution (runtime): the flight itself; execution under uncertainty. Well attended.
  • Review (post-runtime): debrief, where the institution learns. Well attended.

In agentic AI, the three phases have been collapsed. Anchored Agency restores pre-runtime as a discipline.

Runtime still has an essential role. Agentic systems are probabilistic, and their behavior must be tested against real conditions, technical limitations, and edge cases that no specification can fully anticipate.

That is not the problem.

The problem is using runtime to discover institutional intent.

A flight plan does not eliminate the surprises of flight. It ensures that the crew did not decide the destination in mid-air.

The rigor of preparation should scale across three variables: consequence, reversibility, and autonomy.

The greater the institutional consequence of an action, the more explicit its behavioral requirements should become.

The more difficult the action is to reverse, the stronger its evidence and authorization conditions should be.

The more independently an agent may act, the clearer its authority boundaries, escalation logic, refusal conditions, and accountability allocation must become.

This does not mean attempting to enumerate every possible response or eliminating adaptive behavior.

It means specifying material behavior: the decisions, boundaries, conditions, and forms of conduct where deviation could create institutional consequence.

A low-consequence assistant that produces easily reversible suggestions may require a lightweight behavioral definition. An agent that can move funds, alter contractual positions, shape professional judgment, affect customer rights, create institutional liability, or initiate regulated actions requires substantially greater rigor.
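One way to make the scaling rule concrete is a small classification helper. The sketch below is illustrative only: the three-level scale, the tier names, and the "highest variable wins" rule are assumptions introduced for the example, not part of the practice as described here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ActionProfile:
    """Hypothetical profile of one agent action; field names are illustrative."""
    consequence: Level      # institutional consequence if the action is wrong
    irreversibility: Level  # how difficult the action is to reverse
    autonomy: Level         # how independently the agent may take the action


def required_rigor(profile: ActionProfile) -> str:
    """Return an illustrative rigor tier driven by the highest of the three variables."""
    highest = max(profile.consequence, profile.irreversibility, profile.autonomy)
    tiers = {Level.LOW: "lightweight", Level.MEDIUM: "standard", Level.HIGH: "full"}
    return tiers[highest]


# An easily reversible suggestion versus an action that moves funds.
suggestion = ActionProfile(Level.LOW, Level.LOW, Level.LOW)
funds_transfer = ActionProfile(Level.HIGH, Level.HIGH, Level.MEDIUM)

print(required_rigor(suggestion))
print(required_rigor(funds_transfer))
```

Taking the maximum rather than an average is a deliberate choice in this sketch: a single high variable, such as an irreversible action, is enough to require the fuller definition.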

The purpose is not to script every action.

It is to prevent institutional authority from emerging accidentally.

In consequential workflows, behavior is no longer an implementation detail. Behavior is the product.

The questions that moved upstream

In traditional software, design concentrated heavily on screens, flows, states, and interactions.

In agentic systems, the critical design questions move further upstream—to the behavior itself and to the relationships it structures across agent and person, agent and system, and agent and agent.

  • When should the agent act, defer, escalate, or refuse?
  • What authority does it possess, and what authority must it never possess?
  • Which decisions may it make independently?
  • Which decisions require verification or authorization?
  • What evidence must exist before an action is allowed?
  • What happens when required information is missing, contradictory, or unreliable?
  • How should the agent communicate uncertainty?
  • How should it behave after making a mistake?
  • Who verifies the inputs, who authorizes the consequential action, who audits the outcome, and who bears accountability when the system is wrong?

These are not merely interface questions.

They are questions of behavior, authority, conduct, and institutional responsibility.

Many organizations currently answer them implicitly. Engineering writes the system prompt. Product writes the requirements. Design is brought in after core behavior has already been established. Operations and domain teams are consulted unevenly. Risk and legal may review late, after authority and escalation decisions have already been embedded in the system.

That operating model is not adequate for consequential agentic AI.

In these workflows, the system prompt is no longer merely technical instruction. It is one implementation surface for a behavioral contract that should already have been defined in an Agent Behavior Specification.

The prompt is not the contract itself.

Neither is the workflow graph, tool configuration, approval gate, policy engine, or evaluation suite.

They are implementation and verification surfaces through which agreed behavior is realized and tested.

When the underlying behavior is defined informally, late, or by one function alone, the enterprise has deployed an autonomous system whose behavior was never deliberately designed.

Governance, monitoring, evaluation, and audit are necessary. They do not replace the deliberate authorship of agent behavior before implementation.

That work must be cross-functional.

Product, design, engineering, operations, and domain teams should author the intended behavior together, with risk and legal reviewing applicable boundaries where necessary.

No single function owns the discipline.

Product contributes the workflow objective, value model, and intended outcome.

Design contributes expertise in human interaction, conduct, legibility, judgment, recovery, and trust.

Engineering contributes technical feasibility, system architecture, and the means by which behavior can be implemented and tested.

Operations contributes the realities of execution, intervention, exception handling, and organizational capacity.

Domain teams contribute professional knowledge, consequential context, and the standards against which decisions must be judged.

Risk and legal review applicable boundaries. They do not substitute for the cross-functional authorship of workflow behavior.

You cannot stand behind a system you never deliberately shaped.

The pre-runtime phase and its artifact

What remains missing in many enterprises is a named, cross-functional discipline for this work.

The category is pre-runtime agent behavior design: the deliberate specification of how an agent should behave before engineering implementation and before runtime.

A product requirements document will not carry this work by itself.

A system prompt will not carry it.

A governance checklist will not carry it.

The work requires its own canonical output artifact: the Agent Behavior Specification, or ABS.

Within this practice, the ABS is canonical because it is the authoritative record of intended workflow-level behavior.

That does not mean it replaces model specifications, agent configurations, technical control policies, evaluation frameworks, governance records, monitoring systems, or audit evidence.

It defines the institutional intent those mechanisms are expected to implement, test, enforce, or inspect.

The Agent Behavior Specification defines seven things: the workflow objective, consequential decisions, behavioral directives, authority boundaries, escalation logic, refusal conditions, and accountability structure.

The accountability structure is expressed through the Agent Accountability Matrix.

Figure 3 — The Agent Behavior Specification: seven components

Workflow Objective defines what the agent is working to achieve. Consequential Decisions identifies the decisions through which the agent may influence institutional outcomes. Behavioral Directives define how the agent should behave in recurring material conditions. Authority Boundaries define what the agent may and may never do. Escalation Logic defines when and how the agent must transfer a matter. Refusal Conditions define when the agent must decline to proceed. Accountability Structure defines who acts, verifies, authorizes, bears accountability, and audits.
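The seven components can be pictured as a structured record rather than free-form prose. The following is a minimal sketch under stated assumptions: the class name, field names, and the use of plain string lists are illustrative choices, not a prescribed ABS schema.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBehaviorSpecification:
    """Illustrative container for the seven ABS components (field names are assumptions)."""
    version: str
    workflow_objective: str = ""
    consequential_decisions: list[str] = field(default_factory=list)
    behavioral_directives: list[str] = field(default_factory=list)
    authority_boundaries: list[str] = field(default_factory=list)
    escalation_logic: list[str] = field(default_factory=list)
    refusal_conditions: list[str] = field(default_factory=list)
    accountability_structure: list[str] = field(default_factory=list)

    def missing_components(self) -> list[str]:
        """Return the names of components that have not been authored yet."""
        return [
            name
            for name, value in vars(self).items()
            if name != "version" and not value
        ]


# A hypothetical early draft with only two components authored.
draft = AgentBehaviorSpecification(
    version="0.1-draft",
    workflow_objective="Identify deviations from approved positions and propose redlines.",
    authority_boundaries=["May not accept contractual language on behalf of the institution."],
)
print(draft.missing_components())
```

A record of this kind makes an incomplete specification visible: the draft above still lacks five of the seven components.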

The ABS is not a substitute for organizational accountability, governance, or technical control.

It does not create authority, incentives, professional responsibility, institutional competence, or enforcement.

Its function is narrower, but necessary.

It makes intended behavior and consequential decision allocations explicit before they are embedded in the system. It gives engineering, evaluation, governance, and accountable owners a declared intent against which the implementation can be designed and judged.

What changes in practice

Consider a contract-review agent whose objective is to identify deviations from approved positions and propose redlines.

A conventional requirement might describe the documents the agent receives, the jurisdictions it supports, the source material it can access, and the expected accuracy of its analysis.

The ABS addresses a different set of decisions.

The agent may identify and explain a contractual deviation. It may compare the provision against an approved playbook. It may propose alternative language.

It may not accept contractual language on behalf of the institution.

It may not send a redline to an external counterparty without authorization.

It must escalate specified categories, such as non-standard indemnity, governing-law, limitation-of-liability, or data-transfer provisions.

A named legal owner authorizes external action and bears accountability for that authorization.

These are not abstract policy statements.

They directly shape the technical harness.

External communication tools remain unavailable until approval is recorded. Certain clause categories trigger escalation. The system preserves the source clause, proposed change, applicable behavioral directive, reason for escalation, and human authorization.

The specification has not merely documented the system after it was designed.

It has determined what the system must technically permit, prevent, and make inspectable.
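To illustrate how these directives could shape a harness, the sketch below gates the external-send step on escalation and recorded authorization, and builds the evidence record described above. It is a simplified assumption: the category identifiers, the `ProposedRedline` fields, and the step names are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass
from typing import Optional

# Clause categories named above as escalation examples (identifiers are assumptions).
ESCALATION_CATEGORIES = {"indemnity", "governing_law", "limitation_of_liability", "data_transfer"}


@dataclass
class ProposedRedline:
    source_clause: str
    proposed_change: str
    category: str
    directive_id: str                    # hypothetical ID of the applicable ABS directive
    non_standard: bool = False
    authorized_by: Optional[str] = None  # named legal owner, set once approval is recorded


def next_step(redline: ProposedRedline) -> str:
    """Decide what the harness allows for a proposed redline."""
    if redline.non_standard and redline.category in ESCALATION_CATEGORIES:
        return "escalate"              # transferred to a person; the agent stops here
    if redline.authorized_by is None:
        return "await_authorization"   # external communication tool stays unavailable
    return "send_external"


def audit_record(redline: ProposedRedline) -> dict:
    """Preserve the clause, change, directive, escalation reason, and authorization."""
    step = next_step(redline)
    return {
        "source_clause": redline.source_clause,
        "proposed_change": redline.proposed_change,
        "directive_id": redline.directive_id,
        "escalation_reason": f"non-standard {redline.category}" if step == "escalate" else None,
        "authorized_by": redline.authorized_by,
        "next_step": step,
    }


indemnity = ProposedRedline("Clause 9.2", "Cap indemnity at fees paid", "indemnity", "BD-3", non_standard=True)
routine = ProposedRedline("Clause 4.1", "Net 30 payment terms", "payment_terms", "BD-1")
print(next_step(indemnity), next_step(routine))
```

In this sketch the agent can always identify, explain, and propose. Only the external send depends on a recorded human authorization.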

Now consider an accounts-payable agent that validates invoices and prepares payments.

The agent may match an invoice against a purchase order, validate supporting evidence, identify an exception, and queue a payment for review.

It may not release funds when supplier banking details have recently changed.

It may not proceed when required evidence is missing.

It may not release a transaction above an approved boundary.

It must escalate suspicious duplication, unusual changes in payment destination, and exceptions that cannot be resolved from available evidence.

A named finance role authorizes the payment. The institution retains accountability for the operating policy, control environment, and boundaries within which the agent acts.

Again, the ABS does not create accountability.

It makes the allocation explicit and translates it into implementation requirements: restricted tool permissions, approval gates, evidence requirements, exception handling, intervention points, and an audit trail connecting the action to the person or institution authorized to approve it.
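The accounts-payable directives translate naturally into ordered checks. The sketch below is one possible reading, with assumptions stated plainly: the `APPROVED_LIMIT` value, the field names, and the ordering of the checks are hypothetical, and a real control environment would define them in its own policy.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

APPROVED_LIMIT = Decimal("10000")  # hypothetical approved boundary, for illustration only


@dataclass
class Payment:
    amount: Decimal
    evidence_complete: bool
    bank_details_recently_changed: bool
    suspected_duplicate: bool
    authorized_by: Optional[str] = None  # named finance role, set once approval is recorded


def decide(p: Payment) -> tuple[str, str]:
    """Return (outcome, reason) for a prepared payment."""
    if not p.evidence_complete:
        return ("refuse", "required evidence is missing")
    if p.suspected_duplicate:
        return ("escalate", "suspicious duplication")
    if p.bank_details_recently_changed:
        return ("escalate", "supplier banking details recently changed")
    if p.amount > APPROVED_LIMIT:
        return ("escalate", "amount exceeds the approved boundary")
    if p.authorized_by is None:
        return ("queue_for_review", "awaiting authorization by a named finance role")
    return ("release", f"authorized by {p.authorized_by}")


clean = Payment(Decimal("2500"), True, False, False)
changed_bank = Payment(Decimal("2500"), True, True, False, authorized_by="AP Controller")
print(decide(clean))
print(decide(changed_bank))
```

Note that the banking-details check precedes the authorization check, so a recorded approval does not bypass the escalation in this sketch.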

In both cases, the ABS does not replace the harness, governance, evaluation, or accountable leadership.

It ensures that those mechanisms do not operate against an unstated or retrospectively reconstructed intention.

The specification and the harness

A practical objection follows immediately: specifications decay.

They become detached from implementation, ignored by teams, or preserved as compliance theater.

Versioning alone does not solve that problem.

The ABS remains useful only when it maintains a direct relationship with implementation and verification.

Every material behavioral directive should map to at least one implementation or verification artifact. That mapping may resolve to a system instruction, tool permission, workflow state, approval gate, control policy, evaluation case, intervention point, evidence requirement, or audit event.

A directive with no implementation or verification mapping is incomplete.

A consequential implementation decision with no originating directive is unauthored behavior.

Traceability must therefore run in both directions.

A team should be able to move from an institutional decision in the ABS to the mechanism that implements and tests it.

It should also be able to move from a consequential implementation choice back to the behavioral directive that authorized it.

That is what prevents the ABS from becoming documentation theater.
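Bidirectional traceability can be checked mechanically once directives and artifacts carry identifiers. The following is a minimal sketch; the directive IDs, artifact names, and the shape of the mapping are assumptions made for the example.

```python
def trace_gaps(
    directive_ids: set[str],
    artifact_to_directives: dict[str, set[str]],
) -> tuple[set[str], set[str]]:
    """Return (incomplete directives, unauthored artifacts).

    A directive is incomplete when no artifact maps to it.
    An artifact is unauthored when it maps to no known directive.
    """
    mapped = set().union(*artifact_to_directives.values())
    incomplete = directive_ids - mapped
    unauthored = {
        artifact
        for artifact, ids in artifact_to_directives.items()
        if not (ids & directive_ids)
    }
    return incomplete, unauthored


# Hypothetical directives and implementation or verification artifacts.
directives = {"BD-1", "BD-2", "BD-3"}
artifacts = {
    "approval_gate.external_send": {"BD-1"},
    "eval.escalation_case_07": {"BD-2"},
    "tool.auto_accept_clause": set(),
}

incomplete, unauthored = trace_gaps(directives, artifacts)
print(incomplete)
print(unauthored)
```

Here BD-3 has no implementation or verification mapping, and the auto-accept tool traces back to no directive, which is the unauthored behavior described above.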

A second objection, often raised by engineers, deserves a direct answer.

You cannot fully determine every behavioral detail without considering the proposed technical harness, because behavior and capability are partly entangled.

That is correct.

Pre-runtime does not mean writing a complete specification in isolation from technical architecture.

The ABS should be authored during pre-implementation design alongside the design of the harness: the tools the agent may access, the workflow states through which it moves, the approval gates it requires, the evidence it must retain, the controls it needs, and the interventions the system must support.

Behavioral intent and technical feasibility inform one another during this design phase.

What must not be deferred is the institutional decision.

Before engineering implementation begins, the cross-functional team should have agreed on the workflow objective, consequential decisions, behavioral directives, authority boundaries, escalation logic, refusal conditions, and accountability allocation that the harness will be required to implement.

The ABS defines those behavioral requirements.

The harness design determines how those requirements can be technically realized.

Engineering implements against the Agent Behavior Specification.

An ABS is implementation-ready only when every material directive has a defined implementation or verification mapping.

Writing the document is not the completion criterion.

For a release, completion requires an approved ABS version, implementation mappings for its material directives, and evidence that the proposed controls and behaviors have been tested against those directives.
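The three completion conditions can be expressed as a release gate. This is a sketch under assumptions: the `ReleaseCandidate` fields and the blocker messages are hypothetical, and the check says nothing about the quality of the underlying tests.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseCandidate:
    abs_version: str
    abs_approved: bool
    material_directives: set[str]
    mapped_directives: set[str] = field(default_factory=set)  # have an implementation mapping
    tested_directives: set[str] = field(default_factory=set)  # have conformance test evidence


def release_blockers(rc: ReleaseCandidate) -> list[str]:
    """List every reason the release is not yet complete against its ABS version."""
    blockers = []
    if not rc.abs_approved:
        blockers.append(f"ABS {rc.abs_version} is not approved")
    for d in sorted(rc.material_directives - rc.mapped_directives):
        blockers.append(f"{d}: no implementation or verification mapping")
    for d in sorted(rc.material_directives - rc.tested_directives):
        blockers.append(f"{d}: no conformance test evidence")
    return blockers


candidate = ReleaseCandidate(
    abs_version="1.2",
    abs_approved=True,
    material_directives={"BD-1", "BD-2"},
    mapped_directives={"BD-1", "BD-2"},
    tested_directives={"BD-1"},
)
print(release_blockers(candidate))
```

An empty list is the completion signal. Any entry names the directive that still lacks a mapping or test evidence.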

The ABS establishes intended behavior.

Conformance testing examines whether the implementation realizes it.

Runtime evidence may later reveal circumstances the specification failed to anticipate. That is expected in probabilistic systems.

But runtime discovery must not silently rewrite institutional intent.

When operational evidence exposes a new condition, the affected behavior returns to pre-runtime design. The ABS is revised, reviewed, and approved as a new version before the changed intent is implemented in a subsequent release.

Each release therefore has an inspectable behavioral baseline: the approved ABS version against which its prompts, permissions, workflow states, controls, evaluations, and evidence requirements were implemented.

When behavior changes because of a model update, new tool, modified prompt, revised workflow, or expanded context, the specification provides the basis for determining whether the change is acceptable, material, or inconsistent with approved intent.

Accountability that cannot rest on an agent

Accountability in agentic systems cannot be reduced to a generic responsibility matrix, because the actor is no longer always a person.

Every consequential action requires an explicit allocation of five roles: who acts, who verifies, who authorizes, who bears accountability, and who audits.

Inside the ABS, this allocation is expressed through the Agent Accountability Matrix.

The matrix has one governing rule:

An agent may act, perform verification steps, or support an audit. It may not independently authorize a consequential action or bear accountability. Authorization and accountability must resolve to a named human or institution.

An agent operating within previously approved boundaries is executing an authorized policy.

It is not independently authorizing the boundary or the institutional authority behind it.
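The governing rule is simple enough to validate automatically for each consequential action. The sketch below assumes a hypothetical `Party` type with a `kind` label. The names and example roles are illustrative and not a prescribed matrix format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    name: str
    kind: str  # "agent", "human", or "institution"


@dataclass(frozen=True)
class Allocation:
    """One row of the matrix: the five roles for a single consequential action."""
    action: str
    acts: Party
    verifies: Party
    authorizes: Party
    accountable: Party
    audits: Party


def rule_violations(a: Allocation) -> list[str]:
    """Check the two roles that must resolve to a named human or institution."""
    problems = []
    if a.authorizes.kind == "agent":
        problems.append("authorization must resolve to a named human or institution")
    if a.accountable.kind == "agent":
        problems.append("accountability must never resolve to an agent")
    return problems


ap_agent = Party("accounts-payable agent", "agent")
controller = Party("Finance Controller", "human")
institution = Party("The institution", "institution")

valid = Allocation("release payment", ap_agent, ap_agent, controller, institution, institution)
invalid = Allocation("release payment", ap_agent, ap_agent, ap_agent, ap_agent, institution)
print(rule_violations(valid))
print(rule_violations(invalid))
```

The check deliberately permits an agent in the acting and verifying roles, since the rule restricts only authorization and accountability.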

Figure 4 — The Agent Accountability Matrix

Acts identifies who performs the action. An agent may hold this role. Verifies identifies who checks the action, evidence, or output. An agent may perform defined verification steps. Authorizes identifies who approves a consequential action or the operating boundary within which it may occur. This must resolve to a named human or institution. Bears Accountability identifies who owns the institutional consequence. This must never resolve to an agent. Audits identifies who reviews what occurred after the fact. An agent may support an audit, but it does not own the audit's institutional judgment.

Agent Accountability Matrix, five roles:

  • Acts (MAY): who performs the action.
  • Verifies (MAY): who checks the action, evidence, or output.
  • Authorizes (NEVER ALONE): who approves a consequential action or operating boundary.
  • Bears Accountability (NEVER): who owns the institutional consequence.
  • Audits (MAY SUPPORT): who reviews what occurred after the fact.

Authorization and accountability must resolve to named humans or institutions.

The Agent Accountability Matrix does not create accountability.

Organizational authority, professional obligations, governance, incentives, and enforcement do that.

The matrix creates an explicit and inspectable allocation that can be implemented, reviewed, challenged, and audited.

A named accountable owner is meaningful only when the surrounding institution gives that owner the authority, information, time, competence, and means to intervene.

The matrix can expose the absence of those conditions.

It cannot supply them.

Naming a human is therefore not enough.

If the person lacks the necessary context, authority, capacity, or intervention mechanism, accountability becomes ceremonial.

Real oversight sometimes requires deliberate friction: a moment of engagement placed exactly where judgment must occur, so that authorization becomes a considered act rather than a reflexive click.

The amount and placement of that friction should depend on consequence, reversibility, and autonomy.

Not every action requires real-time human approval.

But every consequential action must occur within boundaries that were explicitly authorized, remain inspectable, and resolve to accountable human or institutional authority.

Accountability in multi-agent systems

The governing rule holds in multi-agent systems without exception.

When one agent passes information or a task to another agent, accountability does not transfer between them.

One agent may retrieve information. Another may classify it. A third may verify a condition. A fourth may recommend or execute an action within an approved boundary.

None of them bears accountability.

No matter how long the chain becomes, accountability never comes to rest on an agent.

In high-frequency and multi-agent systems, accountability may resolve to the named humans or institutions that approved the operating boundaries, controls, policies, and conditions under which the agents act.

That does not mean every action requires real-time human authorization.

It means the system must preserve a defensible relationship between the action taken and the people or institutions that authorized the boundaries within which it was permitted.
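One way to preserve that relationship is to record, for every agent step, the approved boundary it ran under, and to resolve each boundary to its named approver. The sketch below is an assumption-laden illustration: the boundary IDs, approver titles, and `AgentStep` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentStep:
    agent: str
    function: str
    boundary_id: str  # hypothetical ID of the approved boundary the step ran under


def accountable_parties(chain: list[AgentStep], approvers: dict[str, str]) -> list[str]:
    """Map every step to the named approver of its boundary, never to an agent."""
    owners = []
    for step in chain:
        owner = approvers.get(step.boundary_id)
        if owner is None:
            raise LookupError(
                f"{step.agent} acted under unapproved boundary {step.boundary_id!r}"
            )
        owners.append(owner)
    return owners


# Hypothetical boundaries and approvers, for illustration only.
approvers = {"B-1": "Head of Claims Operations", "B-2": "Chief Risk Officer"}
chain = [
    AgentStep("Agent A", "retrieve", "B-1"),
    AgentStep("Agent B", "classify", "B-1"),
    AgentStep("Agent C", "verify", "B-2"),
    AgentStep("Agent D", "execute", "B-2"),
]
print(accountable_parties(chain, approvers))
```

However long the chain grows, the lookup returns people or institutions. A step with no approved boundary raises an error instead of defaulting to the agent.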

Figure 5 — Accountability in multi-agent systems

The figure shows a chain of agents performing distinct functions. Agent A receives or retrieves the relevant information. Agent B interprets or classifies it. Agent C verifies the applicable conditions or evidence. Agent D recommends or executes an action within an approved boundary. The governing statement beneath the chain reads: Accountability does not transfer along the agent chain. The chain resolves to named humans and institutions that approved the boundaries within which the agents operate.

Accountability does not belong to an abstract system, model, agent collective, or technical architecture.

The system may distribute action.

It cannot distribute accountability into nonexistence.

Conduct

There is one dimension of behavior the old operating model leaves almost entirely unowned: conduct.

Behavior is what the agent does.

Conduct is how it carries itself while doing it.

Conduct determines how the agent defers, how it escalates, how it communicates uncertainty, how it makes the limits of its authority legible, how it asks for help, and how it behaves when it is wrong.

Trust in these systems is asymmetric.

It develops slowly and can collapse after a single confident error.

How an agent responds after a mistake may matter as much as how it performs when it is correct.

  • Does it acknowledge uncertainty?
  • Does it explain what evidence is missing?
  • Does it stop when its authority ends?
  • Does it distinguish fact from inference?
  • Does it make escalation visible?
  • Does it preserve the person's ability to intervene?
  • Does it reveal that a prior conclusion may no longer be reliable?
  • Does it recover without pretending the failure never occurred?

Conduct is not a decorative layer applied on top of behavior.

It is a dimension of behavior and one of the dimensions most consequential to trust.

Design has a distinctive contribution to make because it has long shaped how systems communicate choices, limits, uncertainty, recovery, and consequence to people.

But conduct cannot belong to design alone, any more than it can belong only to engineering, product, operations, legal, risk, or domain teams.

It must become a shared pre-runtime discipline.

Conduct should therefore be specified in the ABS wherever it materially affects a person's ability to understand the agent's authority, evaluate its output, intervene in its action, or recover from its failure.

A directive such as "communicate uncertainty" is too vague to implement.

The specification must identify the material conditions under which uncertainty becomes relevant, what the agent must disclose, what it must not imply, when it must stop, and when it must escalate.
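The difference between a vague and an implementable conduct directive can be shown with a structured record holding the five elements just listed. This is only a sketch. The `UncertaintyDirective` fields and the example wording are assumptions, not text from an actual specification.

```python
from dataclasses import dataclass, fields


@dataclass
class UncertaintyDirective:
    trigger_condition: str     # when uncertainty becomes material
    must_disclose: str         # what the agent must tell the person
    must_not_imply: str        # what the agent must avoid suggesting
    stop_condition: str        # when the agent must stop
    escalation_condition: str  # when the agent must escalate


def unspecified_parts(directive: UncertaintyDirective) -> list[str]:
    """Return the elements that are still blank and therefore not implementable."""
    return [f.name for f in fields(directive) if not getattr(directive, f.name).strip()]


vague = UncertaintyDirective("", "communicate uncertainty", "", "", "")
specific = UncertaintyDirective(
    trigger_condition="The clause matches no entry in the approved playbook.",
    must_disclose="That no approved position was found, and which sources were checked.",
    must_not_imply="That the proposed language has been approved.",
    stop_condition="Do not propose alternative language for this clause.",
    escalation_condition="Route the clause to the named legal owner.",
)
print(unspecified_parts(vague))
print(unspecified_parts(specific))
```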

This is where pre-runtime agent behavior design moves beyond policy language.

It translates institutional expectations of conduct into implementable behavioral requirements.

Anchored Agency

This is the layer I have been working to define.

Anchored Agency is the practice of pre-runtime agent behavior design for consequential and fiduciary-grade workflows. Its canonical output artifact is the Agent Behavior Specification, which defines intended agent behavior before engineering implementation and before runtime.

The Agent Accountability Matrix sits inside the specification, making the allocation of action, verification, authorization, accountability, and audit explicit.

Product, design, engineering, operations, and domain teams author the intended behavior together, with risk and legal reviewing applicable boundaries.

Engineering implements against the approved Agent Behavior Specification.

Anchored Agency does not replace governance, technical controls, evaluation, monitoring, or audit.

It gives those mechanisms an explicit behavioral intent to implement, test, enforce, and inspect.

Governance establishes authority and enforcement.

The harness implements controls.

Evaluation tests conformance.

Monitoring and audit provide operational evidence.

The ABS performs the upstream function none of them can perform retrospectively: it records what the system was deliberately intended and permitted to do before that behavior was implemented.

This distinction matters.

A runtime control can prevent an agent from calling a tool under a defined condition.

It cannot decide, by itself, whether the institution should permit that action.

An evaluation can reveal that an agent failed to escalate.

It cannot determine, by itself, which conditions should require escalation.

A monitoring system can show that an agent acted repeatedly within a particular boundary.

It cannot establish whether that boundary was ever deliberately authorized.

The pre-runtime decision must exist before these mechanisms can implement or evaluate it.

That is the role of the Agent Behavior Specification.

The purpose of Anchored Agency is not to produce more documentation.

It is to ensure that consequential agent behavior is deliberately authored, allocated, made implementation-ready, and preserved as institutional intent before engineering turns it into a system in operation.

The premise can be stated in one sentence:

You can only stand behind an autonomous system whose behavior you explicitly designed.

Author
Arash Nourkeyhani

Published
June 2026 · Version 1.0

Citation
Nourkeyhani, Arash. "Designing Agent Behavior Before Runtime: The missing discipline between institutional intent and agent implementation." Anchored Agency, June 2026.

The views expressed are the author's own.