Workflow guards and approval steps are control points that keep AI agents from taking unsafe, unauthorized, or low-quality actions. They define what an agent is allowed to do, when a workflow must stop, and when a human or policy system must approve the next step.
These controls matter most when agents use tools, access private data, update systems, send messages, or make recommendations that affect real users.
Short Answer
Workflow guards are checks that enforce rules before, during, or after an agent step. Approval steps are pauses in the workflow where a human, policy engine, or trusted service must approve an action before it executes.
Together, they help agents operate safely by:
- blocking unsafe inputs
- limiting tool access
- validating outputs
- enforcing permissions
- routing risky actions to review
- preventing invalid state transitions
- recording decisions for audit
- supporting rollback or compensation when needed
Why Guards Are Needed
AI agents can plan, choose tools, retrieve context, and act across systems. That flexibility is useful, but it also creates risk.
An agent may misunderstand a request, retrieve the wrong context, choose an inappropriate tool, produce an unsafe output, or continue a workflow after conditions have changed.
Workflow guards turn broad autonomy into bounded autonomy.
Guards vs Evals vs Approvals
These terms are related, but they do different jobs.
Guards enforce rules. They block, redact, constrain, or route a workflow.
Evals measure quality, safety, factuality, policy fit, or task success. Their results may feed guards or approval decisions.
Approvals pause the workflow until a human or trusted service authorizes the next action.
A strong agent workflow usually uses all three.
Pre-Model Guards
Pre-model guards run before the user request or workflow state reaches the model.
They can:
- reject prohibited requests
- detect prompt injection attempts
- redact sensitive data
- check user permissions
- classify task risk
- limit allowed tools for the run
- route requests to deterministic workflows
Pre-model guards reduce wasted work and prevent unsafe context from entering the agent loop.
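As a rough illustration of the checks listed above, a pre-model guard could look like the sketch below. The regular expressions, the role name `support_agent`, and the tool names are hypothetical placeholders, and a production system would likely use a maintained classifier rather than a single pattern.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
]
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class PreModelResult:
    allowed: bool
    reason: str
    sanitized_text: str = ""
    allowed_tools: list[str] = field(default_factory=list)


def pre_model_guard(text: str, user_roles: set[str]) -> PreModelResult:
    """Run cheap checks before any text reaches the model."""
    # Reject requests that look like prompt injection attempts.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            return PreModelResult(False, "possible prompt injection")
    # Redact email addresses so they never enter the model context.
    sanitized = EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)
    # Limit the tools available for this run based on the caller's roles.
    tools = ["search_docs"]
    if "support_agent" in user_roles:
        tools.append("read_ticket")
    return PreModelResult(True, "ok", sanitized, tools)
```

The guard returns the sanitized text and the tool list together so the orchestrator never has to pass the raw request onward.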
Tool Guards
Tool guards control what tools an agent can use and how it can use them.
Important tool guard patterns include:
- read/write tool separation
- least-privilege tool scopes
- schema validation for tool inputs
- permission checks at execution time
- rate limits
- domain allowlists
- dry-run mode for risky operations
- approval before state-changing actions
The agent can propose a tool call, but the system should still validate whether that call is allowed.
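One way to validate a proposed tool call outside the model is sketched below. The tool registry, tool names, and the `approved` flag are assumptions made for the example; the point is that the check runs at execution time regardless of what the model proposed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    writes_state: bool          # separates read tools from write tools
    required_args: frozenset


# Hypothetical tool registry for illustration.
TOOLS = {
    "read_ticket": ToolSpec("read_ticket", False, frozenset({"ticket_id"})),
    "update_ticket": ToolSpec("update_ticket", True, frozenset({"ticket_id", "status"})),
}


def check_tool_call(name: str, args: dict, granted_scopes: set,
                    approved: bool = False) -> tuple:
    """Return (allowed, reason) for a tool call proposed by the agent."""
    spec = TOOLS.get(name)
    if spec is None:
        return False, "unknown tool"
    # Least privilege: the run must have been granted this tool explicitly.
    if name not in granted_scopes:
        return False, "tool not in granted scopes"
    # Minimal schema validation: every required argument must be present.
    missing = spec.required_args - set(args)
    if missing:
        return False, f"missing arguments: {sorted(missing)}"
    # State-changing tools need a recorded approval before they execute.
    if spec.writes_state and not approved:
        return False, "approval required for state-changing tool"
    return True, "ok"
```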
Post-Model Guards
Post-model guards run after the model generates an output but before the output is shown to a user or used by another system.
They can check for:
- format errors
- missing citations
- unsupported claims
- policy violations
- PII leakage
- unsafe instructions
- brand or tone issues
- invalid tool arguments
A post-model guard may allow the output, reject it, request a retry, send it to human review, or fall back to a safer response.
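A minimal post-model guard might combine a format check, a leakage check, and a citation check, as in the sketch below. It assumes the model returns JSON with an `answer` field and cites sources as `[1]`, `[2]`, and so on; the digit pattern is a crude stand-in for a real PII detector.

```python
import json
import re

CITATION = re.compile(r"\[\d+\]")            # assumed citation style: [1], [2], ...
CARD_NUMBER = re.compile(r"\b\d{13,16}\b")   # crude stand-in for a PII detector


def post_model_guard(raw_output: str) -> str:
    """Return 'allow', 'retry', or 'reject' for a model output."""
    # Format check: the output must be a JSON object with a string answer.
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError:
        return "retry"
    answer = payload.get("answer") if isinstance(payload, dict) else None
    if not isinstance(answer, str):
        return "retry"
    # Leakage check: never show something that looks like a card number.
    if CARD_NUMBER.search(answer):
        return "reject"
    # Citation check: a missing citation is fixable, so ask for a retry.
    if not CITATION.search(answer):
        return "retry"
    return "allow"
```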
Workflow State Guards
State guards prevent invalid workflow transitions.
For example, an agent should not move a workflow from "drafting" directly to "executed" if the workflow requires approval. It should not retry a canceled workflow. It should not call a write tool after a deadline has expired.
State guards make agent workflows predictable even when the model proposes an invalid next step.
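A state guard can be as small as an explicit transition table. The sketch below uses assumed state names and covers the first two examples above (skipping approval and reviving a canceled workflow); a deadline check would be an additional condition.

```python
# Assumed workflow states; each maps to the states it may move to next.
ALLOWED_TRANSITIONS = {
    "drafting": {"pending_approval", "canceled"},
    "pending_approval": {"approved", "rejected", "canceled"},
    "approved": {"executed", "canceled"},
    "rejected": {"drafting", "canceled"},
    "executed": set(),   # terminal
    "canceled": set(),   # terminal: a canceled workflow cannot be retried
}


class InvalidTransition(Exception):
    """Raised when a proposed next state is not allowed."""


def transition(current: str, proposed: str) -> str:
    """Return the new state, or raise if the move is not in the table."""
    if proposed not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {proposed} is not allowed")
    return proposed
```

Because the table is data, it can be reviewed and tested separately from any prompt.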
Approval Steps
An approval step pauses the workflow until an authorized reviewer or policy system approves the next action.
Approval steps are useful before:
- sending external messages
- changing customer records
- executing financial actions
- publishing content
- changing configuration
- deleting data
- using sensitive tools
- escalating or closing cases
The workflow should store the approval request, reviewer, decision, timestamp, and any modifications.
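The stored fields could be modeled as a small record, as sketched below. The field names and the `decide` helper are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ApprovalRecord:
    request_id: str
    proposed_action: dict
    reviewer: Optional[str] = None
    decision: str = "pending"             # pending | approved | rejected
    decided_at: Optional[str] = None      # ISO 8601 timestamp in UTC
    modifications: Optional[dict] = None  # reviewer edits to the proposed action


def decide(record: ApprovalRecord, reviewer: str, approve: bool,
           modifications: Optional[dict] = None) -> ApprovalRecord:
    """Record a reviewer's decision exactly once."""
    if record.decision != "pending":
        raise ValueError("approval already decided")
    record.reviewer = reviewer
    record.decision = "approved" if approve else "rejected"
    record.decided_at = datetime.now(timezone.utc).isoformat()
    record.modifications = modifications
    return record
```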
Risk-Based Approval
Not every action needs human review.
Use risk-based approval so low-risk actions continue automatically while high-risk actions pause.
Risk factors may include:
- external visibility
- financial impact
- data sensitivity
- irreversibility
- customer impact
- low confidence score
- policy ambiguity
- missing evidence
- unusual user behavior
This keeps workflows efficient without giving agents unnecessary freedom.
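One simple way to express this is a weighted score with a threshold, as in the sketch below. The flag names, weights, and threshold are invented for illustration and would need to be tuned per workflow.

```python
# Hypothetical weights for illustration; real values depend on the workflow.
RISK_WEIGHTS = {
    "externally_visible": 2,
    "financial_impact": 3,
    "sensitive_data": 3,
    "irreversible": 4,
    "low_confidence": 2,
    "missing_evidence": 2,
}
APPROVAL_THRESHOLD = 4  # assumed cutoff


def risk_score(flags: set) -> int:
    """Sum the weights of all risk flags raised for an action."""
    # Unknown flags raise KeyError so a typo fails loudly instead of scoring 0.
    return sum(RISK_WEIGHTS[flag] for flag in flags)


def needs_approval(flags: set) -> bool:
    """Pause for review once the combined risk reaches the threshold."""
    return risk_score(flags) >= APPROVAL_THRESHOLD
```

With these assumed weights, a single externally visible action continues automatically, while an irreversible action always pauses.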
Policy Checks
Policy checks encode business rules, compliance rules, and operational constraints.
Examples:
- Do not refund above a threshold without approval.
- Do not expose customer data outside the requester's permission scope.
- Do not send legal or medical advice without review.
- Do not execute a deployment during a freeze window.
- Do not recommend an action without supporting evidence.
Policy checks should be explicit services or rules where possible, not hidden inside a prompt.
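Two of the example rules above could be written as plain functions, as in this sketch. The refund limit and the freeze window dates are made-up values; amounts are held in integer cents to avoid floating-point comparisons.

```python
from datetime import datetime

# Hypothetical values for illustration.
REFUND_AUTO_LIMIT_CENTS = 10_000
FREEZE_WINDOWS = [(datetime(2025, 12, 20), datetime(2026, 1, 2))]


def refund_policy(amount_cents: int) -> str:
    """Refunds above the limit need approval; others proceed."""
    if amount_cents <= REFUND_AUTO_LIMIT_CENTS:
        return "allow"
    return "require_approval"


def deployment_policy(now: datetime) -> str:
    """Block deployments that fall inside a freeze window (end exclusive)."""
    for start, end in FREEZE_WINDOWS:
        if start <= now < end:
            return "block"
    return "allow"
```

Rules written this way can be unit-tested and versioned, which is harder to do for instructions buried in a prompt.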
Evidence Requirements
Some workflow guards should require evidence before an agent can continue.
For example, a support agent may need a cited policy source before answering a billing question. An incident agent may need logs and deployment records before recommending a rollback. A compliance agent may need the relevant clause before flagging a contract risk.
Evidence guards reduce unsupported decisions.
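An evidence guard can be sketched as a lookup of required evidence types per task. The task names and evidence keys below mirror the three examples above but are otherwise hypothetical.

```python
# Assumed mapping from task type to the evidence it must carry.
REQUIRED_EVIDENCE = {
    "billing_answer": {"policy_source"},
    "rollback_recommendation": {"logs", "deployment_records"},
    "contract_risk_flag": {"contract_clause"},
}


def missing_evidence(task_type: str, evidence: dict) -> set:
    """Return the evidence types still missing for this task."""
    required = REQUIRED_EVIDENCE[task_type]
    # Blank strings do not count as evidence.
    provided = {kind for kind, value in evidence.items() if value.strip()}
    return required - provided
```

An empty result means the agent may continue; anything else maps naturally to a "require more evidence" outcome.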
Approval UI Requirements
A reviewer should see enough context to make a decision.
An approval interface should show:
- the original request
- the agent's proposed action
- supporting evidence
- risk level
- policy checks
- tool inputs
- expected side effects
- rollback or cancellation options
Approval should not be a blind yes-or-no button.
Audit Records
Approvals and guard decisions should be durable.
Store:
- which guard ran
- what it checked
- input summary
- decision
- reason
- reviewer identity when applicable
- timestamp
- workflow state before and after
This supports debugging, compliance, and incident review.
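As one possible shape for such a record, the sketch below serializes the fields listed above into a single JSON line. The function name and key names are assumptions; where the line is written (database, log stream) is left open.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_entry(guard: str, checked: str, input_summary: str, decision: str,
                reason: str, state_before: str, state_after: str,
                reviewer: Optional[str] = None) -> str:
    """Serialize one guard or approval decision as a JSON line."""
    return json.dumps({
        "guard": guard,
        "checked": checked,
        "input_summary": input_summary,
        "decision": decision,
        "reason": reason,
        "reviewer": reviewer,  # None when no human was involved
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "state_before": state_before,
        "state_after": state_after,
    }, sort_keys=True)
```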
Guard Outcomes
A guard should return a structured outcome.
Common outcomes include:
- allow
- block
- redact
- retry_with_feedback
- route_to_human
- require_more_evidence
- require_approval
- cancel_workflow
Structured outcomes are easier to orchestrate than free-form explanations.
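The outcomes above map naturally onto an enum. The sketch below also shows one way to combine several guard results by taking the most restrictive one; the severity ordering and the fail-closed default for an empty list are assumptions, not something the list above prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(str, Enum):
    ALLOW = "allow"
    REDACT = "redact"
    RETRY_WITH_FEEDBACK = "retry_with_feedback"
    REQUIRE_MORE_EVIDENCE = "require_more_evidence"
    REQUIRE_APPROVAL = "require_approval"
    ROUTE_TO_HUMAN = "route_to_human"
    BLOCK = "block"
    CANCEL_WORKFLOW = "cancel_workflow"


# Assumed ordering: members are declared from least to most restrictive.
SEVERITY = list(Outcome)


@dataclass(frozen=True)
class GuardResult:
    guard: str
    outcome: Outcome
    reason: str = ""


def most_restrictive(results: list) -> Outcome:
    """Combine guard results; fail closed if no guard ran at all."""
    return max((r.outcome for r in results), key=SEVERITY.index,
               default=Outcome.BLOCK)
```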
Behavior Shaping
Some guards trigger corrective loops instead of immediately blocking.
For example, an evaluator may detect that an answer lacks citations. The workflow can retry the generation step with feedback that citations are required. If the second attempt still fails, the workflow can route to human review or return a safer fallback.
Correction loops should always be bounded, for example by a maximum retry count, so a failing step cannot loop indefinitely.
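The citation example can be sketched as a loop with a fixed attempt budget. Here `generate` and `check` are placeholder callables supplied by the caller: `generate` receives the previous feedback (or `None`), and `check` returns `None` when the draft passes.

```python
from typing import Callable, Optional


def generate_with_retries(generate: Callable[[Optional[str]], str],
                          check: Callable[[str], Optional[str]],
                          max_attempts: int = 2) -> tuple:
    """Retry generation with feedback, then escalate when the budget is spent."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback = None
    draft = ""
    for _ in range(max_attempts):
        draft = generate(feedback)
        feedback = check(draft)  # None means the draft passed the guard
        if feedback is None:
            return "allow", draft
    # Budget exhausted: hand the last draft to a human instead of looping.
    return "route_to_human", draft
```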
Rollback and Compensation
Approval steps reduce risk, but they do not remove it.
When an agent performs a state-changing action, the workflow should know whether the action can be rolled back or compensated.
Examples:
- restore a prior configuration
- reopen a ticket
- send a correction message
- cancel a scheduled action
- create a compensating transaction
High-risk actions should include recovery planning before approval.
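One way to make recovery explicit is to pair every state-changing action with its compensation and undo completed steps in reverse order when a later step fails. This is a saga-style sketch under assumed names, not a full transaction manager.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReversibleAction:
    name: str
    execute: Callable[[], None]
    compensate: Callable[[], None]  # undoes or offsets the effect of execute


def run_with_compensation(actions: list) -> bool:
    """Run actions in order; on failure, compensate completed ones in reverse."""
    completed = []
    try:
        for action in actions:
            action.execute()
            completed.append(action)
    except Exception:
        for action in reversed(completed):
            action.compensate()
        return False
    return True
```

Requiring a `compensate` callable up front forces the recovery question to be answered before the action is ever approved.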
Circuit Breakers
A circuit breaker blocks a class of actions when the system detects repeated failures or unsafe conditions.
For example, if a tool is returning bad data, a dependency is unhealthy, or a guard is rejecting or erroring at a high rate, the workflow can stop automatic execution and require review.
Circuit breakers prevent agents from amplifying operational incidents.
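A minimal circuit breaker might track recent results in a sliding window, as sketched below. The window size, failure limit, and manual reset are assumptions chosen for the example.

```python
from collections import deque


class CircuitBreaker:
    """Stop automatic execution after too many recent failures."""

    def __init__(self, window: int = 10, max_failures: int = 3):
        self._recent = deque(maxlen=window)  # sliding window of results
        self._max_failures = max_failures
        self.open = False

    def record(self, success: bool) -> None:
        self._recent.append(success)
        if self._recent.count(False) >= self._max_failures:
            self.open = True  # stays open until someone resets it

    def allow_automatic(self) -> bool:
        return not self.open

    def reset(self) -> None:
        """Called after review, once the underlying problem is resolved."""
        self._recent.clear()
        self.open = False
```

Keeping the breaker open until an explicit reset means a brief run of successes cannot silently resume automatic execution.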
Security Considerations
Guards should not rely only on model obedience.
Enforce security outside the model with identity checks, permission checks, scoped credentials, tenant isolation, secret redaction, and data access filters.
Also treat retrieved content as untrusted input. External documents, emails, tickets, and web pages may contain instructions designed to manipulate the agent.
Observability
Guard and approval behavior should be visible in traces.
Track:
- guard name
- guard version
- decision
- policy reason
- approval status
- state transition
- retry count
- final outcome
This makes it easier to tune policies and investigate failures.
Example: Support Reply Approval
A support agent drafts a refund reply.
The workflow may apply these guards:
- Check whether the agent can access the customer account.
- Retrieve the refund policy and require the draft to cite it.
- Check whether the refund amount is below the automatic-approval threshold.
- Redact unnecessary payment details.
- Require manager approval if the amount exceeds that threshold.
- Record the final approval before sending.
The agent writes the draft, but the workflow controls execution.
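The guard sequence for this example could be chained roughly as follows. The threshold, field names, and card-number pattern are hypothetical, and recording the final approval is left to the surrounding workflow.

```python
import re
from dataclasses import dataclass

AUTO_REFUND_LIMIT_CENTS = 5_000                 # hypothetical threshold
CARD_DIGITS = re.compile(r"\b\d{12}(\d{4})\b")  # crude 16-digit card pattern


@dataclass
class RefundDraft:
    customer_id: str
    amount_cents: int
    reply_text: str
    policy_citation: str = ""


def review_refund_draft(draft: RefundDraft, accessible_accounts: set,
                        manager_approved: bool = False) -> tuple:
    """Return (outcome, text that is safe to send)."""
    # 1. The agent must be allowed to access this customer account.
    if draft.customer_id not in accessible_accounts:
        return "block", ""
    # 2. The draft must cite the refund policy.
    if not draft.policy_citation:
        return "require_more_evidence", ""
    # 3. Redact payment details, keeping only the last four digits.
    safe_text = CARD_DIGITS.sub(r"****\1", draft.reply_text)
    # 4. Amounts above the automatic threshold need manager approval.
    if draft.amount_cents > AUTO_REFUND_LIMIT_CENTS and not manager_approved:
        return "require_approval", safe_text
    return "allow", safe_text
```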
Example: Configuration Change Approval
An operations agent proposes a configuration change.
The workflow may require:
- dry-run output
- affected service list
- rollback plan
- maintenance window check
- approval from an on-call engineer
- post-change validation
This is safer than allowing the model to directly execute the change.
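The pre-execution requirements could be enforced with a small completeness check like the one below. The request keys are assumed names, and post-change validation is excluded because it runs after the change is applied.

```python
# Assumed keys on a change request; all must be present and truthy.
REQUIRED_ITEMS = (
    "dry_run_output",
    "affected_services",
    "rollback_plan",
    "in_maintenance_window",
    "on_call_approver",
)


def change_blockers(request: dict) -> list:
    """Return the requirements that still block execution, in checklist order."""
    return [item for item in REQUIRED_ITEMS if not request.get(item)]
```

An empty list means the change may proceed; otherwise the workflow can show the reviewer exactly what is missing.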
Common Mistakes
- Putting all guard logic inside the prompt.
- Allowing agents to approve their own high-risk actions.
- Using human approval without showing enough evidence.
- Skipping permission checks on retries.
- Not recording guard decisions.
- Failing to separate read tools from write tools.
- Routing every minor action to humans and slowing the workflow unnecessarily.
- Not testing blocked, rejected, and escalated paths.
Design Checklist
- Define risk levels for workflow actions.
- Add pre-model, tool, post-model, and state guards.
- Use explicit policy checks outside the prompt.
- Require approval for high-impact or irreversible actions.
- Show reviewers evidence, proposed action, side effects, and rollback options.
- Store guard and approval decisions in durable state.
- Use structured guard outcomes.
- Trace guard decisions for observability.
- Test blocked, approval, retry, rollback, and cancellation paths.
Summary
Workflow guards and approval steps make AI agents safer by placing enforceable boundaries around autonomous behavior.
Guards block, constrain, validate, redact, or route workflow steps. Approval steps pause risky actions until a human or trusted policy system authorizes them. Together with durable state, observability, and rollback planning, they let agents act usefully without turning every decision over to the model.