Response Lifecycle
When a customer sends a message, what happens inside the engine before a response emerges? This page explains the preparation iteration loop: the heart of Parlant's response generation.
Why Iterative Preparation?
The Problem: Single-Pass Matching Misses Context
Consider a banking agent with these guidelines:
```
# Guideline 1
condition="Customer asks about their account balance"
action="Call get_balance() and tell them their balance"

# Guideline 2
condition="Customer's balance exceeds $10,000"
action="Recommend our premium investment options"
```
When a customer asks "How much money do I have?", Guideline 1 clearly matches. But should Guideline 2 match?
At the moment of matching, the balance is unknown because get_balance() has not yet been called. A single-pass approach would fail to identify Guideline 2, even though it becomes applicable once the system retrieves a balance of $15,000.
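The gap can be made concrete with a small simulation. This is a minimal sketch with invented names (plain predicates instead of Parlant's LLM-based matching), showing why a single matching pass misses Guideline 2:

```python
# Hypothetical sketch, not Parlant's API: guidelines as (condition, action)
# pairs whose conditions can only inspect context that is already known.
def get_balance():
    return 15_000  # stand-in for the real tool call

guidelines = [
    (lambda ctx: ctx.get("asked_balance", False), "tell balance"),
    (lambda ctx: ctx.get("balance", 0) > 10_000, "recommend premium options"),
]

def match(ctx):
    return [action for cond, action in guidelines if cond(ctx)]

ctx = {"asked_balance": True}

# Single pass: the balance is not in context yet, so Guideline 2 cannot fire.
single_pass = match(ctx)

# Iterative: the first match triggers the tool; re-matching with the
# enriched context now picks up Guideline 2 as well.
ctx["balance"] = get_balance()
second_pass = match(ctx)
```

Running this, `single_pass` contains only the balance action, while `second_pass` also includes the premium recommendation.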
The Solution: Loop Until Stable
Parlant addresses this through iteration: the system matches guidelines, executes tools, and then determines whether the new information triggers additional guidelines. This process repeats until reaching a stable state where no new tool calls are required.
This iterative approach enables guidelines to depend on information that does not exist until runtime (tool results, API responses, database lookups) without requiring special configuration.
The Preparation Loop
Here's what happens when the engine processes a customer message:
```
ALGORITHM: Response Preparation
INPUT:  customer_message, session_context
OUTPUT: response_state ready for message generation

1. INITIALIZE response_state:
   - Load context_variables (customer, tag, and global scopes)
   - Load glossary_terms (semantic similarity to conversation)
   - Load capabilities (agent abilities relevant to context)

2. iteration = 0

3. WHILE NOT prepared_to_respond AND iteration < max_iterations:

   a. MATCH guidelines:
      - Predict relevant journeys (top-K by relevance)
      - Prune guidelines to high-probability candidates
      - Batch by category (observational, actionable, etc.)
      - Evaluate batches in parallel
      - Resolve relationships (entailment, suppression, priority)

   b. IDENTIFY tool-enabled guidelines:
      - Separate matched guidelines into:
        - ordinary_guidelines (direct actions)
        - tool_enabled_guidelines (require tool calls)

   c. IF tool_enabled_guidelines exist:
      - INFER which tools to call and with what parameters
      - EVALUATE each call: NEEDS_TO_RUN, DATA_ALREADY_IN_CONTEXT, CANNOT_RUN
      - EXECUTE tools that need to run
      - CAPTURE results and insights (missing data, blocked tools)
      - RELOAD glossary_terms (new context may need new terms)

   d. CHECK if prepared:
      - No new tool calls were generated, AND
      - No pending journey transitions

   e. iteration += 1

4. RETURN response_state to message generation
```
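The skeleton of this loop can be sketched in Python. This is a simplified illustration, not Parlant's actual code: guideline matching and tool inference are passed in as plain callables, and the state is a dictionary rather than the real `ResponseState`:

```python
def prepare_response(state, match_guidelines, run_tools, max_iterations=3):
    """Illustrative sketch of the preparation loop (not Parlant's real code).

    match_guidelines(state) -> (ordinary, tool_enabled) matched guidelines
    run_tools(state, tool_enabled) -> list of new tool events (empty = stable)
    """
    iteration = 0
    while not state["prepared_to_respond"] and iteration < max_iterations:
        # a/b. Match guidelines and split them by whether they need tools.
        ordinary, tool_enabled = match_guidelines(state)
        state["ordinary_guideline_matches"] = ordinary
        state["tool_enabled_guideline_matches"] = tool_enabled

        # c. Execute whatever tools the matched guidelines require.
        new_events = run_tools(state, tool_enabled) if tool_enabled else []
        state["tool_events"].extend(new_events)

        # d. Stable once an iteration produces no new tool calls.
        if not new_events:
            state["prepared_to_respond"] = True

        # e. Count the iteration.
        iteration += 1
    return state

# Demo with stub matchers: the second pass sees the tool result and stabilizes.
def match_guidelines(state):
    if state["tool_events"]:  # balance already fetched
        return ["tell balance", "recommend premium"], []
    return ["tell balance"], ["fetch balance"]

def run_tools(state, tool_enabled):
    return [{"tool": "get_balance", "result": 15_000}]

state = prepare_response(
    {"prepared_to_respond": False, "tool_events": []},
    match_guidelines,
    run_tools,
)
```

The demo stabilizes after two iterations: the first fetches the balance, the second matches the balance-dependent guideline and generates no new tool calls.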
Entry and Exit Conditions
Entry: The loop begins after the engine acknowledges the incoming message and initializes context. At this point, prepared_to_respond is false.
Exit: The loop exits when one of these conditions is met:
- Stable state: no tool-enabled guidelines matched, or all matched tool calls returned `DATA_ALREADY_IN_CONTEXT`
- Max iterations reached: the `agent.max_engine_iterations` limit was hit (a safety valve)
Example: Three-Iteration Response
Each iteration builds upon the context established by previous iterations. The engine proceeds to message generation only after gathering all information necessary for a complete response.
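As an illustration, a three-iteration run for the banking example above might accumulate context as follows. This trace is invented for explanatory purposes and the guideline and tool names are hypothetical; real iterations are driven by LLM-based matching:

```python
# Invented three-iteration trace (illustrative only, not engine output).
trace = [
    {"iteration": 1,
     "matched": ["ask_balance"],
     "tool_calls": ["get_balance() -> 15000"]},
    {"iteration": 2,
     "matched": ["ask_balance", "balance_over_10k"],
     "tool_calls": ["get_investment_options() -> ['premium_fund']"]},
    {"iteration": 3,
     "matched": ["ask_balance", "balance_over_10k"],
     "tool_calls": []},  # no new calls: stable, proceed to generation
]

# The loop exits once the latest iteration generated no new tool calls.
prepared = not trace[-1]["tool_calls"]
```

Iteration 1 discovers the balance, iteration 2 matches the balance-dependent guideline and fetches investment options, and iteration 3 confirms stability.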
ResponseState: What Accumulates
The ResponseState structure tracks everything collected across iterations:
| Field | Description |
|---|---|
| `context_variables` | Customer, tag, and global variables loaded at start |
| `glossary_terms` | Domain terms relevant to the current conversation |
| `capabilities` | Agent abilities semantically matched to context |
| `iterations` | List of completed iteration states |
| `ordinary_guideline_matches` | Non-tool-enabled guidelines to follow |
| `tool_enabled_guideline_matches` | Guidelines with associated tools |
| `journeys` | Active journey definitions |
| `journey_paths` | Current position in each active journey |
| `tool_events` | All executed tool calls and their results |
| `tool_insights` | Information about blocked or failed tools |
| `prepared_to_respond` | Flag indicating readiness |
| `message_events` | Generated messages (staged before emission) |
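The table can be approximated as a dataclass. The field names mirror the table above, but the types shown are assumptions for illustration, not Parlant's actual definitions:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ResponseState:
    """Illustrative approximation of ResponseState; types are guesses."""
    context_variables: dict[str, Any] = field(default_factory=dict)
    glossary_terms: list[str] = field(default_factory=list)
    capabilities: list[str] = field(default_factory=list)
    iterations: list[dict] = field(default_factory=list)
    ordinary_guideline_matches: list[dict] = field(default_factory=list)
    tool_enabled_guideline_matches: list[dict] = field(default_factory=list)
    journeys: list[dict] = field(default_factory=list)
    journey_paths: dict[str, list[str]] = field(default_factory=dict)
    tool_events: list[dict] = field(default_factory=list)
    tool_insights: list[dict] = field(default_factory=list)
    prepared_to_respond: bool = False
    message_events: list[dict] = field(default_factory=list)

# Accumulation across iterations: fields grow rather than being replaced.
state = ResponseState()
state.tool_events.append({"tool": "get_balance", "result": 15_000})
state.tool_events.append({"tool": "get_investment_options", "result": ["premium_fund"]})
```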
State Flow
Each iteration adds to the accumulated state rather than replacing it. Journey paths are updated based on matched journey nodes, tool results accumulate across iterations, and glossary terms may expand when new tool results introduce relevant terminology.
The EngineContext
While ResponseState holds mutable state for a single response, EngineContext is the container that threads through the entire pipeline:
```
EngineContext
├── info (original request context)
├── logger, tracer, meter (observability)
├── agent (loaded agent configuration)
├── customer (loaded customer data)
├── session (session state and events)
├── interaction (event history)
├── state (ResponseState - mutable)
├── session_event_emitter (for session-level events)
└── response_event_emitter (for response-level events)
```
Hooks receive the EngineContext and can:
- Read any accumulated state
- Modify `ResponseState` fields
- Emit custom events
- Signal cancellation
See Engine Extensions for hook implementation details.
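A hook might look roughly like this. The signature, registration mechanism, and field access shown here are assumptions for illustration; the fake context stands in for the real `EngineContext`, and the actual hook API is documented under Engine Extensions:

```python
import asyncio
from types import SimpleNamespace

async def log_tool_activity(context):
    """Hypothetical hook: reads accumulated state and mutates it."""
    # Read any accumulated state.
    for event in context.state.tool_events:
        context.logger.info(f"tool event: {event}")
    # Hooks may also modify ResponseState fields.
    context.state.glossary_terms.append("wire transfer")

# Fake EngineContext for demonstration (the real one is provided by the engine).
context = SimpleNamespace(
    state=SimpleNamespace(
        tool_events=[{"tool": "get_balance", "result": 15_000}],
        glossary_terms=[],
    ),
    logger=SimpleNamespace(info=print),
)
asyncio.run(log_tool_activity(context))
```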
Why This Design?
Alternatives Considered
Single-pass matching: This approach matches all guidelines once, calls all tools once, and then generates a response. However, it proves inadequate because guidelines often depend on information that only becomes available after tool calls execute.
Parallel-all: This approach matches guidelines and calls tools in parallel, then merges the results. However, the merge logic becomes intractable when determining which tool results affect which guidelines.
Reactive/event-driven: This approach allows tool results to trigger new guideline evaluations dynamically. While theoretically appealing, it introduces complexity without clear benefits; sequential iteration is simpler and equally capable.
Tradeoffs
| Dimension | Impact |
|---|---|
| Latency | Additional iterations increase response time, though this is mitigated by parallel batching within each iteration. |
| Accuracy | The iterative approach identifies guidelines that a single-pass approach would miss. |
| Cost | Each iteration incurs LLM calls, though this is mitigated by re-evaluating only affected guidelines. |
| Predictability | The max_engine_iterations parameter provides a configurable safety bound. |
The max_engine_iterations Parameter
The max_engine_iterations agent parameter (default: 3) controls the maximum loop depth:
- Lower values (1-2): These produce faster responses but may miss some dynamically triggered guidelines.
- Higher values (4-5): These enable more thorough preparation at the cost of higher latency and resource consumption.
- Typical production: Values of 2-3 iterations handle most real-world scenarios effectively.
If the loop reaches the maximum iteration count without achieving a stable state, the engine proceeds to message generation using the context accumulated up to that point. Tool insights will contain information about any blocked or incomplete tool chains.
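The safety-valve behavior can be seen in a toy loop that never stabilizes (illustrative only; setting the parameter itself is done through agent configuration in the Parlant SDK):

```python
def prepare(max_engine_iterations):
    """Toy loop that never reaches a stable state, to show the bound."""
    iteration = 0
    prepared = False
    while not prepared and iteration < max_engine_iterations:
        new_tool_calls = ["always_another_call"]  # never stable
        prepared = not new_tool_calls
        iteration += 1
    # The engine then proceeds to message generation with whatever
    # context has accumulated so far.
    return iteration, prepared

result = prepare(3)
```

Here the loop runs exactly three times and exits with `prepared` still false, which is when accumulated context and tool insights carry the response forward.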
What's Next
Now that you understand the iteration loop, explore what happens inside each component:
- Guideline Matching: How guidelines are categorized, batched, and matched
- Tool Calling: How tools are inferred, evaluated, and executed
- Message Generation: How accumulated context becomes a response