Response Lifecycle
When a customer sends a message, what happens inside the engine before a response emerges? This page explains the preparation iteration loop: the heart of Parlant's response generation.
Why Iterative Preparation?
The Problem: Single-Pass Matching Misses Context
Consider a banking agent with these guidelines:
```
# Guideline 1
condition="Customer asks about their account balance"
action="Call get_balance() and tell them their balance"

# Guideline 2
condition="Customer's balance exceeds $10,000"
action="Recommend our premium investment options"
```
When a customer asks "How much money do I have?", Guideline 1 clearly matches. But should Guideline 2 match?
At the moment of matching, the balance is unknown because get_balance() has not yet been called. A single-pass approach would fail to identify Guideline 2, even though it becomes applicable once the system retrieves a balance of $15,000.
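The gap can be made concrete with a small simulation. This is a minimal sketch with invented names (plain predicates instead of Parlant's LLM-based matching), showing why a single matching pass misses Guideline 2:

```python
# Hypothetical sketch, not Parlant's API: guidelines as (condition, action)
# pairs whose conditions can only inspect context that is already known.
def get_balance():
    return 15_000  # stand-in for the real tool call

guidelines = [
    (lambda ctx: ctx.get("asked_balance", False), "tell balance"),
    (lambda ctx: ctx.get("balance", 0) > 10_000, "recommend premium options"),
]

def match(ctx):
    return [action for cond, action in guidelines if cond(ctx)]

ctx = {"asked_balance": True}

# Single pass: the balance is not in context yet, so Guideline 2 cannot fire.
single_pass = match(ctx)

# Iterative: the first match triggers the tool; re-matching with the
# enriched context now picks up Guideline 2 as well.
ctx["balance"] = get_balance()
second_pass = match(ctx)
```

Running this, `single_pass` contains only the balance action, while `second_pass` also includes the premium recommendation.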
The Solution: Loop Until Stable
Parlant addresses this through iteration: the system matches guidelines, executes tools, and then determines whether the new information triggers additional guidelines. This process repeats until reaching a stable state where no new tool calls are required.
This iterative approach enables guidelines to depend on information that does not exist until runtime (tool results, API responses, database lookups) without requiring special configuration.
The Preparation Loop
Here's what happens when the engine processes a customer message:
```
ALGORITHM: Response Preparation
INPUT:  customer_message, session_context
OUTPUT: response_state ready for message generation

1. INITIALIZE response_state:
   - Load context_variables (customer, tag, and global scopes)
   - Load glossary_terms (semantic similarity to conversation)
   - Load capabilities (agent abilities relevant to context)

2. iteration = 0

3. WHILE NOT prepared_to_respond AND iteration < max_iterations:

   a. MATCH guidelines:
      - Predict relevant journeys (top-K by relevance)
      - Prune guidelines to high-probability candidates
      - Batch by category (observational, actionable, etc.)
      - Evaluate batches in parallel
      - Resolve relationships (entailment, suppression, priority)

   b. IDENTIFY tool-enabled guidelines:
      - Separate matched guidelines into:
        - ordinary_guidelines (direct actions)
        - tool_enabled_guidelines (require tool calls)

   c. IF tool_enabled_guidelines exist:
      - INFER which tools to call and with what parameters
      - EVALUATE each call: NEEDS_TO_RUN, DATA_ALREADY_IN_CONTEXT, CANNOT_RUN
      - EXECUTE tools that need to run
      - CAPTURE results and insights (missing data, blocked tools)
      - RELOAD glossary_terms (new context may need new terms)

   d. CHECK if prepared:
      - No new tool calls were generated, AND
      - No pending journey transitions

   e. iteration += 1

4. RETURN response_state to message generation
```
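The skeleton of this loop can be sketched in Python. This is a simplified illustration, not Parlant's actual code: guideline matching and tool inference are passed in as plain callables, and the state is a dictionary rather than the real `ResponseState`:

```python
def prepare_response(state, match_guidelines, run_tools, max_iterations=3):
    """Illustrative sketch of the preparation loop (not Parlant's real code).

    match_guidelines(state) -> (ordinary, tool_enabled) matched guidelines
    run_tools(state, tool_enabled) -> list of new tool events (empty = stable)
    """
    iteration = 0
    while not state["prepared_to_respond"] and iteration < max_iterations:
        # a/b. Match guidelines and split them by whether they need tools.
        ordinary, tool_enabled = match_guidelines(state)
        state["ordinary_guideline_matches"] = ordinary
        state["tool_enabled_guideline_matches"] = tool_enabled

        # c. Execute whatever tools the matched guidelines require.
        new_events = run_tools(state, tool_enabled) if tool_enabled else []
        state["tool_events"].extend(new_events)

        # d. Stable once an iteration produces no new tool calls.
        if not new_events:
            state["prepared_to_respond"] = True

        # e. Count the iteration.
        iteration += 1
    return state

# Demo with stub matchers: the second pass sees the tool result and stabilizes.
def match_guidelines(state):
    if state["tool_events"]:  # balance already fetched
        return ["tell balance", "recommend premium"], []
    return ["tell balance"], ["fetch balance"]

def run_tools(state, tool_enabled):
    return [{"tool": "get_balance", "result": 15_000}]

state = prepare_response(
    {"prepared_to_respond": False, "tool_events": []},
    match_guidelines,
    run_tools,
)
```

The demo stabilizes after two iterations: the first fetches the balance, the second matches the balance-dependent guideline and generates no new tool calls.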
Entry and Exit Conditions
Entry: The loop begins after the engine acknowledges the incoming message and initializes context. At this point, prepared_to_respond is false.
Exit: The loop exits when one of these conditions is met:
- Stable state: no tool-enabled guidelines matched, or all matched tool calls returned `DATA_ALREADY_IN_CONTEXT`
- Max iterations reached: the `agent.max_engine_iterations` limit was hit (a safety valve)
Example: Three-Iteration Response
Each iteration builds upon the context established by previous iterations. The engine proceeds to message generation only after gathering all information necessary for a complete response.
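As an illustration, a three-iteration run for the banking example above might accumulate context as follows. This trace is invented for explanatory purposes and the guideline and tool names are hypothetical; real iterations are driven by LLM-based matching:

```python
# Invented three-iteration trace (illustrative only, not engine output).
trace = [
    {"iteration": 1,
     "matched": ["ask_balance"],
     "tool_calls": ["get_balance() -> 15000"]},
    {"iteration": 2,
     "matched": ["ask_balance", "balance_over_10k"],
     "tool_calls": ["get_investment_options() -> ['premium_fund']"]},
    {"iteration": 3,
     "matched": ["ask_balance", "balance_over_10k"],
     "tool_calls": []},  # no new calls: stable, proceed to generation
]

# The loop exits once the latest iteration generated no new tool calls.
prepared = not trace[-1]["tool_calls"]
```

Iteration 1 discovers the balance, iteration 2 matches the balance-dependent guideline and fetches investment options, and iteration 3 confirms stability.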
ResponseState: What Accumulates
The ResponseState structure tracks everything collected across iterations:
| Field | Description |
|---|---|
| `context_variables` | Customer, tag, and global variables loaded at start |
| `glossary_terms` | Domain terms relevant to the current conversation |
| `capabilities` | Agent abilities semantically matched to context |
| `iterations` | List of completed iteration states |
| `ordinary_guideline_matches` | Non-tool-enabled guidelines to follow |
| `tool_enabled_guideline_matches` | Guidelines with associated tools |
| `journeys` | Active journey definitions |
| `journey_paths` | Current position in each active journey |
| `tool_events` | All executed tool calls and their results |
| `tool_insights` | Information about blocked or failed tools |
| `prepared_to_respond` | Flag indicating readiness |
| `message_events` | Generated messages (staged before emission) |
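The table can be approximated as a dataclass. The field names mirror the table above, but the types shown are assumptions for illustration, not Parlant's actual definitions:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ResponseState:
    """Illustrative approximation of ResponseState; types are guesses."""
    context_variables: dict[str, Any] = field(default_factory=dict)
    glossary_terms: list[str] = field(default_factory=list)
    capabilities: list[str] = field(default_factory=list)
    iterations: list[dict] = field(default_factory=list)
    ordinary_guideline_matches: list[dict] = field(default_factory=list)
    tool_enabled_guideline_matches: list[dict] = field(default_factory=list)
    journeys: list[dict] = field(default_factory=list)
    journey_paths: dict[str, list[str]] = field(default_factory=dict)
    tool_events: list[dict] = field(default_factory=list)
    tool_insights: list[dict] = field(default_factory=list)
    prepared_to_respond: bool = False
    message_events: list[dict] = field(default_factory=list)

# Accumulation across iterations: fields grow rather than being replaced.
state = ResponseState()
state.tool_events.append({"tool": "get_balance", "result": 15_000})
state.tool_events.append({"tool": "get_investment_options", "result": ["premium_fund"]})
```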
State Flow
Each iteration adds to the accumulated state rather than replacing it. Journey paths are updated based on matched journey nodes, tool results accumulate across iterations, and glossary terms may expand when new tool results introduce relevant terminology.
The EngineContext
While ResponseState holds mutable state for a single response, EngineContext is the container that threads through the entire pipeline:
```
EngineContext
├── info (original request context)
├── logger, tracer, meter (observability)
├── agent (loaded agent configuration)
├── customer (loaded customer data)
├── session (session state and events)
├── interaction (event history)
├── state (ResponseState - mutable)
├── session_event_emitter (for session-level events)
└── response_event_emitter (for response-level events)
```
Hooks receive the EngineContext and can:
- Read any accumulated state
- Modify `ResponseState` fields
- Emit custom events
- Signal cancellation
See Engine Extensions for hook implementation details.
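A hook might look roughly like this. The signature, registration mechanism, and field access shown here are assumptions for illustration; the fake context stands in for the real `EngineContext`, and the actual hook API is documented under Engine Extensions:

```python
import asyncio
from types import SimpleNamespace

async def log_tool_activity(context):
    """Hypothetical hook: reads accumulated state and mutates it."""
    # Read any accumulated state.
    for event in context.state.tool_events:
        context.logger.info(f"tool event: {event}")
    # Hooks may also modify ResponseState fields.
    context.state.glossary_terms.append("wire transfer")

# Fake EngineContext for demonstration (the real one is provided by the engine).
context = SimpleNamespace(
    state=SimpleNamespace(
        tool_events=[{"tool": "get_balance", "result": 15_000}],
        glossary_terms=[],
    ),
    logger=SimpleNamespace(info=print),
)
asyncio.run(log_tool_activity(context))
```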
Why This Design?
Alternatives Considered
Single-pass matching: This approach matches all guidelines once, calls all tools once, and then generates a response. However, it proves inadequate because guidelines often depend on information that only becomes available after tool calls execute.
Parallel-all: This approach matches guidelines and calls tools in parallel, then merges the results. However, the merge logic becomes intractable when determining which tool results affect which guidelines.
Reactive/event-driven: This approach allows tool results to trigger new guideline evaluations dynamically. While theoretically appealing, it introduces complexity without clear benefits; sequential iteration is simpler and equally capable.
Tradeoffs
| Dimension | Impact |
|---|---|
| Latency | Additional iterations increase response time, though this is mitigated by parallel batching within each iteration. |
| Accuracy | The iterative approach identifies guidelines that a single-pass approach would miss. |
| Cost | Each iteration incurs LLM calls, though this is mitigated by re-evaluating only affected guidelines. |
| Predictability | The max_engine_iterations parameter provides a configurable safety bound. |
The max_engine_iterations Parameter
The max_engine_iterations agent parameter (default: 3) controls the maximum loop depth:
- Lower values (1-2): These produce faster responses but may miss some dynamically triggered guidelines.
- Higher values (4-5): These enable more thorough preparation at the cost of higher latency and resource consumption.
- Typical production: Values of 2-3 iterations handle most real-world scenarios effectively.
If the loop reaches the maximum iteration count without achieving a stable state, the engine proceeds to message generation using the context accumulated up to that point. Tool insights will contain information about any blocked or incomplete tool chains.
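The safety-valve behavior can be seen in a toy loop that never stabilizes (illustrative only; setting the parameter itself is done through agent configuration in the Parlant SDK):

```python
def prepare(max_engine_iterations):
    """Toy loop that never reaches a stable state, to show the bound."""
    iteration = 0
    prepared = False
    while not prepared and iteration < max_engine_iterations:
        new_tool_calls = ["always_another_call"]  # never stable
        prepared = not new_tool_calls
        iteration += 1
    # The engine then proceeds to message generation with whatever
    # context has accumulated so far.
    return iteration, prepared

result = prepare(3)
```

Here the loop runs exactly three times and exits with `prepared` still false, which is when accumulated context and tool insights carry the response forward.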
What's Next
Now that you understand the iteration loop, explore what happens inside each component:
- Guideline Matching: How guidelines are categorized, batched, and matched
- Tool Calling: How tools are inferred, evaluated, and executed
- Message Generation: How accumulated context becomes a response