Response Lifecycle

When a customer sends a message, what happens inside the engine before a response emerges? This page explains the preparation iteration loop, the heart of Parlant's response generation.

Why Iterative Preparation?

The Problem: Single-Pass Matching Misses Context

Consider a banking agent with these guidelines:

# Guideline 1
condition="Customer asks about their account balance"
action="Call get_balance() and tell them their balance"

# Guideline 2
condition="Customer's balance exceeds $10,000"
action="Recommend our premium investment options"

When a customer asks "How much money do I have?", Guideline 1 clearly matches. But should Guideline 2 match?

At the moment of matching, the balance is unknown because get_balance() has not yet been called. A single-pass approach would fail to identify Guideline 2, even though it becomes applicable once the system retrieves a balance of $15,000.
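To make the dependency concrete, here is a minimal sketch of how these guidelines and the balance tool might be defined with Parlant's Python SDK. The exact SDK surface (create_agent, create_guideline, the tool decorator) may differ from what is shown here; treat the snippet as illustrative rather than canonical.

import parlant.sdk as p

@p.tool
async def get_balance(context: p.ToolContext) -> p.ToolResult:
    # Hypothetical lookup; a real tool would query your banking backend.
    return p.ToolResult(data={"balance": 15_000})

async def configure(server: p.Server) -> None:
    agent = await server.create_agent(
        name="Banking Agent",
        description="Helps customers with their accounts",
    )
    # Guideline 1 can match on the first iteration and triggers the tool call.
    await agent.create_guideline(
        condition="Customer asks about their account balance",
        action="Call get_balance() and tell them their balance",
        tools=[get_balance],
    )
    # Guideline 2 can only match after get_balance() has returned a value,
    # which is exactly why a second preparation iteration is needed.
    await agent.create_guideline(
        condition="Customer's balance exceeds $10,000",
        action="Recommend our premium investment options",
    )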

The Solution: Loop Until Stable

Parlant addresses this through iteration: the system matches guidelines, executes tools, and then determines whether the new information triggers additional guidelines. This process repeats until reaching a stable state where no new tool calls are required.

This iterative approach enables guidelines to depend on information that does not exist until runtime, such as tool results, API responses, and database lookups, without requiring special configuration.

The Preparation Loop

Here's what happens when the engine processes a customer message:

ALGORITHM: Response Preparation

INPUT: customer_message, session_context
OUTPUT: response_state ready for message generation

1. INITIALIZE response_state:
   - Load context_variables (customer, tag, global scopes)
   - Load glossary_terms (semantic similarity to conversation)
   - Load capabilities (agent abilities relevant to context)

2. iteration = 0

3. WHILE NOT prepared_to_respond AND iteration < max_iterations:

   a. MATCH guidelines:
      - Predict relevant journeys (Top-K by relevance)
      - Prune guidelines to high-probability candidates
      - Batch by category (observational, actionable, etc.)
      - Evaluate batches in parallel
      - Resolve relationships (entailment, suppression, priority)

   b. IDENTIFY tool-enabled guidelines:
      - Separate matched guidelines into:
        - ordinary_guidelines (direct actions)
        - tool_enabled_guidelines (require tool calls)

   c. IF tool_enabled_guidelines exist:
      - INFER which tools to call and with what parameters
      - EVALUATE each call: NEEDS_TO_RUN, DATA_ALREADY_IN_CONTEXT, CANNOT_RUN
      - EXECUTE tools that need to run
      - CAPTURE results and insights (missing data, blocked tools)
      - RELOAD glossary_terms (new context may need new terms)

   d. CHECK if prepared:
      - No new tool calls were generated, AND
      - No pending journey transitions

   e. iteration += 1

4. RETURN response_state to message generation
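As a rough Python sketch of the same loop (helper names such as match_guidelines, split_by_tool_use, infer_tool_calls, execute_tools, reload_glossary_terms, and has_pending_journey_transitions are assumptions for illustration, not the engine's actual internals):

async def prepare_response(state, agent):
    # Minimal sketch of the preparation loop under the assumptions above.
    iteration = 0
    while not state.prepared_to_respond and iteration < agent.max_engine_iterations:
        # Step 3a: match guidelines against the current context.
        matches = await match_guidelines(state)

        # Step 3b: split matches by whether they require tool calls.
        state.ordinary_guideline_matches, state.tool_enabled_guideline_matches = (
            split_by_tool_use(matches)
        )

        # Step 3c: infer, evaluate, and execute any needed tool calls.
        new_tool_events = []
        if state.tool_enabled_guideline_matches:
            calls = await infer_tool_calls(state)
            runnable = [c for c in calls if c.evaluation == "NEEDS_TO_RUN"]
            new_tool_events = await execute_tools(runnable)
            state.tool_events.extend(new_tool_events)
            state.glossary_terms = await reload_glossary_terms(state)

        # Step 3d: stable once nothing new ran and no journey transition is pending.
        state.prepared_to_respond = (
            not new_tool_events and not has_pending_journey_transitions(state)
        )
        iteration += 1

    return state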

Entry and Exit Conditions

Entry: The loop begins after the engine acknowledges the incoming message and initializes context. At this point, prepared_to_respond is false.

Exit: The loop exits when one of these conditions is met:

  • Stable state: No tool-enabled guidelines matched, or all matched tools returned DATA_ALREADY_IN_CONTEXT
  • Max iterations reached: The agent.max_engine_iterations limit was hit (safety valve)

Example: Three-Iteration Response

Each iteration builds upon the context established by previous iterations. The engine proceeds to message generation only after gathering all information necessary for a complete response.
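For instance, with the banking guidelines above, and assuming a hypothetical get_investment_options() tool attached to Guideline 2, the loop might play out roughly like this:

  • Iteration 1: Guideline 1 matches; get_balance() is inferred and executed, returning $15,000. New tool results were produced, so the loop continues.
  • Iteration 2: With the balance now in context, Guideline 2 matches; its get_investment_options() tool runs. Again, new results keep the loop going.
  • Iteration 3: No further tool calls are needed and no journey transitions are pending, so prepared_to_respond becomes true and the engine proceeds to message generation.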

ResponseState: What Accumulates

The ResponseState structure tracks everything collected across iterations:

  • context_variables: Customer, tag, and global variables loaded at start
  • glossary_terms: Domain terms relevant to current conversation
  • capabilities: Agent abilities semantically matched to context
  • iterations: List of completed iteration states
  • ordinary_guideline_matches: Non-tool-enabled guidelines to follow
  • tool_enabled_guideline_matches: Guidelines with associated tools
  • journeys: Active journey definitions
  • journey_paths: Current position in each active journey
  • tool_events: All executed tool calls and their results
  • tool_insights: Information about blocked or failed tools
  • prepared_to_respond: Flag indicating readiness
  • message_events: Generated messages (staged before emission)
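A simplified Python sketch of this structure, using the field names listed above (the types here are placeholders, not the engine's actual definitions):

from dataclasses import dataclass, field
from typing import Any

@dataclass
class ResponseState:
    # Loaded once at initialization
    context_variables: list[Any] = field(default_factory=list)
    glossary_terms: list[Any] = field(default_factory=list)
    capabilities: list[Any] = field(default_factory=list)

    # Accumulated across iterations
    iterations: list[Any] = field(default_factory=list)
    ordinary_guideline_matches: list[Any] = field(default_factory=list)
    tool_enabled_guideline_matches: list[Any] = field(default_factory=list)
    journeys: list[Any] = field(default_factory=list)
    journey_paths: dict[str, Any] = field(default_factory=dict)
    tool_events: list[Any] = field(default_factory=list)
    tool_insights: list[Any] = field(default_factory=list)

    # Readiness flag and staged output
    prepared_to_respond: bool = False
    message_events: list[Any] = field(default_factory=list)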

State Flow

Each iteration adds to the accumulated state rather than replacing it. Journey paths are updated based on matched journey nodes, tool results accumulate across iterations, and glossary terms may expand when new tool results introduce relevant terminology.

The EngineContext

While ResponseState holds mutable state for a single response, EngineContext is the container that threads through the entire pipeline:

EngineContext
├── info (original request context)
├── logger, tracer, meter (observability)
├── agent (loaded agent configuration)
├── customer (loaded customer data)
├── session (session state and events)
├── interaction (event history)
├── state (ResponseState - mutable)
├── session_event_emitter (for session-level events)
└── response_event_emitter (for response-level events)
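A compact sketch of the same container as a Python dataclass (placeholder types; the actual engine classes differ):

from dataclasses import dataclass
from typing import Any

@dataclass
class EngineContext:
    info: Any                    # original request context
    logger: Any                  # observability
    tracer: Any
    meter: Any
    agent: Any                   # loaded agent configuration
    customer: Any                # loaded customer data
    session: Any                 # session state and events
    interaction: Any             # event history
    state: Any                   # ResponseState (mutable)
    session_event_emitter: Any   # session-level events
    response_event_emitter: Any  # response-level events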

Hooks receive the EngineContext and can:

  • Read any accumulated state
  • Modify ResponseState fields
  • Emit custom events
  • Signal cancellation

See Engine Extensions for hook implementation details.
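As a rough illustration, a hook might look like the following, assuming hooks are async callables that receive the context; the registration mechanism, exact signature, and the guideline identifier used below are assumptions, so refer to Engine Extensions for the real API.

async def inspect_preparation(ctx) -> None:
    # Read accumulated state.
    ctx.logger.info(
        f"iterations={len(ctx.state.iterations)} "
        f"tool_events={len(ctx.state.tool_events)}"
    )
    # Modify ResponseState fields, e.g. drop matches for a hypothetical
    # deprecated guideline before message generation.
    ctx.state.ordinary_guideline_matches = [
        m for m in ctx.state.ordinary_guideline_matches
        if getattr(m, "guideline_id", None) != "legacy-greeting"
    ]
    # Custom events and cancellation signals go through the event emitters
    # and the mechanisms documented in Engine Extensions.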

Why This Design?

Alternatives Considered

Single-pass matching: This approach matches all guidelines once, calls all tools once, and then generates a response. However, it proves inadequate because guidelines often depend on information that only becomes available after tool calls execute.

Parallel-all: This approach matches guidelines and calls tools in parallel, then merges the results. However, the merge logic becomes intractable when determining which tool results affect which guidelines.

Reactive/event-driven: This approach allows tool results to trigger new guideline evaluations dynamically. While theoretically appealing, it introduces complexity without clear benefits; sequential iteration is simpler and equally capable.

Tradeoffs

  • Latency: Additional iterations increase response time, though this is mitigated by parallel batching within each iteration.
  • Accuracy: The iterative approach identifies guidelines that a single-pass approach would miss.
  • Cost: Each iteration incurs LLM calls, though this is mitigated by re-evaluating only affected guidelines.
  • Predictability: The max_engine_iterations parameter provides a configurable safety bound.

The max_engine_iterations Parameter

The max_engine_iterations agent parameter (default: 3) controls the maximum loop depth:

  • Lower values (1-2): These produce faster responses but may miss some dynamically triggered guidelines.
  • Higher values (4-5): These enable more thorough preparation at the cost of higher latency and resource consumption.
  • Typical production: Values of 2-3 iterations handle most real-world scenarios effectively.

If the loop reaches the maximum iteration count without achieving a stable state, the engine proceeds to message generation using the context accumulated up to that point. Tool insights will contain information about any blocked or incomplete tool chains.
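A hedged configuration sketch: the documentation describes max_engine_iterations as an agent parameter with a default of 3, but whether it is passed at agent creation time, as shown here, is an assumption for illustration.

import parlant.sdk as p

async def configure(server: p.Server) -> None:
    agent = await server.create_agent(
        name="Banking Agent",
        description="Helps customers with their accounts",
        max_engine_iterations=2,  # assumption: faster responses, but may miss
                                  # guidelines triggered late in a tool chain
    )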

What's Next

Now that you understand the iteration loop, explore what happens inside each component: