Skip to main content

Optimization

Optimizing the Execution

Understanding the Engine

Parlant's engine operates in iterative cycles to ensure comprehensive and accurate alignment between your configuration and the agent's behavior. Let's understand how it works, then learn how to optimize it for your needs.

The Response Generation Cycle

When your agent needs to respond to a user, Parlant first captures the conversation context and evaluates which guidelines, tools, and other elements are relevant. But here's where it gets interesting: after running any tools, Parlant re-evaluates the entire context.

Why? Consider a banking agent: A user asks about making a transfer, triggering a guideline to check their balance. The check_balance tool reveals they have less than $500, which activates a new guideline about overdraft warnings. This second guideline wouldn't have been relevant without the tool's output.

This re-evaluation can happen multiple times in a single response. Imagine:

  1. User asks about transfer → Check balance tool runs
  2. Low balance detected → Overdraft warning guideline activates with account upgrade suggestions
  3. Overdraft warning triggers → Account upgrade options tool runs

The Performance Trade-off

While each iteration helps maintain accurate behavior, it also adds latency. In practice, not every use case requires multiple iterations.

Optimizing for Your Needs

You can control this balance by trying different values for your agent's max_engine_iterations setting. Setting it to 1 means your agent will evaluate guidelines and run tools exactly once, with no re-evaluations or multi-step tool calls. This can significantly improve latency, but requires thoughtful tool design.

$ parlant agent update \
--id AGENT_ID \
--max-engine-iterations N

Best Practices

When optimizing for performance, design your tools to be self-contained rather than requiring multiple steps. Instead of separate tools for checking balance and getting account status, combine them into one comprehensive account check. This not only reduces latency but also makes your agent's behavior more predictable.

// Instead of this (conceptual) flow:
const balance = await checkBalance(userId);
if (balance < 500) {
const status = await getAccountStatus(userId);
}

// Design standalone, one-shot tools:
const accountInfo = await getAccountDetails(userId);
// (Returns balance, status, and all relevant info in one call)