Engine Overview
In Parlant, the Engine is the component responsible for generating agent responses from end to end. It is where most of Parlant's logic and algorithms reside.
The engine is composed of several key components, each dedicated to a particular mission. As we will see, each of these missions contributes to a well-guided response; at the same time, each is quite complex in its own right.
While this article won't go into each specific component in detail, it will explain them briefly—just enough to understand how they function together to produce consistent responses.
Main Components
There are currently four components to the engine:
- Glossary Store: Where domain-specific terms are stored
- Guideline Matcher: Filters matching guidelines for each response
- Tool Caller: Executes matched tool calls
- Message Composer: Tailors a suitable response message
Generally speaking, the Engine—normally activated by the API—utilizes all of these components when generating a response.
Each of these components handles one part of the overall process of producing an agent response. Each is designed to have a single responsibility and to be independently optimizable, so that when something goes wrong we know exactly which part of the process to examine and optimize.
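The flow described above can be sketched as a simple pipeline. This is an illustrative sketch only; the names (`Context`, `respond`, and the component callables) are hypothetical and are not Parlant's actual internal API.

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """Illustrative execution context passed between components."""
    history: list[str]
    terms: list[str] = field(default_factory=list)
    guidelines: list[dict] = field(default_factory=list)
    tool_results: list[str] = field(default_factory=list)

def respond(history, glossary_store, guideline_matcher, tool_caller, composer):
    """Sketch of one response cycle: each component enriches the context,
    and the composer turns the accumulated context into a message."""
    ctx = Context(history=history)
    ctx.terms = glossary_store(ctx)           # 1. load relevant glossary terms
    ctx.guidelines = guideline_matcher(ctx)   # 2. match applicable guidelines
    ctx.tool_results = tool_caller(ctx)       # 3. run tools for matched guidelines
    return composer(ctx)                      # 4. compose the outgoing message
```

Because each stage is a separate callable, any one of them can be measured or swapped out in isolation, which is the single-responsibility property described above.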
Let's briefly consider each of them in its own right.
Glossary Store
This component allows us to store and retrieve relevant terms and definitions that are specific to our business domain.
These terms, or more correctly, the most relevant among them at each particular point in the conversation, are loaded into the execution context and made available to each of the other components. This not only allows the agent to respond in a way that's grounded to your domain's terminology, but also allows you to define guidelines that themselves speak your terminology.
In other words, the fetched glossary terms are imbued into all of the other components, to help them in accomplishing their task more accurately.
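To make "the most relevant terms at each point in the conversation" concrete, here is a deliberately naive retrieval sketch based on keyword overlap. The real mechanism is an assumption here (Parlant's actual retrieval is not shown in this article); this only illustrates the idea of ranking glossary entries by relevance to the current utterance.

```python
def relevant_terms(glossary: dict[str, str], utterance: str, k: int = 3) -> list[str]:
    """Rank glossary terms by naive word overlap with the utterance and
    return the top-k matches. A stand-in for real semantic retrieval."""
    words = set(utterance.lower().split())
    scored = [
        (sum(w in words for w in (name + " " + definition).lower().split()), name)
        for name, definition in glossary.items()
    ]
    # Keep only terms with at least one overlapping word, best first.
    return [name for score, name in sorted(scored, reverse=True) if score > 0][:k]
```

The selected terms would then be injected into the prompts of the other components, which is what "imbued into all of the other components" refers to.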
Guideline Matcher
Before we explain this component, we first need to understand the motivation for its existence.
As you probably already know, behavior in Parlant is controlled primarily through guidelines, where each guideline consists of a condition and an action. The condition specifies when the action should be followed.
Parlant takes advantage of this condition/action model to help the Tool Caller and the Message Composer stay focused, by providing them only with guidelines that are actually relevant to their current task. For example, if we have a guideline with the condition "the customer has just greeted you", we do not need to account for its action once we're already well into the conversation; at that point, it can simply be ignored.
By avoiding such unneeded instructions, Parlant's engine improves the LLM's focus and accuracy and reduces the complexity to be handled by its supervision mechanism. It also lowers the cost and latency of the LLM's completions by eliminating unneeded tokens.
The Guideline Matcher is what accomplishes this reduction in complexity. It matches the appropriate guidelines that need to be activated in the processing of the agent's next response.
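The filtering step can be sketched as follows. Note the hedge: in Parlant the condition check is LLM-based and evaluated against the full conversation state; here it is abstracted into a `condition_holds` callable, and the `Guideline` class is purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class Guideline:
    condition: str  # when the action should apply
    action: str     # what the agent should do

def match_guidelines(guidelines, condition_holds):
    """Return only the guidelines whose condition currently applies.
    `condition_holds` stands in for the engine's LLM-based evaluation
    of each condition against the conversation state."""
    return [g for g in guidelines if condition_holds(g.condition)]
```

Only the guidelines that survive this filter are passed on to the Tool Caller and Message Composer, which is what keeps their prompts small and focused.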
Tool Caller
Instead of using a vendor-provided tool-calling API, Parlant implements its own tool-calling mechanism.
There are four important reasons for this:
- To support as many vendors as possible, including the ability to test other vendors and switch between them while maintaining the exact same user configuration.
- To support guided tool calling, i.e. calling a tool in the context of a specific set of guidelines which explain not just the "what" and "how" of calling the tool, but also the "why" and "when".
- To support multiple preparation iterations when working on a response. For example, the first iteration matches relevant guidelines and runs their tools; then, based on those tool outputs, a second iteration matches a potentially different or wider set of guidelines, which may come with tool calls of their own, and so forth. This allows you to specify guidelines freely and naturally, with conditions that can be met not just by the conversation itself but also by data arriving dynamically from tools. For more info on this, please refer to the Optimization page on Parlant's docs site.
- To enable opportunistic processing optimizations by leveraging the additional guidance-related information Parlant has about tools.
The Tool Caller receives a list of tools (all the tools associated with the currently matched guidelines), decides which of them need to be called and how, runs them, and returns the results to the engine.
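The multi-iteration preparation described in the list above can be sketched as a fixed-point loop: keep matching and calling until no new guidelines activate. All names here are hypothetical; this only demonstrates the shape of the loop, not Parlant's implementation.

```python
def prepare_response(match, call_tools, max_iterations: int = 3):
    """Iteratively match guidelines and run their tools until the matched
    set stabilizes (a sketch of multi-iteration preparation).

    `match(tool_outputs)` returns the set of guideline ids that currently
    apply, given tool outputs gathered so far; `call_tools(guidelines)`
    runs the tools attached to newly matched guidelines."""
    tool_outputs: list[str] = []
    matched: set[str] = set()
    for _ in range(max_iterations):
        newly = match(tool_outputs) - matched
        if not newly:
            break  # nothing new matched; preparation is stable
        matched |= newly
        tool_outputs.extend(call_tools(newly))
    return matched, tool_outputs
```

This is why a guideline whose condition depends on tool data (e.g. "the customer's balance is low") can activate only after an earlier iteration's tool call produced that data.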
Message Composer
Finally, we come to the component that actually generates the response message (to be exact, it generates zero or more messages, as the situation demands).
Essentially, everything up until the Message Composer's turn is considered a preparation for the response—though this preparation may have already produced actions in the real world via tool calls. However, the customer doesn't know about it yet, because the agent still hasn't communicated anything about it (unless the tools themselves emitted messages).
The Message Composer is perhaps the most important component: every other component essentially exists to help it generate the most appropriate message possible.
It receives the relevant glossary terms, the guidelines matched for this particular state of the conversation, the tools that were just called (as well as the reasons why relevant or useful tools could not be called), and the entire interaction history. Its job is to further evaluate the matched guidelines in context, prioritize what the customer needs to hear first in the very next message, and ensure that the guidelines are adhered to as reliably as possible, all while continuing the conversation with the customer as naturally as possible.
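The inputs just listed can be pictured as sections of the composer's prompt. The section layout below is an illustrative assumption, not Parlant's actual prompt format; it simply shows how the other components' outputs converge on the Message Composer.

```python
def compose_prompt(terms, guidelines, tool_results, history):
    """Assemble the Message Composer's input from the other components'
    outputs. The section headers are illustrative only."""
    sections = [
        "Glossary:\n" + "\n".join(f"- {t}" for t in terms),
        "Active guidelines:\n" + "\n".join(
            f"- When {g['condition']}: {g['action']}" for g in guidelines
        ),
        "Tool results:\n" + "\n".join(f"- {r}" for r in tool_results),
        "Conversation:\n" + "\n".join(history),
    ]
    return "\n\n".join(sections)
```

Seen this way, the Guideline Matcher's filtering directly shrinks the "Active guidelines" section, which is where the focus, cost, and latency benefits come from.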
Response Lifecycle
Now that we have a basic understanding of what each engine component does, let's look at the lifecycle of a single response. This diagram is somewhat simplistic in terms of the actual architecture, but it does capture the essence of what's happening.
The response cycle is designed to let us hook into it at various stages and control it with our own business logic (code), potentially replacing one of the components with our own implementation, say, a fine-tuned SLM or an additional filter based on a BERT classifier.
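The idea of replacing a component can be sketched as below. The class and method names are hypothetical (this is not Parlant's hook API); the point is that because each stage is a plain callable, swapping in, say, a custom composer backed by a fine-tuned SLM is just a matter of substitution.

```python
class ResponseCycle:
    """Illustrative sketch of a pluggable response cycle."""

    def __init__(self, matcher, composer):
        self.matcher = matcher      # e.g. the default LLM-based matcher
        self.composer = composer    # e.g. the default LLM-based composer

    def with_composer(self, composer):
        """Return a copy of the cycle with the Message Composer replaced."""
        return ResponseCycle(self.matcher, composer)

    def respond(self, history):
        guidelines = self.matcher(history)
        return self.composer(guidelines, history)
```

A BERT-based filter would fit the same way: wrap the default composer in a callable that post-filters its output before returning it.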