
Deployment Guide

Your Parlant agent works great locally, but production deployment requires a few key changes. This guide walks you through setting up Parlant on Kubernetes with proper authentication, persistence, and scaling—whether you're deploying to AWS, Azure, or another cloud provider.

What This Guide Covers

This guide focuses on the infrastructure and deployment aspects unique to taking Parlant from local development to production. For topics like authentication policies, frontend integration, and agentic design principles, we'll reference the relevant documentation sections.

By the end of this guide, you'll have:

  • A containerized Parlant application
  • A production-ready Kubernetes deployment
  • MongoDB persistence configured
  • Load balancing and HTTPS termination
  • A scalable, secure production environment

Architecture Overview

A typical Parlant production deployment consists of the following key components:

  • Load Balancer: Handles SSL termination and routes traffic to Parlant pods
  • Parlant Pods: Stateless application containers (horizontally scalable)
  • MongoDB: Persistent storage for sessions and customer data
  • LLM Provider: External API for NLP services (OpenAI, Anthropic, etc.)

Prerequisites

Before you begin, ensure you have:

Local Tools:

  • Python 3.10 or higher
  • Docker installed and running
  • kubectl CLI tool
  • Cloud provider CLI (AWS CLI or Azure CLI)
  • A code editor

Cloud Resources:

  • Access to AWS EKS or Azure AKS (or another Kubernetes provider)
  • A MongoDB instance (MongoDB Atlas recommended, or managed MongoDB from your cloud provider)
  • (Optional) A domain name for your agent
  • (Optional) SSL certificate (can use Let's Encrypt or cloud provider certificates)

Knowledge Prerequisites:

  • Basic understanding of Kubernetes concepts (pods, services, deployments)
  • Familiarity with environment variables and configuration management
  • Basic Docker knowledge
Starting Point

This guide assumes you have a working Parlant agent running locally. If you haven't built your agent yet, start with the Installation guide.

Understanding Parlant's Production Requirements

Stateless Architecture

Parlant's server is designed to be stateless, which means:

  • All session state is stored in MongoDB, not in memory
  • Multiple Parlant pods can run simultaneously without coordination
  • You can scale horizontally by adding more pods
  • Pods can be restarted or replaced without losing data

This design makes Parlant naturally suited for cloud deployment and Kubernetes orchestration.

Persistence Layer

Parlant requires two MongoDB collections:

  1. Sessions: Stores conversation state, events, and history
  2. Customers: Stores customer profiles and associated data

Both collections must be accessible from all Parlant pods with consistent connection strings.

Port Configuration

By default, Parlant's FastAPI server listens on port 8800. In production:

  • Your load balancer accepts HTTPS traffic on port 443
  • The load balancer forwards to Parlant pods on port 8800
  • Kubernetes services handle internal routing

Step 1: Prepare Your Production Application

Create a Production Configuration File

Create a production_config.py file to centralize your production settings:

# production_config.py
import os
import parlant.sdk as p

# MongoDB Configuration
MONGODB_SESSIONS_URI = os.environ["MONGODB_SESSIONS_URI"]
MONGODB_CUSTOMERS_URI = os.environ["MONGODB_CUSTOMERS_URI"]

# NLP Provider Configuration
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")

# Server Configuration
SERVER_HOST = os.environ.get("SERVER_HOST", "0.0.0.0")
SERVER_PORT = int(os.environ.get("SERVER_PORT", "8800"))

# Choose your NLP service
NLP_SERVICE = p.NLPServices.openai  # or p.NLPServices.anthropic

def get_mongodb_config():
    """Returns MongoDB configuration for Parlant."""
    return {
        "sessions_uri": MONGODB_SESSIONS_URI,
        "customers_uri": MONGODB_CUSTOMERS_URI,
    }

Update Your Main Application File

Modify your main application to use production configuration:

# main.py
import asyncio
import os

import parlant.sdk as p

from production_config import (
    get_mongodb_config,
    NLP_SERVICE,
    SERVER_HOST,
    SERVER_PORT,
)
from auth import ProductionAuthPolicy  # We'll create this next


async def configure_container(container: p.Container) -> p.Container:
    """Configure production-specific dependencies."""

    # Set up production authorization
    container[p.AuthorizationPolicy] = ProductionAuthPolicy(
        secret_key=os.environ["JWT_SECRET_KEY"],
    )

    return container


async def main():
    """Initialize and run the Parlant server."""

    # MongoDB configuration
    mongodb_config = get_mongodb_config()

    async with p.Server(
        host=SERVER_HOST,
        port=SERVER_PORT,
        nlp_service=NLP_SERVICE,
        configure_container=configure_container,
        **mongodb_config,
    ) as server:

        # Create or retrieve your agent
        agents = await server.list_agents()

        if not agents:
            agent = await server.create_agent(
                name="Production Agent",
                description="Your agent description here",
            )

            # Set up your guidelines, journeys, etc.
            await setup_agent_behavior(agent)

        # Start serving requests
        await server.serve()


async def setup_agent_behavior(agent: p.Agent):
    """Configure your agent's behavior."""
    # Your guidelines, journeys, tools, etc.
    pass


if __name__ == "__main__":
    asyncio.run(main())

Set Up Production Authorization

Create an auth.py file with your production authorization policy:

# auth.py
import parlant.sdk as p


class ProductionAuthPolicy(p.ProductionAuthorizationPolicy):
    """Production authorization with your custom rules."""

    def __init__(self, secret_key: str):
        super().__init__()
        self.secret_key = secret_key
        # Add your custom authorization logic here
Authentication & Rate Limiting

For comprehensive guidance on implementing JWT authentication, rate limiting, M2M tokens, and custom authorization policies, see the API Hardening guide.

Step 2: Containerize Your Application

Create an Optimized Dockerfile

Create a Dockerfile in your project root:

# Use Python 3.10 slim image
FROM python:3.10-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
build-essential \
&& rm -rf /var/lib/apt/lists/*

# Copy requirements first (for better caching)
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose Parlant's default port
EXPOSE 8800

# Health check endpoint (fails on connection errors and non-2xx responses)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD python -c "import requests; requests.get('http://localhost:8800/health', timeout=5).raise_for_status()" || exit 1

# Run the application
CMD ["python", "main.py"]

Create Requirements File

Your requirements.txt should include:

parlant>=3.0.0
pyjwt>=2.8.0
limits>=3.0.0
pymongo>=4.0.0
redis>=5.0.0
requests>=2.31.0  # used by the Dockerfile HEALTHCHECK

Build and Test Locally

Build your Docker image:

docker build -t parlant-agent:latest .

Test it locally with environment variables. Note that inside the container, localhost refers to the container itself, so point the MongoDB URIs at a reachable host (for example host.docker.internal on Docker Desktop, or your Atlas connection string):

docker run -p 8800:8800 \
-e MONGODB_SESSIONS_URI="mongodb://localhost:27017/parlant_sessions" \
-e MONGODB_CUSTOMERS_URI="mongodb://localhost:27017/parlant_customers" \
-e OPENAI_API_KEY="your-key-here" \
-e JWT_SECRET_KEY="your-secret-here" \
parlant-agent:latest

Visit http://localhost:8800 to verify it's working.
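
You can also hit the health endpoint directly; it is the same endpoint the Kubernetes probes will use later:

curl http://localhost:8800/health
# Expected: {"status": "healthy"}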

Optimize Image Size (Optional)

For production, consider a multi-stage build to reduce image size. For more on optimizing Docker builds, see Docker's multi-stage build documentation.

# Stage 1: Builder
FROM python:3.10-slim as builder

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# Stage 2: Runtime
FROM python:3.10-slim

WORKDIR /app

# Copy only the dependencies from builder
COPY --from=builder /root/.local /root/.local
COPY . .

# Make sure scripts in .local are usable
ENV PATH=/root/.local/bin:$PATH

EXPOSE 8800

CMD ["python", "main.py"]

Step 3: Set Up MongoDB

You have two main options for MongoDB in production:

Option A: MongoDB Atlas (Recommended)

MongoDB Atlas is a fully managed service that handles backups, scaling, and maintenance. For a complete setup guide, see the official MongoDB Atlas Getting Started documentation.

Quick setup:

  1. Create a MongoDB Atlas account at https://www.mongodb.com/cloud/atlas
  2. Create a cluster (free tier works for development, paid tier for production)
  3. Set up database access with a user that has read/write permissions
  4. Configure network access for your Kubernetes cluster's IP range
  5. Get your connection string:
    mongodb+srv://username:password@cluster0.xxxxx.mongodb.net/?retryWrites=true&w=majority

You'll need two connection strings (or one string with different database names):

# Option 1: Two separate databases
MONGODB_SESSIONS_URI=mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant_sessions
MONGODB_CUSTOMERS_URI=mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant_customers

# Option 2: Same database, Parlant will create collections
MONGODB_SESSIONS_URI=mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant
MONGODB_CUSTOMERS_URI=mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant
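
Before wiring these into Kubernetes, it is worth verifying the connection strings from your machine. Here is a minimal check using pymongo (already in requirements.txt), assuming the two environment variables above are exported; the script name is just an example:

# check_mongo.py (illustrative helper, not part of Parlant)
import os
import pymongo

for var in ("MONGODB_SESSIONS_URI", "MONGODB_CUSTOMERS_URI"):
    client = pymongo.MongoClient(os.environ[var], serverSelectionTimeoutMS=5000)
    client.admin.command("ping")  # raises if the cluster is unreachable or credentials are wrong
    print(f"{var}: connection OK")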

Option B: Self-Hosted MongoDB on Kubernetes

For advanced users who need full control. For detailed guidance, see the official Kubernetes documentation on running MongoDB with StatefulSets.

# mongodb-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: mongodb
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    app: mongodb
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: "mongodb"
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:7.0
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongodb-data
              mountPath: /data/db
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: username
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: password
  volumeClaimTemplates:
    - metadata:
        name: mongodb-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
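
The StatefulSet above reads its root credentials from a Secret named mongodb-secret, which you need to create first, for example:

kubectl create secret generic mongodb-secret \
  --from-literal=username=admin \
  --from-literal=password='use-a-strong-password-here'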
Production MongoDB

For production workloads, consider using a managed MongoDB service or implementing a proper replica set with backups, monitoring, and disaster recovery.

Step 4: Deploy to Kubernetes

Now we'll deploy Parlant to a Kubernetes cluster. We'll show examples for both AWS EKS and Azure AKS.

Create Kubernetes Secrets

First, create a Kubernetes Secret with your sensitive configuration:

# parlant-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: parlant-secret
  namespace: default
type: Opaque
stringData:
  mongodb-sessions-uri: "mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant_sessions"
  mongodb-customers-uri: "mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant_customers"
  openai-api-key: "sk-..."
  jwt-secret-key: "your-secure-random-string"

Apply it:

kubectl apply -f parlant-secret.yaml
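
If you would rather not keep credentials in a YAML file at all, you can create the same Secret directly from the command line:

kubectl create secret generic parlant-secret \
  --from-literal=mongodb-sessions-uri="mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant_sessions" \
  --from-literal=mongodb-customers-uri="mongodb+srv://user:pass@cluster0.xxxxx.mongodb.net/parlant_customers" \
  --from-literal=openai-api-key="sk-..." \
  --from-literal=jwt-secret-key="your-secure-random-string"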
Secret Management

For production, use a proper secrets management solution like AWS Secrets Manager, Azure Key Vault, or HashiCorp Vault instead of storing secrets directly in YAML files. See Kubernetes Secrets best practices for comprehensive guidance on secret management.

Create ConfigMap for Non-Sensitive Config

# parlant-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: parlant-config
  namespace: default
data:
  SERVER_HOST: "0.0.0.0"
  SERVER_PORT: "8800"

Apply it:

kubectl apply -f parlant-configmap.yaml

Create Deployment

# parlant-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: parlant
  namespace: default
  labels:
    app: parlant
spec:
  replicas: 3
  selector:
    matchLabels:
      app: parlant
  template:
    metadata:
      labels:
        app: parlant
    spec:
      containers:
        - name: parlant
          image: your-registry/parlant-agent:latest
          ports:
            - containerPort: 8800
              name: http
          env:
            - name: MONGODB_SESSIONS_URI
              valueFrom:
                secretKeyRef:
                  name: parlant-secret
                  key: mongodb-sessions-uri
            - name: MONGODB_CUSTOMERS_URI
              valueFrom:
                secretKeyRef:
                  name: parlant-secret
                  key: mongodb-customers-uri
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: parlant-secret
                  key: openai-api-key
            - name: JWT_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: parlant-secret
                  key: jwt-secret-key
          envFrom:
            - configMapRef:
                name: parlant-config
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8800
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3

Apply it:

kubectl apply -f parlant-deployment.yaml

Create Service

# parlant-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: parlant-service
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: parlant
  ports:
    - port: 8800
      targetPort: 8800
      protocol: TCP
      name: http

Apply it:

kubectl apply -f parlant-service.yaml
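
Before adding a load balancer, you can smoke-test the service from your workstation with a port-forward:

kubectl port-forward service/parlant-service 8800:8800

# In another terminal:
curl http://localhost:8800/health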

Set Up Ingress for Load Balancing

The ingress configuration differs slightly between AWS and Azure.

For AWS EKS, use the AWS Load Balancer Controller with an Application Load Balancer:

First, install the AWS Load Balancer Controller:

# Add the EKS chart repo
helm repo add eks https://aws.github.io/eks-charts
helm repo update

# Install the controller
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=your-cluster-name \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller

Create the Ingress:

# parlant-ingress-aws.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: parlant-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    # If you have an ACM certificate:
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:region:account:certificate/xxxxx
spec:
  rules:
    - host: your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: parlant-service
                port:
                  number: 8800

Apply it:

kubectl apply -f parlant-ingress-aws.yaml

Your ALB will be created automatically. Get the address:

kubectl get ingress parlant-ingress
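
For Azure AKS, a common approach is to install the community NGINX ingress controller with Helm and route traffic through it. The following is a sketch, assuming you provision the TLS certificate yourself (for example with cert-manager) and store it in a Secret named parlant-tls:

# Install the NGINX ingress controller
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace

# parlant-ingress-aks.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: parlant-ingress
  namespace: default
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - your-domain.com
      secretName: parlant-tls
  rules:
    - host: your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: parlant-service
                port:
                  number: 8800

Apply it with kubectl apply -f parlant-ingress-aks.yaml, then point your DNS record at the controller's public IP (kubectl get service -n ingress-nginx). Azure's Application Gateway Ingress Controller or the AKS application routing add-on work as well; the Ingress spec stays largely the same, with a different ingress class and annotations.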

Verify Deployment

Check that everything is running:

# Check pods
kubectl get pods -l app=parlant

# Check service
kubectl get service parlant-service

# Check ingress
kubectl get ingress parlant-ingress

# View logs
kubectl logs -l app=parlant --tail=100

# Follow logs in real-time
kubectl logs -l app=parlant -f

You should see all three pods in the Running state, each reporting READY 1/1.

Step 5: Configure Your Frontend

Once your Parlant backend is deployed, connect your frontend to it.

For React applications, use the official parlant-chat-react widget pointing to your production URL. For custom integrations or other frameworks, see the Custom Frontend guide for detailed instructions on:

  • Event-driven conversation API
  • Session management
  • Message handling
  • Real-time updates with long polling
  • CORS configuration

Step 6: Production Hardening

Implement Authorization Policy

Set up a production-ready authorization policy with JWT authentication and rate limiting. See the API Hardening guide for:

  • Custom authorization policy implementation
  • JWT token validation
  • Rate limiting configuration
  • M2M token support

Set Up Input Moderation

Implement input moderation to prevent abuse. See the Input Moderation guide for details on content filtering and safety checks.

Configure Human Handoff

For scenarios where the AI agent needs to escalate to a human agent, see the Human Handoff guide.

Network Policies

Create network policies to restrict traffic:

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: parlant-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: parlant
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        # Adjust this selector to match where your ingress traffic actually
        # originates (e.g., your ingress controller's namespace, or an ipBlock
        # covering your load balancer).
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8800
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 443 # HTTPS for external APIs
        - protocol: TCP
          port: 27017 # MongoDB
    # Allow DNS resolution (required once egress is restricted)
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

Step 7: Scaling and Performance

Horizontal Pod Autoscaling

Configure autoscaling based on CPU/memory usage:

# parlant-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: parlant-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: parlant
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

Apply it:

kubectl apply -f parlant-hpa.yaml
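
The HPA relies on the Kubernetes metrics-server for CPU and memory readings (AKS ships with it by default; on EKS you typically install it yourself). Once metrics are available, check what the autoscaler sees:

kubectl get hpa parlant-hpa
kubectl describe hpa parlant-hpa   # current vs. target utilization, plus scaling events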

Resource Management

The deployment above includes resource requests and limits. Adjust these based on your workload:

resources:
  requests:
    memory: "512Mi"   # Minimum guaranteed
    cpu: "250m"       # 0.25 CPU cores
  limits:
    memory: "2Gi"     # Maximum allowed
    cpu: "1000m"      # 1 CPU core max

MongoDB Performance

For MongoDB Atlas:

  • Use an appropriate tier (M10+ for production)
  • Enable connection pooling (handled automatically by Parlant)
  • Set up monitoring and alerts

For self-hosted MongoDB:

  • Use a replica set for high availability
  • Configure appropriate WiredTiger cache size
  • Set up regular backups
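
For self-hosted MongoDB, a simple starting point for backups is a CronJob that runs mongodump against the mongodb service. This is a sketch; for real backups, replace the emptyDir with a PersistentVolumeClaim or ship the archive to object storage (S3, Azure Blob):

# mongodb-backup-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongodb-backup
spec:
  schedule: "0 3 * * *"   # daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: mongodump
              image: mongo:7.0
              command:
                - /bin/sh
                - -c
                - >
                  mongodump --uri="mongodb://${MONGO_USER}:${MONGO_PASS}@mongodb:27017"
                  --gzip --archive=/backup/parlant-backup.gz
              env:
                - name: MONGO_USER
                  valueFrom:
                    secretKeyRef:
                      name: mongodb-secret
                      key: username
                - name: MONGO_PASS
                  valueFrom:
                    secretKeyRef:
                      name: mongodb-secret
                      key: password
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              emptyDir: {}   # replace with a PVC for durable backups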

Caching Strategies

Parlant's guideline matching engine includes built-in caching. Beyond that, any state that must be shared across replicas should live in an external store rather than in pod memory. For example, backing the rate limiter with Redis ensures limits are enforced consistently across all pods:

# In your configure_container function
from limits.storage import RedisStorage

container[p.RateLimiter] = p.BasicRateLimiter(
    storage=RedisStorage("redis://redis-host:6379"),
    # ... other configuration
)
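
The redis-host referenced above must be reachable from every pod. A managed Redis (Amazon ElastiCache, Azure Cache for Redis) is the usual production choice; for testing, a minimal in-cluster Redis along these lines works (a sketch, not hardened for production):

# redis.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7
          ports:
            - containerPort: 6379
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379

With this applied, the storage URI becomes redis://redis:6379 from inside the cluster. If you applied the NetworkPolicy from Step 6, also allow egress on port 6379.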

Step 8: Monitoring and Observability

Health Checks

Parlant exposes a /health endpoint. Monitor it:

curl https://your-domain.com/health

Expected response:

{"status": "healthy"}

Logging

View application logs:

# All pods
kubectl logs -l app=parlant --tail=100

# Specific pod
kubectl logs parlant-xxxxxxxxx-xxxxx --tail=100

# Stream logs
kubectl logs -l app=parlant -f

Metrics Collection

Install Prometheus and Grafana for metrics:

# Add Prometheus helm repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install Prometheus + Grafana
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace

Set up alerts for:

  • Pod restarts and crashes
  • High memory/CPU usage
  • Response time degradation
  • MongoDB connection failures
  • LLM API rate limits or errors
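
As a concrete starting point, alert rules can be declared as PrometheusRule resources that the kube-prometheus-stack discovers automatically. A minimal sketch for the first item on the list (pod restarts), assuming the Helm release is named prometheus as above:

# parlant-alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: parlant-alerts
  namespace: monitoring
  labels:
    release: prometheus   # must match the kube-prometheus-stack release name
spec:
  groups:
    - name: parlant
      rules:
        - alert: ParlantPodRestarting
          expr: increase(kube_pod_container_status_restarts_total{pod=~"parlant-.*"}[15m]) > 2
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Parlant pod {{ $labels.pod }} is restarting frequently"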

Step 9: CI/CD Integration

A typical CI/CD pipeline for Parlant deployment looks like this:

GitHub Actions Example

Create .github/workflows/deploy.yml:

name: Deploy to Production

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and push Docker image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          ECR_REPOSITORY: parlant-agent
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          docker tag $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG $ECR_REGISTRY/$ECR_REPOSITORY:latest
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest

      - name: Update kubeconfig
        run: aws eks update-kubeconfig --name your-cluster-name --region us-east-1

      - name: Deploy to Kubernetes
        env:
          # Re-declare these here: step-level env from the build step is not visible in later steps
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          ECR_REPOSITORY: parlant-agent
          IMAGE_TAG: ${{ github.sha }}
        run: |
          kubectl set image deployment/parlant parlant=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          kubectl rollout status deployment/parlant

GitLab CI Example

Create .gitlab-ci.yml:

stages:
  - build
  - deploy

variables:
  DOCKER_DRIVER: overlay2
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA

build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $IMAGE_TAG .
    - docker push $IMAGE_TAG
    - docker tag $IMAGE_TAG $CI_REGISTRY_IMAGE:latest
    - docker push $CI_REGISTRY_IMAGE:latest
  only:
    - main

deploy:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl config use-context your-cluster-context
    - kubectl set image deployment/parlant parlant=$IMAGE_TAG
    - kubectl rollout status deployment/parlant
  only:
    - main

Rolling Updates

Kubernetes handles rolling updates automatically. Configure the strategy:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1

This ensures zero-downtime deployments.

Step 10: Troubleshooting


Common Issues

Pods Not Starting

Check pod status and events:

kubectl describe pod parlant-xxxxxxxxx-xxxxx

Common causes:

  • Image pull errors (check registry credentials)
  • Resource limits (insufficient memory/CPU)
  • Environment variable issues

MongoDB Connection Errors

Check connection string format and network access:

# Test from within a pod
kubectl exec -it parlant-xxxxxxxxx-xxxxx -- python -c "import pymongo; client = pymongo.MongoClient('your-connection-string'); print(client.server_info())"

Verify:

  • Connection string format is correct
  • MongoDB cluster allows connections from Kubernetes IP range
  • Credentials are valid

Load Balancer Not Working

Check ingress status:

kubectl describe ingress parlant-ingress

Verify:

  • Ingress controller is installed and running
  • SSL certificate is configured correctly
  • DNS is pointing to the load balancer

High Response Latency

Common causes and solutions:

  1. Guideline matching overhead: Review your agent's guidelines and optimize
  2. MongoDB performance: Check indexes and query performance
  3. LLM API latency: Consider using faster models or caching
  4. Insufficient resources: Scale up pods or increase resource limits

Check pod metrics:

kubectl top pods -l app=parlant

Memory Leaks

Monitor memory usage over time:

kubectl top pod parlant-xxxxxxxxx-xxxxx

If memory grows continuously, check:

  • Large session histories (implement cleanup)
  • Caching configuration
  • Connection pool settings

Debug Mode

Enable debug logging by setting an environment variable:

env:
  - name: LOG_LEVEL
    value: "DEBUG"

Getting Help

If you encounter issues not covered here, reach out for deployment support through the Parlant GitHub repository.

Production Checklist


Pre-Launch Checklist

Before going live, verify you have:

Security:

  • Production authorization policy implemented
  • JWT authentication configured
  • Rate limiting enabled
  • Secrets stored securely (not in version control)
  • Network policies configured
  • HTTPS/TLS enabled
  • Input moderation configured

Reliability:

  • MongoDB backups configured
  • Health checks and probes configured
  • Resource limits set appropriately
  • Horizontal pod autoscaling enabled
  • Multiple replicas running

Monitoring:

  • Logging configured and accessible
  • Metrics collection set up
  • Alerts configured for critical issues
  • Health check monitoring enabled

Operations:

  • CI/CD pipeline configured
  • Deployment documentation written
  • Rollback procedure tested
  • Disaster recovery plan documented

Performance:

  • Load testing completed
  • Resource allocation optimized
  • MongoDB indexes configured
  • Caching configured appropriately

Next Steps

Now that your Parlant agent is deployed, consider:

  1. Review Agentic Design Principles: See the Agentic Design Methodology guide for best practices on building effective agents

  2. Implement Monitoring: Set up comprehensive monitoring and alerting for production issues

  3. Plan for Scale: Test your deployment under expected load and adjust resources accordingly

  4. Iterate on Agent Behavior: Use production feedback to refine guidelines, journeys, and tools

  5. Document Your Setup: Maintain documentation for your team on deployment procedures and troubleshooting


Congratulations! You now have a production-ready Parlant deployment. Your AI agent is ready to handle real customer conversations at scale.