March 26, 2026 · 13 min read

How to Deploy an AI Agent to Production: VPS, Docker & Serverless (2026)

Your agent works on your laptop. Great. Now how do you make it run 24/7 without you babysitting it? Deployment is where most AI agent projects die — not because the agent doesn't work, but because nobody figured out how to keep it running reliably.

This guide covers three deployment approaches (VPS, Docker, serverless), with real configs, cost breakdowns, and the monitoring you need to sleep at night while your agent works.

Choosing Your Deployment Model

| Approach | Best For | Monthly Cost | Complexity | Always-On? |
|---|---|---|---|---|
| Plain VPS | 24/7 autonomous agents | $5-20 | Medium | Yes |
| Docker + VPS | Reproducible, multi-agent | $10-30 | Medium-High | Yes |
| Serverless (Lambda/Cloud Run) | Event-triggered agents | $1-50 (pay-per-use) | Low-Medium | No (triggered) |
| Managed platforms | No-ops teams | $20-200 | Low | Varies |

Option 1: VPS Deployment (What We Use)

The simplest path to a 24/7 agent. Rent a virtual server, install your agent, set up a process manager, and let it run.

Step 1: Choose a VPS Provider

| Provider | Cheapest Plan | Specs | Best For |
|---|---|---|---|
| Hetzner | $4.50/mo | 2 vCPU, 4GB RAM, 40GB SSD | Best value in the EU |
| DigitalOcean | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Simple UI, good docs |
| Vultr | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Global locations |
| Contabo | $6.50/mo | 4 vCPU, 8GB RAM, 50GB SSD | Most specs per dollar |

What Paxrel uses: a Hetzner CX22 ($5.50/mo) with 2 vCPU, 4GB RAM. It runs our full agent stack — newsletter pipeline, social media automation, web scraping, and Reddit karma builder — all on one server.

Step 2: Initial Server Setup

# SSH into your new server
ssh root@your-server-ip

# Create a non-root user
adduser agent
usermod -aG sudo agent

# Install essentials
apt update && apt install -y python3 python3-pip python3-venv git curl

# Switch to agent user
su - agent

# Clone your agent code
git clone https://github.com/your-org/your-agent.git
cd your-agent

# Set up Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Create environment file for credentials
cat > .env << 'EOF'
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DEEPSEEK_API_KEY=sk-...
EOF
chmod 600 .env
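systemd will load this `.env` file for you via `EnvironmentFile=` (next step), but for manual runs and cron jobs your agent has to read it itself. A minimal stdlib-only loader sketch — `load_env` is a hypothetical helper, and it deliberately skips quoting and escaping, which is enough for simple `KEY=value` files like the one above:

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Minimal .env loader (stdlib only, no quoting support).
    Existing environment variables win over file values, so this is
    safe to call even when systemd already injected the file."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

If you'd rather not maintain this, the `python-dotenv` package does the same job with proper quoting support.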

Step 3: Process Manager (systemd)

Use systemd to keep your agent running, restart on crashes, and start on boot:

# /etc/systemd/system/ai-agent.service
[Unit]
Description=AI Agent
After=network.target

[Service]
Type=simple
User=agent
WorkingDirectory=/home/agent/your-agent
EnvironmentFile=/home/agent/your-agent/.env
ExecStart=/home/agent/your-agent/.venv/bin/python3 agent.py
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Safety limits
MemoryMax=2G
CPUQuota=80%

[Install]
WantedBy=multi-user.target

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable ai-agent
sudo systemctl start ai-agent

# Check status
sudo systemctl status ai-agent

# View logs
journalctl -u ai-agent -f --no-pager

Step 4: Cron Scheduling

For agents that run on a schedule (not continuously):

# crontab -e
# Newsletter pipeline: Mon/Wed/Fri at 8am UTC
0 8 * * 1,3,5 cd /home/agent/your-agent && .venv/bin/python3 pipeline.py >> logs/pipeline.log 2>&1

# Social media posting: Every 6 hours
0 */6 * * * cd /home/agent/your-agent && .venv/bin/python3 post_tweet.py >> logs/twitter.log 2>&1

# Daily monitoring report
30 9 * * * cd /home/agent/your-agent && .venv/bin/python3 monitoring.py >> logs/monitoring.log 2>&1
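One gotcha with cron: if a run overruns its slot (a slow LLM call, a hung scrape), cron happily starts a second copy on top of it. Wrapping the job in `flock` prevents overlapping runs — `/tmp/pipeline.lock` here is an arbitrary lockfile path, pick your own:

```shell
# crontab -e
# flock holds an exclusive lock on the lockfile for the command's lifetime;
# -n makes a second invocation exit immediately instead of queueing
0 8 * * 1,3,5 flock -n /tmp/pipeline.lock -c 'cd /home/agent/your-agent && .venv/bin/python3 pipeline.py >> logs/pipeline.log 2>&1'
```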

Option 2: Docker Deployment

Docker adds reproducibility and isolation. Especially useful when running multiple agents or when your agent has complex dependencies.

# Dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl git && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent code
COPY . .

# Non-root user for security
RUN useradd -m agent
USER agent

CMD ["python3", "agent.py"]

# docker-compose.yml (the top-level "version:" key is obsolete in Compose v2, so it's omitted)

services:
  agent:
    build: .
    restart: always
    env_file: .env
    volumes:
      - ./data:/app/data      # Persist agent memory/state
      - ./logs:/app/logs       # Persist logs
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
    healthcheck:
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8080/health', timeout=5).raise_for_status()"]
      interval: 60s
      timeout: 10s
      retries: 3

  # Optional: vector database for RAG
  chromadb:
    image: chromadb/chroma:latest
    restart: always
    volumes:
      - chroma_data:/chroma/chroma
    ports:
      - "8000:8000"

volumes:
  chroma_data:

# Deploy
docker compose up -d

# View logs
docker compose logs -f agent

# Update agent
git pull && docker compose build && docker compose up -d

Option 3: Serverless Deployment

For agents triggered by events (webhook, email, schedule) rather than running continuously. Pay only when the agent runs.

AWS Lambda + EventBridge

# handler.py
import json
import boto3

def lambda_handler(event, context):
    """Triggered by EventBridge cron or API Gateway webhook"""

    # Your agent logic here
    from agent import run_agent
    result = run_agent(event)

    return {
        'statusCode': 200,
        'body': json.dumps(result)
    }

# serverless.yml (Serverless Framework)
service: ai-agent

provider:
  name: aws
  runtime: python3.12
  timeout: 300  # 5 minutes max
  memorySize: 512
  environment:
    OPENAI_API_KEY: ${ssm:/ai-agent/openai-key}

functions:
  newsletter:
    handler: handler.lambda_handler
    events:
      - schedule: cron(0 8 ? * MON,WED,FRI *)  # Mon/Wed/Fri 8am
  webhook:
    handler: handler.lambda_handler
    events:
      - httpApi:
          path: /webhook
          method: post

Google Cloud Run

# For longer-running agents (up to 60 min)
gcloud run deploy ai-agent \
  --source . \
  --region us-central1 \
  --memory 1Gi \
  --timeout 3600 \
  --set-env-vars "OPENAI_API_KEY=sk-..." \
  --no-allow-unauthenticated
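Note that Cloud Run only sends work to your container over HTTP, so the agent must listen on the port in the `PORT` environment variable (default 8080). A stdlib-only sketch of that entry point — `run_agent` is a placeholder for your agent's real logic:

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(payload):
    """Placeholder: your agent's actual work goes here."""
    return {"status": "done"}

class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Cloud Run invokes the container with an HTTP request
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_agent(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve():
    port = int(os.environ.get("PORT", 8080))  # Cloud Run sets PORT
    HTTPServer(("", port), AgentHandler).serve_forever()

# entry point: serve()
```

In practice most people use Flask or FastAPI here; the point is the same — no listener on `$PORT`, no deploy.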

| Platform | Max Runtime | Cold Start | Cost per Run |
|---|---|---|---|
| AWS Lambda | 15 minutes | 1-5 seconds | $0.0001-0.01 |
| Google Cloud Run | 60 minutes | 2-10 seconds | $0.001-0.05 |
| Vercel Functions | 5 minutes (Pro: 15) | < 1 second | $0.0001-0.005 |
| Cloudflare Workers | 30 seconds (free tier) | < 1 ms | $0.00005 |

Monitoring Your Deployed Agent

A deployed agent without monitoring is a liability. Here's the minimum monitoring stack:

Health Check Endpoint

from flask import Flask, jsonify
import psutil

app = Flask(__name__)

@app.route('/health')
def health():
    # get_uptime, get_last_run_timestamp, get_error_count and
    # check_api_balance are your own helpers, backed by whatever
    # state your agent keeps
    return jsonify({
        "status": "healthy",
        "uptime_hours": get_uptime(),
        "memory_mb": psutil.Process().memory_info().rss / 1024 / 1024,
        "last_run": get_last_run_timestamp(),
        "errors_24h": get_error_count(hours=24),
        "api_balance": check_api_balance()
    })
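Those helpers are app-specific, but as a rough sketch of one way to back them: the agent writes a small state file (`data/state.json` here is an assumed path) as it works, and the health endpoint reads it. `check_api_balance` depends on your LLM provider's billing API, so it's left out:

```python
import json
import time
from pathlib import Path

START_TIME = time.time()
STATE_FILE = Path("data/state.json")  # updated by the agent as it runs

def get_uptime() -> float:
    """Hours since this process started."""
    return round((time.time() - START_TIME) / 3600, 2)

def _state() -> dict:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {}

def get_last_run_timestamp():
    # the agent writes {"last_run": "...", "error_times": [...]} after each cycle
    return _state().get("last_run")

def get_error_count(hours: int = 24) -> int:
    cutoff = time.time() - hours * 3600
    return sum(1 for t in _state().get("error_times", []) if t >= cutoff)
```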

Alert System

import requests

# Fill these in from your environment/config:
BOT_TOKEN = "..."      # Telegram bot token
OWNER_ID = "..."       # your Telegram chat ID
SLACK_WEBHOOK = "..."  # Slack incoming-webhook URL

def send_alert(message, level="warning"):
    """Send alert via Telegram (critical) or Slack (everything else)"""
    if level == "critical":
        # Telegram for immediate attention
        requests.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            data={"chat_id": OWNER_ID, "text": f"🚨 {message}"},
            timeout=10
        )
    else:
        # Slack webhook for non-critical
        requests.post(SLACK_WEBHOOK, json={"text": f"⚠️ {message}"}, timeout=10)

# Alerts to configure:
# - Agent crash / restart
# - API balance below threshold
# - Error rate spike (3+ errors in 10 min)
# - Agent stuck (no activity for 2+ hours)
# - Cost spike (daily spend > 2x average)

Log Management

import logging
from logging.handlers import RotatingFileHandler

# Rotating file logs so the disk never fills up
handler = RotatingFileHandler(
    'logs/agent.log',
    maxBytes=10_000_000,  # 10MB per file
    backupCount=5         # keep 5 rotated files
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))

logger = logging.getLogger('agent')
logger.setLevel(logging.INFO)  # without this, INFO logs are silently dropped (default is WARNING)
logger.addHandler(handler)

# Log every significant action
logger.info("Scraping 12 RSS feeds")
logger.info("Scored 97 articles, top score: 28")
logger.warning("API rate limited, retrying in 30s")
logger.error("Beehiiv publish failed: 401 Unauthorized")
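These logs can feed the `errors_24h` health metric directly. A sketch that counts ERROR lines across the rotated files, assuming the timestamp format produced by the Formatter above:

```python
import re
from datetime import datetime, timedelta
from pathlib import Path

# matches e.g. "2026-03-26 08:00:00,123 [ERROR] agent: ..."
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ \[(\w+)\]")

def count_errors(log_dir: str = "logs", hours: int = 24) -> int:
    cutoff = datetime.now() - timedelta(hours=hours)
    count = 0
    for path in Path(log_dir).glob("agent.log*"):  # includes rotated files
        for line in path.read_text().splitlines():
            m = LOG_LINE.match(line)
            if m and m.group(2) == "ERROR":
                if datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") >= cutoff:
                    count += 1
    return count
```

For anything beyond a single server, ship the journal or log files to a hosted aggregator instead of grepping them yourself.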

Production Hardening Checklist

Security

- Run the agent as a non-root user, both on the VPS and inside containers
- Keep credentials in .env with chmod 600; never commit keys to the repo
- Use key-based SSH auth and disable password login
- For serverless, pull secrets from a secrets manager (like the SSM reference above) instead of plain config

Reliability

- Restart=always (systemd) or restart: always (Docker) so crashes self-heal
- Resource limits (MemoryMax/CPUQuota, or Docker memory/cpus) so a runaway agent can't take the server down
- A /health endpoint that something actually polls
- Rotated logs so the disk never fills

Cost Control

- Spending limits or alerts on every LLM provider account
- A daily-spend alert that fires when costs exceed 2x your average
- Timeouts on serverless functions so a stuck run can't bill indefinitely

Deployment Patterns by Use Case

| Agent Type | Best Deployment | Why |
|---|---|---|
| 24/7 autonomous agent | VPS + systemd | Always-on, persistent state |
| Scheduled pipeline | VPS + cron or serverless | Runs on schedule, sleeps between |
| Webhook-triggered | Serverless (Lambda/Cloud Run) | Pay-per-use, auto-scales |
| Multi-agent system | Docker Compose on VPS | Isolated containers, shared network |
| Customer-facing chatbot | Cloud Run or managed platform | Auto-scales with traffic |
| Development/testing | Local Docker | Reproducible environment |

Key Takeaways

- A $5/month VPS with systemd covers most 24/7 agents; it's the simplest reliable setup and the one we use.
- Match the deployment to the trigger: systemd for always-on processes, cron for schedules, serverless for event-driven agents.
- Docker earns its extra complexity once you run multiple agents or have heavy dependencies.
- Never ship without monitoring: a health endpoint, crash and cost alerts, and rotated logs are the minimum.

Deploy With Confidence

Our AI Agent Playbook includes Dockerfiles, systemd configs, monitoring templates, and deployment checklists for production agents.

Get the Playbook — $29

Stay Updated on AI Agents

Deployment patterns, infrastructure tips, and production war stories. 3x/week, no spam.

Subscribe to AI Agents Weekly