March 26, 2026 · 13 min read

How to Deploy an AI Agent to Production: VPS, Docker & Serverless (2026)

Your agent works on your laptop. Great. Now how do you make it run 24/7 without you babysitting it? Deployment is where most AI agent projects die — not because the agent doesn't work, but because nobody figured out how to keep it running reliably.

This guide covers three deployment approaches (VPS, Docker, serverless), with real configs, cost breakdowns, and the monitoring you need to sleep at night while your agent works.

Choosing Your Deployment Model

| Approach | Best For | Monthly Cost | Complexity | Always-On? |
|---|---|---|---|---|
| Plain VPS | 24/7 autonomous agents | $5-20 | Medium | Yes |
| Docker + VPS | Reproducible, multi-agent | $10-30 | Medium-High | Yes |
| Serverless (Lambda/Cloud Run) | Event-triggered agents | $1-50 (pay-per-use) | Low-Medium | No (triggered) |
| Managed platforms | No-ops teams | $20-200 | Low | Varies |

Option 1: VPS Deployment (What We Use)

The simplest path to a 24/7 agent. Rent a virtual server, install your agent, set up a process manager, and let it run.

Step 1: Choose a VPS Provider

| Provider | Cheapest Plan | Specs | Best For |
|---|---|---|---|
| Hetzner | $4.50/mo | 2 vCPU, 4GB RAM, 40GB SSD | Best value in the EU |
| DigitalOcean | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Simple UI, good docs |
| Vultr | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Global locations |
| Contabo | $6.50/mo | 4 vCPU, 8GB RAM, 50GB SSD | Most specs per dollar |

What Paxrel uses: a Hetzner CX22 ($5.50/mo) with 2 vCPU, 4GB RAM. It runs our full agent stack — newsletter pipeline, social media automation, web scraping, and Reddit karma builder — all on one server.

Step 2: Initial Server Setup

# SSH into your new server
ssh root@your-server-ip

# Create a non-root user
adduser agent
usermod -aG sudo agent

# Install essentials
apt update && apt install -y python3 python3-pip python3-venv git curl

# Switch to agent user
su - agent

# Clone your agent code
git clone https://github.com/your-org/your-agent.git
cd your-agent

# Set up Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Create environment file for credentials
cat > .env << 'EOF'
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DEEPSEEK_API_KEY=sk-...
EOF
chmod 600 .env
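systemd will load this `.env` file for you via `EnvironmentFile=` (next step), but for manual runs and cron jobs your agent has to read it itself. A minimal stdlib-only loader sketch — `load_env` is a hypothetical helper, and it deliberately skips quoting and escaping, which is enough for simple `KEY=value` files like the one above:

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Minimal .env loader (stdlib only, no quoting support).
    Existing environment variables win over file values, so this is
    safe to call even when systemd already injected the file."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

If you'd rather not maintain this, the `python-dotenv` package does the same job with proper quoting support.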

Step 3: Process Manager (systemd)

Use systemd to keep your agent running, restart on crashes, and start on boot:

# /etc/systemd/system/ai-agent.service
[Unit]
Description=AI Agent
After=network.target

[Service]
Type=simple
User=agent
WorkingDirectory=/home/agent/your-agent
EnvironmentFile=/home/agent/your-agent/.env
ExecStart=/home/agent/your-agent/.venv/bin/python3 agent.py
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Safety limits
MemoryMax=2G
CPUQuota=80%

[Install]
WantedBy=multi-user.target

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable ai-agent
sudo systemctl start ai-agent

# Check status
sudo systemctl status ai-agent

# View logs
journalctl -u ai-agent -f --no-pager

Step 4: Cron Scheduling

For agents that run on a schedule (not continuously):

# crontab -e
# Newsletter pipeline: Mon/Wed/Fri at 8am UTC
0 8 * * 1,3,5 cd /home/agent/your-agent && .venv/bin/python3 pipeline.py >> logs/pipeline.log 2>&1

# Social media posting: Every 6 hours
0 */6 * * * cd /home/agent/your-agent && .venv/bin/python3 post_tweet.py >> logs/twitter.log 2>&1

# Daily monitoring report
30 9 * * * cd /home/agent/your-agent && .venv/bin/python3 monitoring.py >> logs/monitoring.log 2>&1
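One gotcha with cron: if a run overruns its slot (a slow LLM call, a hung scrape), cron happily starts a second copy on top of it. Wrapping the job in `flock` prevents overlapping runs — `/tmp/pipeline.lock` here is an arbitrary lockfile path, pick your own:

```shell
# crontab -e
# flock holds an exclusive lock on the lockfile for the command's lifetime;
# -n makes a second invocation exit immediately instead of queueing
0 8 * * 1,3,5 flock -n /tmp/pipeline.lock -c 'cd /home/agent/your-agent && .venv/bin/python3 pipeline.py >> logs/pipeline.log 2>&1'
```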

Option 2: Docker Deployment

Docker adds reproducibility and isolation. Especially useful when running multiple agents or when your agent has complex dependencies.

# Dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl git && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent code
COPY . .

# Non-root user for security
RUN useradd -m agent
USER agent

CMD ["python3", "agent.py"]

# docker-compose.yml (the top-level "version:" key is obsolete in Compose v2, so it's omitted)

services:
  agent:
    build: .
    restart: always
    env_file: .env
    volumes:
      - ./data:/app/data      # Persist agent memory/state
      - ./logs:/app/logs       # Persist logs
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
    healthcheck:
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8080/health', timeout=5).raise_for_status()"]
      interval: 60s
      timeout: 10s
      retries: 3

  # Optional: vector database for RAG
  chromadb:
    image: chromadb/chroma:latest
    restart: always
    volumes:
      - chroma_data:/chroma/chroma
    ports:
      - "8000:8000"

volumes:
  chroma_data:

# Deploy
docker compose up -d

# View logs
docker compose logs -f agent

# Update agent
git pull && docker compose build && docker compose up -d

Option 3: Serverless Deployment

For agents triggered by events (webhook, email, schedule) rather than running continuously. Pay only when the agent runs.

AWS Lambda + EventBridge

# handler.py
import json
import boto3

def lambda_handler(event, context):
    """Triggered by EventBridge cron or API Gateway webhook"""

    # Your agent logic here
    from agent import run_agent
    result = run_agent(event)

    return {
        'statusCode': 200,
        'body': json.dumps(result)
    }

# serverless.yml (Serverless Framework)
service: ai-agent

provider:
  name: aws
  runtime: python3.12
  timeout: 300  # 5 minutes max
  memorySize: 512
  environment:
    OPENAI_API_KEY: ${ssm:/ai-agent/openai-key}

functions:
  newsletter:
    handler: handler.lambda_handler
    events:
      - schedule: cron(0 8 ? * MON,WED,FRI *)  # Mon/Wed/Fri 8am
  webhook:
    handler: handler.lambda_handler
    events:
      - httpApi:
          path: /webhook
          method: post

Google Cloud Run

# For longer-running agents (up to 60 min)
gcloud run deploy ai-agent \
  --source . \
  --region us-central1 \
  --memory 1Gi \
  --timeout 3600 \
  --set-env-vars "OPENAI_API_KEY=sk-..." \
  --no-allow-unauthenticated
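Note that Cloud Run only sends work to your container over HTTP, so the agent must listen on the port in the `PORT` environment variable (default 8080). A stdlib-only sketch of that entry point — `run_agent` is a placeholder for your agent's real logic:

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(payload):
    """Placeholder: your agent's actual work goes here."""
    return {"status": "done"}

class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Cloud Run invokes the container with an HTTP request
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_agent(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve():
    port = int(os.environ.get("PORT", 8080))  # Cloud Run sets PORT
    HTTPServer(("", port), AgentHandler).serve_forever()

# entry point: serve()
```

In practice most people use Flask or FastAPI here; the point is the same — no listener on `$PORT`, no deploy.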

| Platform | Max Runtime | Cold Start | Cost per Run |
|---|---|---|---|
| AWS Lambda | 15 minutes | 1-5 seconds | $0.0001-0.01 |
| Google Cloud Run | 60 minutes | 2-10 seconds | $0.001-0.05 |
| Vercel Functions | 5 minutes (Pro: 15) | < 1 second | $0.0001-0.005 |
| Cloudflare Workers | 30 seconds (free tier) | < 1 ms | $0.00005 |

Monitoring Your Deployed Agent

A deployed agent without monitoring is a liability. Here's the minimum monitoring stack:

Health Check Endpoint

from flask import Flask, jsonify
import psutil

app = Flask(__name__)

@app.route('/health')
def health():
    # get_uptime, get_last_run_timestamp, get_error_count and
    # check_api_balance are your own helpers, backed by whatever
    # state your agent keeps
    return jsonify({
        "status": "healthy",
        "uptime_hours": get_uptime(),
        "memory_mb": psutil.Process().memory_info().rss / 1024 / 1024,
        "last_run": get_last_run_timestamp(),
        "errors_24h": get_error_count(hours=24),
        "api_balance": check_api_balance()
    })
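Those helpers are app-specific, but as a rough sketch of one way to back them: the agent writes a small state file (`data/state.json` here is an assumed path) as it works, and the health endpoint reads it. `check_api_balance` depends on your LLM provider's billing API, so it's left out:

```python
import json
import time
from pathlib import Path

START_TIME = time.time()
STATE_FILE = Path("data/state.json")  # updated by the agent as it runs

def get_uptime() -> float:
    """Hours since this process started."""
    return round((time.time() - START_TIME) / 3600, 2)

def _state() -> dict:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {}

def get_last_run_timestamp():
    # the agent writes {"last_run": "...", "error_times": [...]} after each cycle
    return _state().get("last_run")

def get_error_count(hours: int = 24) -> int:
    cutoff = time.time() - hours * 3600
    return sum(1 for t in _state().get("error_times", []) if t >= cutoff)
```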

Alert System

import requests

# Fill these in from your environment/config:
BOT_TOKEN = "..."      # Telegram bot token
OWNER_ID = "..."       # your Telegram chat ID
SLACK_WEBHOOK = "..."  # Slack incoming-webhook URL

def send_alert(message, level="warning"):
    """Send alert via Telegram (critical) or Slack (everything else)"""
    if level == "critical":
        # Telegram for immediate attention
        requests.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            data={"chat_id": OWNER_ID, "text": f"🚨 {message}"},
            timeout=10
        )
    else:
        # Slack webhook for non-critical
        requests.post(SLACK_WEBHOOK, json={"text": f"⚠️ {message}"}, timeout=10)

# Alerts to configure:
# - Agent crash / restart
# - API balance below threshold
# - Error rate spike (3+ errors in 10 min)
# - Agent stuck (no activity for 2+ hours)
# - Cost spike (daily spend > 2x average)

Log Management

import logging
from logging.handlers import RotatingFileHandler

# Rotating file logs so the disk never fills up
handler = RotatingFileHandler(
    'logs/agent.log',
    maxBytes=10_000_000,  # 10MB per file
    backupCount=5         # keep 5 rotated files
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))

logger = logging.getLogger('agent')
logger.setLevel(logging.INFO)  # without this, INFO logs are silently dropped (default is WARNING)
logger.addHandler(handler)

# Log every significant action
logger.info("Scraping 12 RSS feeds")
logger.info("Scored 97 articles, top score: 28")
logger.warning("API rate limited, retrying in 30s")
logger.error("Beehiiv publish failed: 401 Unauthorized")
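These logs can feed the `errors_24h` health metric directly. A sketch that counts ERROR lines across the rotated files, assuming the timestamp format produced by the Formatter above:

```python
import re
from datetime import datetime, timedelta
from pathlib import Path

# matches e.g. "2026-03-26 08:00:00,123 [ERROR] agent: ..."
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ \[(\w+)\]")

def count_errors(log_dir: str = "logs", hours: int = 24) -> int:
    cutoff = datetime.now() - timedelta(hours=hours)
    count = 0
    for path in Path(log_dir).glob("agent.log*"):  # includes rotated files
        for line in path.read_text().splitlines():
            m = LOG_LINE.match(line)
            if m and m.group(2) == "ERROR":
                if datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") >= cutoff:
                    count += 1
    return count
```

For anything beyond a single server, ship the journal or log files to a hosted aggregator instead of grepping them yourself.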

Production Hardening Checklist

Security

- Run the agent as a non-root user, both on the VPS and inside containers
- Keep credentials in .env with chmod 600; never commit keys to the repo
- Use key-based SSH auth and disable password login
- For serverless, pull secrets from a secrets manager (like the SSM reference above) instead of plain config

Reliability

- Restart=always (systemd) or restart: always (Docker) so crashes self-heal
- Resource limits (MemoryMax/CPUQuota, or Docker memory/cpus) so a runaway agent can't take the server down
- A /health endpoint that something actually polls
- Rotated logs so the disk never fills

Cost Control

- Spending limits or alerts on every LLM provider account
- A daily-spend alert that fires when costs exceed 2x your average
- Timeouts on serverless functions so a stuck run can't bill indefinitely

Deployment Patterns by Use Case

| Agent Type | Best Deployment | Why |
|---|---|---|
| 24/7 autonomous agent | VPS + systemd | Always-on, persistent state |
| Scheduled pipeline | VPS + cron or serverless | Runs on schedule, sleeps between |
| Webhook-triggered | Serverless (Lambda/Cloud Run) | Pay-per-use, auto-scales |
| Multi-agent system | Docker Compose on VPS | Isolated containers, shared network |
| Customer-facing chatbot | Cloud Run or managed platform | Auto-scales with traffic |
| Development/testing | Local Docker | Reproducible environment |

Key Takeaways

- A $5/month VPS with systemd covers most 24/7 agents; it's the simplest reliable setup and the one we use.
- Match the deployment to the trigger: systemd for always-on processes, cron for schedules, serverless for event-driven agents.
- Docker earns its extra complexity once you run multiple agents or have heavy dependencies.
- Never ship without monitoring: a health endpoint, crash and cost alerts, and rotated logs are the minimum.

Deploy With Confidence

Our AI Agent Playbook includes Dockerfiles, systemd configs, monitoring templates, and deployment checklists for production agents.

Get the Playbook — $29

Stay Updated on AI Agents

Deployment patterns, infrastructure tips, and production war stories. 3x/week, no spam.

Subscribe to AI Agents Weekly