Your agent works on your laptop. Great. Now how do you make it run 24/7 without you babysitting it? Deployment is where most AI agent projects die — not because the agent doesn't work, but because nobody figured out how to keep it running reliably.
This guide covers three deployment approaches (VPS, Docker, serverless), with real configs, cost breakdowns, and the monitoring you need to sleep at night while your agent works.
| Approach | Best For | Monthly Cost | Complexity | Always-On? |
|---|---|---|---|---|
| VPS (bare metal) | 24/7 autonomous agents | $5-20 | Medium | Yes |
| Docker + VPS | Reproducible, multi-agent | $10-30 | Medium-High | Yes |
| Serverless (Lambda/Cloud Run) | Event-triggered agents | $1-50 (pay-per-use) | Low-Medium | No (triggered) |
| Managed platforms | No-ops teams | $20-200 | Low | Varies |
A VPS is the simplest path to a 24/7 agent: rent a virtual server, install your agent, set up a process manager, and let it run.
| Provider | Cheapest Plan | Specs | Best For |
|---|---|---|---|
| Hetzner | $4.50/mo | 2 vCPU, 4GB RAM, 40GB SSD | Best value in EU |
| DigitalOcean | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Simple UI, good docs |
| Vultr | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Global locations |
| Contabo | $6.50/mo | 4 vCPU, 8GB RAM, 50GB SSD | Most specs per dollar |
```bash
# SSH into your new server
ssh root@your-server-ip

# Create a non-root user
adduser agent
usermod -aG sudo agent

# Install essentials
apt update && apt install -y python3 python3-pip python3-venv git curl

# Switch to the agent user
su - agent

# Clone your agent code
git clone https://github.com/your-org/your-agent.git
cd your-agent

# Set up the Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Create an environment file for credentials
cat > .env << 'EOF'
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DEEPSEEK_API_KEY=sk-...
EOF
chmod 600 .env
```
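In production, systemd loads `.env` via `EnvironmentFile` (shown below), but when you run the agent by hand the process has to read those variables itself. A minimal loader sketch, assuming simple `KEY=VALUE` lines with no quoting or interpolation; the `python-dotenv` package is the more robust option:

```python
import os

def load_env(path: str = ".env") -> None:
    """Load KEY=VALUE pairs from a .env file into os.environ.
    Skips blank lines and comments; never overwrites variables
    that are already set (so systemd/shell values win)."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Calling `load_env()` at the top of `agent.py` keeps local runs and systemd runs behaving the same.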
Use systemd to keep your agent running, restart on crashes, and start on boot:
```ini
# /etc/systemd/system/ai-agent.service
[Unit]
Description=AI Agent
After=network.target

[Service]
Type=simple
User=agent
WorkingDirectory=/home/agent/your-agent
EnvironmentFile=/home/agent/your-agent/.env
ExecStart=/home/agent/your-agent/.venv/bin/python3 agent.py
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Safety limits
MemoryMax=2G
CPUQuota=80%

[Install]
WantedBy=multi-user.target
```
```bash
# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable ai-agent
sudo systemctl start ai-agent

# Check status
sudo systemctl status ai-agent

# Follow logs
journalctl -u ai-agent -f --no-pager
```
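With `Restart=always`, systemd sends SIGTERM on `systemctl stop` and on every restart cycle. A sketch of catching it so the agent finishes its current task instead of dying mid-write; `run_one_cycle` is a stand-in for your agent's actual unit of work:

```python
import signal

class GracefulShutdown:
    """Flip a flag on SIGTERM (sent by `systemctl stop` / restarts)
    and SIGINT (Ctrl-C) instead of dying immediately."""

    def __init__(self):
        self.stop = False
        signal.signal(signal.SIGTERM, self._handler)
        signal.signal(signal.SIGINT, self._handler)

    def _handler(self, signum, frame):
        self.stop = True

def main_loop(shutdown, run_one_cycle):
    """Run units of work until a stop signal arrives; the current
    cycle always completes before the loop exits."""
    while not shutdown.stop:
        run_one_cycle()

# Usage sketch:
#   shutdown = GracefulShutdown()
#   main_loop(shutdown, run_one_cycle)  # run_one_cycle = your agent's work
```

Pair this with systemd's `TimeoutStopSec=` if a cycle can legitimately take more than the default 90 seconds to wind down.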
For agents that run on a schedule (not continuously):
```bash
# crontab -e

# Newsletter pipeline: Mon/Wed/Fri at 8am UTC
0 8 * * 1,3,5 cd /home/agent/your-agent && .venv/bin/python3 pipeline.py >> logs/pipeline.log 2>&1

# Social media posting: every 6 hours
0 */6 * * * cd /home/agent/your-agent && .venv/bin/python3 post_tweet.py >> logs/twitter.log 2>&1

# Daily monitoring report
30 9 * * * cd /home/agent/your-agent && .venv/bin/python3 monitoring.py >> logs/monitoring.log 2>&1
```
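One failure mode with cron: a slow run is still going when the next one starts, and the two trample each other's state. A lockfile guard sketch, assuming a Linux host (wrapping the crontab command in `flock(1)` achieves the same thing from the shell side); the lock path is illustrative:

```python
import fcntl
import sys

def acquire_run_lock(path="/tmp/agent-pipeline.lock"):
    """Exit quietly if a previous run still holds the lock, so
    overlapping cron invocations never run concurrently. The lock
    is released automatically when the process exits."""
    lock_file = open(path, "w")
    try:
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        sys.exit(0)  # another run is in progress; skip this one
    return lock_file  # keep the reference alive for the whole run
```

Call it first thing in `pipeline.py` and hold the returned handle in a variable until the script ends.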
Docker adds reproducibility and isolation, which is especially useful when you run multiple agents or your agent has complex dependencies.
```dockerfile
# Dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl git && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent code
COPY . .

# Non-root user for security
RUN useradd -m agent
USER agent

CMD ["python3", "agent.py"]
```
```yaml
# docker-compose.yml
version: '3.8'

services:
  agent:
    build: .
    restart: always
    env_file: .env
    volumes:
      - ./data:/app/data   # Persist agent memory/state
      - ./logs:/app/logs   # Persist logs
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
    healthcheck:
      # raise_for_status() makes the check fail on non-2xx responses
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8080/health').raise_for_status()"]
      interval: 60s
      timeout: 10s
      retries: 3

  # Optional: vector database for RAG
  chromadb:
    image: chromadb/chroma:latest
    restart: always
    volumes:
      - chroma_data:/chroma/chroma
    ports:
      - "8000:8000"

volumes:
  chroma_data:
```
```bash
# Deploy
docker compose up -d

# View logs
docker compose logs -f agent

# Update the agent
git pull && docker compose build && docker compose up -d
```
For agents triggered by events (webhook, email, schedule) rather than running continuously. Pay only when the agent runs.
```python
# handler.py
import json

def lambda_handler(event, context):
    """Triggered by EventBridge cron or API Gateway webhook."""
    # Import inside the handler keeps cold starts fast when the
    # agent module is heavy
    from agent import run_agent

    result = run_agent(event)
    return {
        'statusCode': 200,
        'body': json.dumps(result)
    }
```
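Lambda hard-kills the invocation at the timeout, with no chance to checkpoint. The runtime's `context.get_remaining_time_in_millis()` lets the handler check the clock before starting another step; a sketch, with illustrative budget numbers:

```python
def should_start_step(context, step_budget_ms=30_000, safety_ms=5_000):
    """Return True if the invocation has enough time left for one
    more step. context.get_remaining_time_in_millis() is provided
    by the Lambda runtime; the budgets here are illustrative."""
    return context.get_remaining_time_in_millis() > step_budget_ms + safety_ms
```

Inside `run_agent`, check this between steps and return a partial result (or re-queue the remainder) instead of being killed mid-step.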
```yaml
# serverless.yml (Serverless Framework)
service: ai-agent

provider:
  name: aws
  runtime: python3.12
  timeout: 300       # 5 minutes (Lambda allows up to 15)
  memorySize: 512
  environment:
    OPENAI_API_KEY: ${ssm:/ai-agent/openai-key}

functions:
  newsletter:
    handler: handler.lambda_handler
    events:
      - schedule: cron(0 8 ? * MON,WED,FRI *)  # Mon/Wed/Fri 8am UTC
  webhook:
    handler: handler.lambda_handler
    events:
      - httpApi:
          path: /webhook
          method: post
```
```bash
# For longer-running agents (up to 60 min)
gcloud run deploy ai-agent \
  --source . \
  --region us-central1 \
  --memory 1Gi \
  --timeout 3600 \
  --set-env-vars "OPENAI_API_KEY=sk-..." \
  --no-allow-unauthenticated
```
| Platform | Max Runtime | Cold Start | Cost per Run |
|---|---|---|---|
| AWS Lambda | 15 minutes | 1-5 seconds | $0.0001-0.01 |
| Google Cloud Run | 60 minutes | 2-10 seconds | $0.001-0.05 |
| Vercel Functions | 5 minutes (pro: 15) | < 1 second | $0.0001-0.005 |
| Cloudflare Workers | 10 ms CPU (free), 30 s CPU (paid) | < 1ms | $0.00005 |
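The per-run costs above come straight from the pricing formula: memory-seconds times a GB-second rate, plus a per-request fee. A rough estimator sketch using AWS Lambda's x86 list prices at the time of writing (check current pricing before relying on it):

```python
def lambda_cost_per_run(duration_s, memory_mb,
                        gb_second_rate=0.0000166667,  # $ per GB-second (x86)
                        request_rate=0.0000002):       # $0.20 per 1M requests
    """Approximate cost of one Lambda invocation, ignoring the
    free tier and data-transfer charges."""
    gb_seconds = (memory_mb / 1024) * duration_s
    return gb_seconds * gb_second_rate + request_rate

# e.g. a 60 s run at 512 MB: 30 GB-seconds -> roughly $0.0005
```

At those rates, even an agent that runs for a minute every hour costs well under a dollar a month in compute; the LLM API calls will dominate your bill.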
A deployed agent without monitoring is a liability. Here's the minimum monitoring stack:
```python
from flask import Flask, jsonify
import psutil

app = Flask(__name__)

@app.route('/health')
def health():
    # get_uptime, get_last_run_timestamp, get_error_count and
    # check_api_balance are your app's own helpers
    return jsonify({
        "status": "healthy",
        "uptime_hours": get_uptime(),
        "memory_mb": psutil.Process().memory_info().rss / 1024 / 1024,
        "last_run": get_last_run_timestamp(),
        "errors_24h": get_error_count(hours=24),
        "api_balance": check_api_balance()
    })
```
```python
import requests

def send_alert(message, level="warning"):
    """Send an alert via Telegram (critical) or Slack (everything else).
    BOT_TOKEN, OWNER_ID and SLACK_WEBHOOK come from your config."""
    if level == "critical":
        # Telegram for immediate attention
        requests.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            data={"chat_id": OWNER_ID, "text": f"🚨 {message}"}
        )
    else:
        # Slack webhook for non-critical
        requests.post(SLACK_WEBHOOK, json={"text": f"⚠️ {message}"})
```
Alerts to configure:

- Agent crash / restart
- API balance below threshold
- Error rate spike (3+ errors in 10 minutes)
- Agent stuck (no activity for 2+ hours)
- Cost spike (daily spend above 2x average)
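The error-rate-spike rule is easy to implement with a sliding window. A sketch, assuming you call `record_error()` from your `except` blocks and wire a `True` return into `send_alert`:

```python
import time
from collections import deque

class ErrorSpikeDetector:
    """Fire when `threshold` errors occur within `window_s` seconds
    (the '3+ errors in 10 minutes' rule)."""

    def __init__(self, threshold=3, window_s=600):
        self.threshold = threshold
        self.window_s = window_s
        self.errors = deque()

    def record_error(self, now=None):
        """Record one error; return True if the spike threshold is hit."""
        now = time.time() if now is None else now
        self.errors.append(now)
        # Drop errors that have aged out of the window
        while self.errors and now - self.errors[0] > self.window_s:
            self.errors.popleft()
        return len(self.errors) >= self.threshold
```

The same shape works for the stuck-agent alert: record a heartbeat timestamp each cycle and alert when `time.time() - last_heartbeat` exceeds two hours.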
```python
import logging
import os
from logging.handlers import RotatingFileHandler

os.makedirs('logs', exist_ok=True)

# Rotating file logs
handler = RotatingFileHandler(
    'logs/agent.log',
    maxBytes=10_000_000,  # 10MB per file
    backupCount=5         # Keep 5 rotated files
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))

logger = logging.getLogger('agent')
logger.setLevel(logging.INFO)  # default is WARNING; without this, INFO is dropped
logger.addHandler(handler)

# Log every significant action
logger.info("Scraping 12 RSS feeds")
logger.info("Scored 97 articles, top score: 28")
logger.warning("API rate limited, retrying in 30s")
logger.error("Beehiiv publish failed: 401 Unauthorized")
```
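If you ship logs anywhere that parses them (CloudWatch Logs Insights, Loki, or plain `jq`), one JSON object per line beats free-form text. A sketch of a structured formatter that drops into the handler setup above:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so logs are machine-parseable."""

    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        })

# Usage: handler.setFormatter(JsonFormatter())
```

Searching "all ERRORs mentioning Beehiiv in the last day" then becomes a one-line `jq` filter instead of a fragile grep.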
| Agent Type | Best Deployment | Why |
|---|---|---|
| 24/7 autonomous agent | VPS + systemd | Always-on, persistent state |
| Scheduled pipeline | VPS + cron or serverless | Runs on schedule, sleeps between |
| Webhook-triggered | Serverless (Lambda/Cloud Run) | Pay-per-use, auto-scales |
| Multi-agent system | Docker Compose on VPS | Isolated containers, shared network |
| Customer-facing chatbot | Cloud Run or managed platform | Auto-scale with traffic |
| Development/testing | Local Docker | Reproducible environment |
Our AI Agent Playbook includes Dockerfiles, systemd configs, monitoring templates, and deployment checklists for production agents.
Get the Playbook — $29

Deployment patterns, infrastructure tips, and production war stories. 3x/week, no spam. Subscribe to AI Agents Weekly.