Docker + AI: Containerizing Your Intelligent Services
You have a server running an API, a database, a reverse proxy, and maybe a language model. Everything is installed directly on the OS. One day, you update Python and everything breaks. Your API won't start, your dependencies are in conflict, and you spend 4 hours fixing everything.
With Docker, this scenario is a thing of the past. Each service runs in its own isolated container, with its own dependencies, its own Python version, and its own environment. Update whatever you want — the other services remain unaffected.
In this guide, we'll containerize concrete AI services: a FastAPI API, an LLM pipeline, and the entire ecosystem around them. Real, clean, and reproducible self-hosting.
🐳 Why Docker for AI
The 4 Pillars
| Pillar | Without Docker | With Docker |
|---|---|---|
| Isolation | Python 3.11 breaks your Python 3.9 app | Each service has its own version |
| Reproducibility | "It works on my machine" | Same image = same result everywhere |
| Scaling | Restart the entire server | Scale just the service that needs it |
| Deployment | 47 commands to type in order | docker compose up -d |
Why It's Particularly Important for AI
AI projects have heavy and conflicting dependencies:
# Project A wants:
torch==2.1.0
numpy==1.24.0
transformers==4.35.0
# Project B wants:
torch==1.13.0
numpy==1.21.0
tensorflow==2.14.0
Without Docker, it's a nightmare of virtual environments. With Docker:
# Project A
docker run -d --name project-a project-a:latest
# Project B (completely different versions, no conflicts)
docker run -d --name project-b project-b:latest
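To verify that each container really does see its own versions, here is a small check you could run inside either container (for example via `docker exec -it project-a python check.py` — the script name and project names are illustrative):

```python
import importlib.metadata

def installed_versions(packages):
    """Report which package versions this environment (container) actually sees."""
    report = {}
    for pkg in packages:
        try:
            report[pkg] = importlib.metadata.version(pkg)
        except importlib.metadata.PackageNotFoundError:
            report[pkg] = None  # not installed in this container
    return report

# Inside project-a's container this would report torch 2.1.0,
# inside project-b's container torch 1.13.0 — same host, no conflict.
print(installed_versions(["torch", "numpy"]))
```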
Docker vs VM
| Aspect | Docker | VM (VirtualBox, etc.) |
|---|---|---|
| Startup | Seconds | Minutes |
| Size | MBs (image) | GBs (full OS) |
| Performance | Near-native | Virtualization overhead |
| Isolation | Processes | Full OS |
| AI Use Case | ✅ Ideal | Overkill |
📦 Docker Basics in 5 Minutes
Essential Vocabulary
Image = The build plan (like a class)
Container = The running instance (like an object)
Dockerfile = The recipe to create an image
Volume = Persistent storage (survives container restart)
Network = Virtual network between containers
Compose = Multi-container orchestration
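The image/container analogy above maps directly onto Python's class/instance distinction, which may help if you come from a programming background (this is purely illustrative code, not Docker's API):

```python
class Image:
    """Static definition — like a Docker image built from a Dockerfile."""
    def __init__(self, name, tag):
        self.name = name
        self.tag = tag

    def run(self):
        # Each call produces an independent container (instance),
        # just as `docker run` can start many containers from one image.
        return Container(self)

class Container:
    """Running instance — has its own state, shares the image definition."""
    def __init__(self, image):
        self.image = image
        self.running = True

nginx_image = Image("nginx", "alpine")
c1 = nginx_image.run()
c2 = nginx_image.run()
print(c1 is c2)  # → False: two independent containers from the same image
```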
Survival Commands
# See running containers
docker ps
# See ALL containers (including stopped ones)
docker ps -a
# See downloaded images
docker images
# Logs of a container
docker logs my-container
docker logs -f my-container # Follow (real-time)
# Enter a container
docker exec -it my-container bash
# Stop / remove
docker stop my-container
docker rm my-container
# Clean up (unused images/containers)
docker system prune -a
Installing Docker
# Ubuntu/Debian
curl -fsSL https://get.docker.com | sh
# Add your user to the docker group (avoids sudo)
sudo usermod -aG docker $USER
# Log out and back in (or run `newgrp docker`) for this to take effect
# Install the Compose plugin (bundled with Docker Desktop; on servers:)
sudo apt install docker-compose-plugin
# Verify
docker --version
docker compose version
🏗️ Docker Compose: Orchestrating Multiple Services
Why Compose?
A typical AI project needs:
- An API (FastAPI, Flask)
- A database (SQLite, PostgreSQL)
- A reverse proxy (Nginx, Caddy)
- Maybe a cache (Redis)
- Maybe a worker (Celery, async tasks)
Managing all this with individual docker run commands is unmanageable. Docker Compose defines everything in a single file:
# docker-compose.yml
# (the top-level `version:` key is obsolete in Compose v2 and can be omitted)

services:
  api:
    build: ./api
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/data
    environment:
      - DATABASE_URL=sqlite:///data/app.db
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
    depends_on:
      - db
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Optional: the API above uses SQLite; PostgreSQL is included
  # for when the project outgrows it
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=app
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 10s
      timeout: 5s
      retries: 5

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/certs:/etc/nginx/certs:ro
    depends_on:
      - api
    restart: unless-stopped

volumes:
  pgdata:
Essential Compose Commands
# Start all services
docker compose up -d
# See logs of all services
docker compose logs -f
# Logs of a specific service
docker compose logs -f api
# Restart a service
docker compose restart api
# Stop everything
docker compose down
# Stop everything AND remove volumes (WARNING: data loss)
docker compose down -v
# Rebuild after modifying the Dockerfile
docker compose up -d --build
🤖 Example 1: FastAPI + SQLite Containerized API
Project Structure
my-ai-api/
├── docker-compose.yml
├── .env
├── api/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── main.py
│   └── models.py
├── nginx/
│   └── nginx.conf
└── data/
    └── (SQLite DB will be created here)
The API Dockerfile
# api/Dockerfile
FROM python:3.11-slim

# Avoid interactive questions
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the code
COPY . .

# Exposed port
EXPOSE 8000

# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run the API
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Dependencies
# api/requirements.txt
fastapi==0.109.0
uvicorn[standard]==0.27.0
sqlalchemy==2.0.25
httpx==0.26.0
pydantic==2.5.3
The FastAPI API
# api/main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from sqlalchemy import create_engine, Column, Integer, String, DateTime, Text
from sqlalchemy.orm import declarative_base, sessionmaker
from datetime import datetime
import httpx
import os

app = FastAPI(title="My AI API", version="1.0.0")

# Database
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///data/app.db")
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(bind=engine)
Base = declarative_base()

class Query(Base):
    __tablename__ = "queries"
    id = Column(Integer, primary_key=True)
    question = Column(Text, nullable=False)
    answer = Column(Text)
    model = Column(String(100))
    created_at = Column(DateTime, default=datetime.utcnow)

Base.metadata.create_all(engine)

# Schemas
class QuestionRequest(BaseModel):
    question: str
    model: str = "anthropic/claude-sonnet-4"

class QuestionResponse(BaseModel):
    answer: str
    model: str
    query_id: int

@app.get("/health")
async def health():
    return {"status": "healthy", "timestamp": datetime.utcnow().isoformat()}

@app.post("/ask", response_model=QuestionResponse)
async def ask_question(req: QuestionRequest):
    api_key = os.getenv("OPENROUTER_API_KEY")
    if not api_key:
        raise HTTPException(status_code=500, detail="API key not configured")

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://openrouter.ai/api/v1/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": req.model,
                "messages": [{"role": "user", "content": req.question}],
                "max_tokens": 1000
            },
            timeout=30.0
        )

    if response.status_code != 200:
        raise HTTPException(status_code=502, detail="LLM API error")

    answer = response.json()["choices"][0]["message"]["content"]

    # Save to database
    db = SessionLocal()
    query = Query(question=req.question, answer=answer, model=req.model)
    db.add(query)
    db.commit()
    query_id = query.id
    db.close()

    return QuestionResponse(answer=answer, model=req.model, query_id=query_id)

@app.get("/queries")
async def list_queries(limit: int = 10):
    db = SessionLocal()
    queries = db.query(Query).order_by(Query.created_at.desc()).limit(limit).all()
    db.close()
    return [
        {"id": q.id, "question": q.question[:100], "model": q.model, "date": q.created_at.isoformat()}
        for q in queries
    ]
Nginx Configuration
# nginx/nginx.conf
events {
    worker_connections 1024;
}

http {
    upstream api {
        server api:8000;
    }

    server {
        listen 80;
        server_name _;

        # Request size limit
        client_max_body_size 10M;

        location / {
            proxy_pass http://api;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts for AI requests (can be slow)
            proxy_read_timeout 60s;
            proxy_connect_timeout 10s;
        }

        location /health {
            proxy_pass http://api/health;
            access_log off;
        }
    }
}
The .env File
# .env (DO NOT commit this file!)
OPENROUTER_API_KEY=sk-or-v1-your-key-here
DB_PASSWORD=a-strong-password-here
Running Everything
# Create the data directory
mkdir -p data
# Run
docker compose up -d --build
# Verify
docker compose ps
# Test
curl http://localhost/health
curl -X POST http://localhost/ask \
-H "Content-Type: application/json" \
-d '{"question": "Explain Docker in one sentence"}'
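To exercise the API from Python instead of curl, here is a short client sketch using only the standard library (it assumes the stack above is running and reachable on `http://localhost`):

```python
import json
import urllib.request

def build_ask_payload(question, model="anthropic/claude-sonnet-4"):
    """Build the JSON body expected by the /ask endpoint."""
    return {"question": question, "model": model}

def ask(base_url, question, timeout=60.0):
    # AI responses can be slow; match the 60s proxy_read_timeout set in Nginx
    req = urllib.request.Request(
        base_url + "/ask",
        data=json.dumps(build_ask_payload(question)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":  # only works with the stack up and running
    print(ask("http://localhost", "Explain Docker in one sentence")["answer"])
```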
🧠 Example 2: Containerized LLM Pipeline
Architecture
A typical LLM pipeline:
User Request
      ↓
[Container: API Gateway]
      ↓
[Container: Preprocessing]
  - Text cleaning
  - Intent extraction
      ↓
[Container: LLM Service]
  - Model call
  - Cache management
      ↓
[Container: Postprocessing]
  - Response formatting
  - Logging
      ↓
   Response
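Before splitting these stages into separate containers, it can help to prototype them as plain functions. A minimal sketch — the function names and cleaning rules here are illustrative, not part of any framework:

```python
def preprocess(raw: str) -> dict:
    """Text cleaning plus a very naive intent guess."""
    text = " ".join(raw.split())  # collapse whitespace
    intent = "question" if text.endswith("?") else "statement"
    return {"text": text, "intent": intent}

def call_llm(payload: dict) -> str:
    """Stand-in for the LLM service call (an HTTP hop in the real pipeline)."""
    return f"[{payload['intent']}] echo: {payload['text']}"

def postprocess(answer: str) -> dict:
    """Response formatting plus metadata for logging."""
    return {"answer": answer.strip(), "length": len(answer.strip())}

def pipeline(raw: str) -> dict:
    # In the containerized version, each arrow in the diagram above
    # becomes an HTTP call between services.
    return postprocess(call_llm(preprocess(raw)))

print(pipeline("  What is   Docker? ")["answer"])
# → [question] echo: What is Docker?
```

Once each function becomes its own container, the interfaces stay the same — only the transport changes from function calls to HTTP.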
Docker Compose Multi-Services
# docker-compose.yml
# (the top-level `version:` key is obsolete in Compose v2 and can be omitted)

services:
  gateway:
    build: ./gateway
    ports:
      - "8080:8080"
    environment:
      - LLM_SERVICE_URL=http://llm:8001
      - REDIS_URL=redis://cache:6379
    depends_on:
      cache:
        condition: service_healthy
      llm:
        condition: service_healthy
    restart: unless-stopped

  llm:
    build: ./llm-service
    environment:
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - REDIS_URL=redis://cache:6379
      - DEFAULT_MODEL=anthropic/claude-sonnet-4
    depends_on:
      cache:
        condition: service_healthy
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8001/health"]
      interval: 15s
      timeout: 5s
      retries: 3

  cache:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

  worker:
    build: ./worker
    environment:
      - REDIS_URL=redis://cache:6379
      - DATABASE_URL=sqlite:///data/logs.db
    volumes:
      - ./data:/app/data
    depends_on:
      cache:
        condition: service_healthy
    restart: unless-stopped

volumes:
  redis_data:
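The "cache management" inside the llm service typically means keying responses by a hash of (model, messages), so identical prompts are served from Redis instead of hitting the paid API again. A sketch of that logic — the function names are illustrative, and a plain dict stands in for Redis so the example runs standalone:

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Deterministic cache key: hash of model + messages."""
    blob = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return "llm:" + hashlib.sha256(blob.encode()).hexdigest()

def cached_completion(cache, model, messages, call):
    """cache: anything with .get/.set — redis.Redis in production,
    a dict stand-in here so the sketch runs without a server."""
    key = cache_key(model, messages)
    hit = cache.get(key)
    if hit is not None:
        return hit
    answer = call(model, messages)
    cache.set(key, answer)  # with redis-py: cache.setex(key, 3600, answer)
    return answer

# Standalone demo with a dict standing in for Redis
class DictCache(dict):
    def set(self, k, v):
        self[k] = v

calls = []
def fake_call(model, messages):
    calls.append(1)
    return "answer"

cache = DictCache()
msgs = [{"role": "user", "content": "hi"}]
cached_completion(cache, "claude", msgs, fake_call)
cached_completion(cache, "claude", msgs, fake_call)
print(len(calls))  # → 1: the second call was served from cache
```

In the real service, `cache` would be `redis.Redis.from_url(os.environ["REDIS_URL"])`, and the `allkeys-lru` policy configured in the compose file above evicts the oldest entries once the 256 MB cap is reached.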