
Self-Hosting 🟡 Intermediate ⏱️ 15 min read 📅 2026-02-24

Docker + AI: Containerizing Your Intelligent Services

You have a server running an API, a database, a reverse proxy, and maybe a language model. Everything is installed directly on the OS. One day, you update Python and everything breaks. Your API won't start, your dependencies are in conflict, and you spend 4 hours fixing everything.

With Docker, this scenario is a thing of the past. Each service runs in its isolated container, with its own dependencies, its own Python version, and its own environment. Update whatever you want — the other services remain unaffected.

In this guide, we'll containerize concrete AI services: a FastAPI API, an LLM pipeline, and the entire ecosystem around them. Real, clean, and reproducible self-hosting.

🐳 Why Docker for AI

The 4 Pillars

| Pillar | Without Docker | With Docker |
|---|---|---|
| Isolation | Python 3.11 breaks your Python 3.9 app | Each service has its own version |
| Reproducibility | "It works on my machine" | Same image = same result everywhere |
| Scaling | Restart the entire server | Scale just the service that needs it |
| Deployment | 47 commands to type in order | docker compose up -d |

Why It's Particularly Important for AI

AI projects have heavy and conflicting dependencies:

# Project A wants:
torch==2.1.0
numpy==1.24.0
transformers==4.35.0

# Project B wants:
torch==1.13.0
numpy==1.21.0
tensorflow==2.14.0

Without Docker, it's a nightmare of virtual environments. With Docker:

# Project A
docker run -d --name project-a project-a:latest

# Project B (completely different versions, no conflicts)
docker run -d --name project-b project-b:latest

Docker vs VM

| Aspect | Docker | VM (VirtualBox, etc.) |
|---|---|---|
| Startup | Seconds | Minutes |
| Size | MBs (image) | GBs (full OS) |
| Performance | Near-native | Virtualization overhead |
| Isolation | Process-level | Full OS |
| AI Use Case | ✅ Ideal | Overkill |

📦 Docker Basics in 5 Minutes

Essential Vocabulary

Image      = The build plan (like a class)
Container  = The running instance (like an object)
Dockerfile = The recipe to create an image
Volume     = Persistent storage (survives container restart)
Network    = Virtual network between containers
Compose    = Multi-container orchestration

Survival Commands

# See running containers
docker ps

# See ALL containers (including stopped ones)
docker ps -a

# See downloaded images
docker images

# Logs of a container
docker logs my-container
docker logs -f my-container  # Follow (real-time)

# Enter a container
docker exec -it my-container bash

# Stop / remove
docker stop my-container
docker rm my-container

# Clean up (unused images/containers)
docker system prune -a

Installing Docker

# Ubuntu/Debian
curl -fsSL https://get.docker.com | sh

# Add your user to the docker group (avoids sudo)
sudo usermod -aG docker $USER

# Install the Docker Compose plugin (bundled with Docker Desktop; otherwise:)
sudo apt install docker-compose-plugin

# Verify
docker --version
docker compose version

🏗️ Docker Compose: Orchestrating Multiple Services

Why Compose?

A typical AI project needs:
- An API (FastAPI, Flask)
- A database (SQLite, PostgreSQL)
- A reverse proxy (Nginx, Caddy)
- Maybe a cache (Redis)
- Maybe a worker (Celery, async tasks)

Managing all this with individual docker run commands is unmanageable. Docker Compose defines everything in a single file:

# docker-compose.yml

services:
  api:
    build: ./api
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/data
    environment:
      - DATABASE_URL=postgresql://app:${DB_PASSWORD}@db:5432/myapp
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
    depends_on:
      - db
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=app
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 10s
      timeout: 5s
      retries: 5

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/certs:/etc/nginx/certs:ro
    depends_on:
      - api
    restart: unless-stopped

volumes:
  pgdata:

Essential Compose Commands

# Start all services
docker compose up -d

# See logs of all services
docker compose logs -f

# Logs of a specific service
docker compose logs -f api

# Restart a service
docker compose restart api

# Stop everything
docker compose down

# Stop everything AND remove volumes (WARNING: data loss)
docker compose down -v

# Rebuild after modifying the Dockerfile
docker compose up -d --build

🤖 Example 1: FastAPI + SQLite Containerized API

Project Structure

my-ai-api/
├── docker-compose.yml
├── .env
├── api/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── main.py
│   └── models.py
├── nginx/
│   └── nginx.conf
└── data/
    └── (SQLite DB will be created here)

The API Dockerfile

# api/Dockerfile
FROM python:3.11-slim

# Avoid interactive questions
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the code
COPY . .

# Exposed port
EXPOSE 8000

# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run the API
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
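Since the Dockerfile ends with COPY . ., everything in api/ is copied into the image. A .dockerignore keeps caches, secrets, and local data out of the build context — a minimal sketch, to be adjusted to your project:

```
# api/.dockerignore
__pycache__/
*.pyc
.env
.git/
data/
```

This also keeps builds fast: entries listed here are never sent to the Docker daemon in the first place.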

Dependencies

# api/requirements.txt
fastapi==0.109.0
uvicorn[standard]==0.27.0
sqlalchemy==2.0.25
httpx==0.26.0
pydantic==2.5.3

The FastAPI API

# api/main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from sqlalchemy import create_engine, Column, Integer, String, DateTime, Text
from sqlalchemy.orm import declarative_base, sessionmaker
from datetime import datetime
import httpx
import os

app = FastAPI(title="My AI API", version="1.0.0")

# Database
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///data/app.db")
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(bind=engine)
Base = declarative_base()

class Query(Base):
    __tablename__ = "queries"
    id = Column(Integer, primary_key=True)
    question = Column(Text, nullable=False)
    answer = Column(Text)
    model = Column(String(100))
    created_at = Column(DateTime, default=datetime.utcnow)

Base.metadata.create_all(engine)

# Schemas
class QuestionRequest(BaseModel):
    question: str
    model: str = "anthropic/claude-sonnet-4"

class QuestionResponse(BaseModel):
    answer: str
    model: str
    query_id: int

@app.get("/health")
async def health():
    return {"status": "healthy", "timestamp": datetime.utcnow().isoformat()}

@app.post("/ask", response_model=QuestionResponse)
async def ask_question(req: QuestionRequest):
    api_key = os.getenv("OPENROUTER_API_KEY")
    if not api_key:
        raise HTTPException(status_code=500, detail="API key not configured")

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://openrouter.ai/api/v1/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": req.model,
                "messages": [{"role": "user", "content": req.question}],
                "max_tokens": 1000
            },
            timeout=30.0
        )

    if response.status_code != 200:
        raise HTTPException(status_code=502, detail="LLM API error")

    answer = response.json()["choices"][0]["message"]["content"]

    # Save to database
    db = SessionLocal()
    try:
        query = Query(question=req.question, answer=answer, model=req.model)
        db.add(query)
        db.commit()
        query_id = query.id
    finally:
        db.close()

    return QuestionResponse(answer=answer, model=req.model, query_id=query_id)

@app.get("/queries")
async def list_queries(limit: int = 10):
    db = SessionLocal()
    try:
        queries = db.query(Query).order_by(Query.created_at.desc()).limit(limit).all()
    finally:
        db.close()
    return [{"id": q.id, "question": q.question[:100], "model": q.model, "date": q.created_at.isoformat()} for q in queries]

Nginx Configuration

# nginx/nginx.conf
events {
    worker_connections 1024;
}

http {
    upstream api {
        server api:8000;
    }

    server {
        listen 80;
        server_name _;

        # Request size limit
        client_max_body_size 10M;

        location / {
            proxy_pass http://api;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts for AI requests (can be slow)
            proxy_read_timeout 60s;
            proxy_connect_timeout 10s;
        }

        location /health {
            proxy_pass http://api/health;
            access_log off;
        }
    }
}

The .env File

# .env (DO NOT commit this file!)
OPENROUTER_API_KEY=sk-or-v1-your-key-here
DB_PASSWORD=a-strong-password-here

Running Everything

# Create the data directory
mkdir -p data

# Run
docker compose up -d --build

# Verify
docker compose ps

# Test
curl http://localhost/health
curl -X POST http://localhost/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "Explain Docker in one sentence"}'

🧠 Example 2: Containerized LLM Pipeline

Architecture

A typical LLM pipeline:

User Request
    ↓
[Container: API Gateway]
    ↓
[Container: Preprocessing]
  - Text cleaning
  - Intent extraction
    ↓
[Container: LLM Service]
  - Model call
  - Cache management
    ↓
[Container: Postprocessing]
  - Response formatting
  - Logging
    ↓
Response
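The stages above can be sketched as plain functions chained in-process. This is a toy sketch, not the real service code (which isn't shown in this article): each stage is a stub, and in the containerized version each function would live in its own service behind HTTP.

```python
def preprocess(raw: str) -> dict:
    """Preprocessing stage: clean the text and extract a naive intent."""
    text = " ".join(raw.split())  # collapse runs of whitespace
    intent = "question" if text.endswith("?") else "statement"
    return {"text": text, "intent": intent}

def call_llm(payload: dict) -> dict:
    """LLM stage stub: the real service would call OpenRouter here."""
    payload["answer"] = f"[{payload['intent']}] echo: {payload['text']}"
    return payload

def postprocess(payload: dict) -> str:
    """Postprocessing stage: format the final response."""
    return payload["answer"].strip()

def pipeline(raw: str) -> str:
    # The gateway chains the stages; here it is a simple composition.
    return postprocess(call_llm(preprocess(raw)))

print(pipeline("  What is   Docker? "))  # → [question] echo: What is Docker?
```

Splitting the stages into containers keeps this exact data flow, but each arrow becomes a network call, so each service can be scaled or updated independently.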

Docker Compose Multi-Services

# docker-compose.yml

services:
  gateway:
    build: ./gateway
    ports:
      - "8080:8080"
    environment:
      - LLM_SERVICE_URL=http://llm:8001
      - REDIS_URL=redis://cache:6379
    depends_on:
      cache:
        condition: service_healthy
      llm:
        condition: service_healthy
    restart: unless-stopped

  llm:
    build: ./llm-service
    environment:
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - REDIS_URL=redis://cache:6379
      - DEFAULT_MODEL=anthropic/claude-sonnet-4
    depends_on:
      cache:
        condition: service_healthy
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8001/health"]
      interval: 15s
      timeout: 5s
      retries: 3

  cache:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

  worker:
    build: ./worker
    environment:
      - REDIS_URL=redis://cache:6379
      - DATABASE_URL=sqlite:///data/logs.db
    volumes:
      - ./data:/app/data
    depends_on:
      cache:
        condition: service_healthy
    restart: unless-stopped

volumes:
  redis_data:
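The compose file wires the llm service to Redis for "cache management", but the service code itself isn't shown. Here is a minimal sketch of prompt-level caching, with a plain dict standing in for Redis (the real service would use a Redis client and the same key scheme):

```python
import hashlib
import json

cache: dict[str, str] = {}  # stand-in for Redis

def cache_key(model: str, prompt: str) -> str:
    """Stable key derived from model + prompt (what you'd SET in Redis)."""
    raw = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

def ask(model: str, prompt: str) -> tuple[str, bool]:
    """Return (answer, from_cache); only miss on the first identical request."""
    key = cache_key(model, prompt)
    if key in cache:
        return cache[key], True
    answer = f"answer to: {prompt}"  # stand-in for the real LLM API call
    cache[key] = answer
    return answer, False

first = ask("claude", "What is Docker?")
second = ask("claude", "What is Docker?")
print(first[1], second[1])  # → False True
```

With `allkeys-lru` and a `maxmemory` cap, as configured on the cache service above, Redis evicts the least recently used keys on its own, so the cache never needs manual cleanup.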