I Built a Local AI App with Docker Model Runner - Here’s How
Run powerful AI models locally with almost zero setup - and wire them into your apps using Docker tools you already know.
What Is Docker Model Runner?
Docker Model Runner is a standardized way to run AI models with Docker. Instead of juggling Python versions, CUDA, and random system packages, you pull and run a model much like you’d pull and run an image. You can use it from the CLI, in Docker Desktop, and inside your Compose files.
Highlights:
- One command to run models (pulls on-demand)
- OpenAI-compatible endpoints exposed locally
- Works with Docker Compose alongside your app/database/cache
- Local-first: low latency, easy iteration, real dev/prod parity
Prerequisites
- Docker Desktop (Mac/Windows) or Docker CE (Linux)
- (Optional) GPU drivers/toolkit if you plan to use GPU acceleration
If you’re on Docker Desktop and don’t see the model commands, update to the latest version and enable Docker Model in Settings.
Getting Started (CLI)
Pull models just like images - but use the new model keyword.
# Pull OpenAI’s open-weight model hosted by Docker
docker model pull ai/gpt-oss
# Pull something smaller
docker model pull ai/llama3.2
# See what’s local
docker model list
You can also run without pre-pulling:
docker model run ai/gpt-oss
This will pull the model if needed and start an interactive CLI where you can chat immediately. First response may take a moment while the model warms up.
Not a CLI person? The Models section in Docker Desktop lets you pull, run, and chat with models via the UI in just a couple clicks.
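Prefer code over a chat window? Everything the CLI can reach is also available over the runner’s OpenAI-compatible API. Here’s a minimal TypeScript sketch (Node 18+, so fetch is built in), assuming the host-side TCP endpoint on localhost:12434 that’s covered later in this post:

// Minimal chat completion against Docker Model Runner’s
// OpenAI-compatible API. Assumes host-side TCP access on
// localhost:12434 (see the endpoint notes below).
const res = await fetch("http://localhost:12434/engines/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "ai/gpt-oss",
    messages: [{ role: "user", content: "Say hello in five words." }],
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);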
Using Docker Model Runner with Compose
Anywhere you can run Docker Compose, you can run models too. Here’s a real example pairing Open WebUI with Docker Model Runner:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OPENAI_API_BASE_URL=http://model-runner.docker.internal/engines/v1
      - OPENAI_API_KEY=''
      - WEBUI_NAME=Open-WebUI with Docker Model Runner
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - docker-model-runner

  docker-model-runner:
    provider:
      type: model
      options:
        model: ai/gpt-oss

volumes:
  open-webui:
Bring it up:
docker compose up
Open WebUI will be on http://localhost:3000, and it will talk to the model via the OpenAI-compatible URL set in OPENAI_API_BASE_URL. Swapping models later is just a one-line change to options.model.
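After a swap, it’s worth confirming what the runner is actually serving. A quick TypeScript sketch, assuming the runner mirrors OpenAI’s model-listing endpoint under /engines/v1:

// List the models the runner currently exposes. The Desktop
// hostname assumes this code runs inside a container; from the
// host, use http://localhost:12434 instead.
const res = await fetch("http://model-runner.docker.internal/engines/v1/models");
const { data } = await res.json();
for (const model of data) {
  console.log(model.id); // e.g. ai/gpt-oss
}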
Develop Locally with techno-boto-chat
I built a small Next.js chat app that speaks to any OpenAI-compatible backend - including Docker Model Runner - so you can prototype quickly with local models.
Check out techno-boto-chat on GitHub: https://github.com/timothystewart6/techno-boto-chat
# Clone and set up
git clone https://github.com/timothystewart6/techno-boto-chat.git
cd techno-boto-chat
yarn install # or npm / pnpm
# Configure environment
cp .env.example .env.local
Set your API base + model in .env.local:
LLM_API_BASE_URL=http://localhost:12434/engines/v1
MODEL_NAME=ai/gpt-oss
OPENAI_API_KEY= # optional for local models
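Any OpenAI-compatible client can consume these values. As a rough sketch of the wiring (the app’s actual code may differ), using the official openai npm package:

import OpenAI from "openai";

// Point the standard OpenAI client at Docker Model Runner.
// Local models ignore the API key, but the client library
// rejects an empty one, hence the placeholder fallback.
const client = new OpenAI({
  baseURL: process.env.LLM_API_BASE_URL, // http://localhost:12434/engines/v1
  apiKey: process.env.OPENAI_API_KEY || "not-needed",
});

const completion = await client.chat.completions.create({
  model: process.env.MODEL_NAME ?? "ai/gpt-oss",
  messages: [{ role: "user", content: "Hello from a local model!" }],
});

console.log(completion.choices[0].message.content);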
Run app + model:
# Terminal A - start a local model (or start it from Docker Desktop)
docker model run ai/gpt-oss
# Terminal B - run the web app
yarn dev
# then open http://localhost:3000
Type a prompt and you’ll see responses come back from your local model via Docker Model Runner. Want to try a different model? Change MODEL_NAME and restart.
Prefer everything in containers? Add the app as a Compose service and point it to model-runner.docker.internal, just like the Open WebUI example.
Running with Docker Compose
The application is containerized for easy deployment:
# Build and deploy with Docker Compose
docker compose up --build -d
# Access your chat interface at http://localhost:3000
# Don't forget to switch your LLM_API_BASE_URL variable to http://model-runner.docker.internal/engines/v1 if you aren't exposing it on localhost:12434
Docker Desktop vs Docker CE Endpoints
- Docker Desktop: http://model-runner.docker.internal/engines/v1
- Docker CE (Linux): http://localhost:12434/engines/v1
Inside other containers, you may need to call the host at its bridge IP (e.g., 172.17.0.1:12434) if you’re not using the Desktop hostname.
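If the same code has to run both on the host and inside containers, a small helper can pick the right base URL. A sketch, where IN_CONTAINER is a hypothetical flag you’d set yourself (e.g., in your Compose file’s environment), not something Docker provides:

// Resolve the Model Runner base URL for the current environment.
// IN_CONTAINER is a hypothetical flag you set yourself; Docker
// does not set it automatically.
function modelRunnerBaseUrl(): string {
  if (process.env.IN_CONTAINER === "1") {
    // Docker Desktop resolves this hostname from inside containers.
    return "http://model-runner.docker.internal/engines/v1";
  }
  // Host-side TCP endpoint (Docker CE, or Docker Desktop with
  // "Enable host-side TCP support" turned on).
  return "http://localhost:12434/engines/v1";
}

console.log(modelRunnerBaseUrl());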
Troubleshooting
- Model commands missing → update Docker Desktop and enable Docker Model in Settings
- Slow first response → initial warm-up/load is expected; subsequent prompts are faster
- App can’t reach model → verify OPENAI_API_BASE_URL (Desktop vs CE), port exposure, and network
- Compose service ordering → ensure your app depends_on the model service
- Can’t connect to localhost:12434 → ensure Enable host-side TCP support is enabled in Docker Desktop and that you aren’t running inside a container (containers use http://model-runner.docker.internal/engines/v1)
Links
🛍️ Check out the new Merch Shop at https://l.technotim.live/shop
⚙️ See all the hardware I recommend at https://l.technotim.live/gear
🚀 Don’t forget to check out the 🚀 Launchpad repo with all of the quick start source files
🤝 Support me and help keep this site ad-free!