
Running with Docker

This project ships with a GPU-ready Docker image so you can run everything in a consistent environment without installing all dependencies on your host.

This page shows how to:

  1. understand what Docker is,
  2. install Docker,
  3. create the local folders that will be mounted into the container,
  4. run the container (Windows/PowerShell and Linux/macOS),
  5. switch between the two modes the image supports (app and shell).

1. What is Docker?

Docker lets you run apps in containers: lightweight, isolated environments that bundle the OS libraries and dependencies your app needs. You get the same setup everywhere (your laptop, CI, a server), so “it works on my machine” stops being a problem.

In this repo, the image is built on top of an NVIDIA CUDA runtime image and already contains:

  • Python 3.11
  • your package (installed with pip install -e .)
  • Ollama (started automatically in the container)
  • an entrypoint script that can start the Streamlit app or drop you into a shell
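
The image is published as lmmasters/llm_extractinator:latest (the tag used in the run commands later on). docker run pulls it automatically on first use, but you can fetch it ahead of time:

docker pull lmmasters/llm_extractinator:latest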

2. Install Docker

Desktop users (Windows / macOS):

  • Install Docker Desktop from the official Docker site.
  • On Windows, make sure you can run docker from PowerShell.
  • If you want GPU support on Windows, you also need a recent NVIDIA driver and the Docker + WSL2 stack that supports GPU.

Linux users (Ubuntu, etc.):

  • Install the Docker Engine from your distro or from Docker’s official docs.
  • For GPU support, install the NVIDIA Container Toolkit so --gpus all works.

If --gpus all fails, check your driver/toolkit install.
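
A quick way to confirm GPU passthrough before running the app image is to run NVIDIA's CUDA base image and call nvidia-smi (the exact CUDA tag below is just an example; pick one that matches your driver):

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

If this prints your GPU table, the driver and Container Toolkit are set up correctly.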


3. Create local folders

The docker run commands below mount several host folders into the container (most under /app/..., plus /root/.ollama for Ollama models). Create them once:

mkdir -p data examples tasks output ollama_models
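
On Windows/PowerShell, mkdir -p won't take a list of folders the same way; an equivalent one-liner is:

New-Item -ItemType Directory -Force -Path data, examples, tasks, output, ollama_models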

These will map to:

  • ./data → /app/data
  • ./examples → /app/examples
  • ./tasks → /app/tasks
  • ./output → /app/output
  • ./ollama_models → /root/.ollama (optional, but recommended)

Anything the app writes there will persist on your machine.

By default, Ollama models are stored inside the container at /root/.ollama. This means every time you start a new container, you'll need to pull the models again.

To avoid this, mount a local directory to persist the Ollama models between container runs. The ollama_models folder from step 3 already serves this purpose; if you skipped creating it, do so now:

mkdir -p ollama_models

Then add -v ${PWD}/ollama_models:/root/.ollama (Windows/PowerShell) or -v $(pwd)/ollama_models:/root/.ollama (Linux/macOS) to your docker run command (see examples in section 4).
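
With that mount in place, any model you pull lands in ./ollama_models and survives container restarts. For example, from a second terminal while the container is running (the container ID comes from docker ps, and llama3 is just an example model name):

docker exec -it <container-id> ollama pull llama3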


4. Run the container

4.1 Windows / PowerShell example

# Remove `--gpus all` if you don't have a GPU
docker run --rm --gpus all `
  -p 127.0.0.1:8501:8501 `
  -p 11434:11434 `
  -v ${PWD}/data:/app/data `
  -v ${PWD}/examples:/app/examples `
  -v ${PWD}/tasks:/app/tasks `
  -v ${PWD}/output:/app/output `
  -v ${PWD}/ollama_models:/root/.ollama `
  lmmasters/llm_extractinator:latest

4.2 Linux / macOS variant

# Remove `--gpus all` if you don't have a GPU
docker run --rm --gpus all \
  -p 127.0.0.1:8501:8501 \
  -p 11434:11434 \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/examples:/app/examples \
  -v $(pwd)/tasks:/app/tasks \
  -v $(pwd)/output:/app/output \
  -v $(pwd)/ollama_models:/root/.ollama \
  lmmasters/llm_extractinator:latest

Open: http://127.0.0.1:8501
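
If the page doesn't load, check from a second terminal that the container is actually up and look at its startup output (the <container-id> placeholder is whatever docker ps reports):

docker ps                      # the container should be listed as running
docker logs <container-id>     # startup output from Ollama and Streamlit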


5. The two modes (from the Dockerfile)

The image's Dockerfile defines an entrypoint script:

  • it always starts Ollama in the background: ollama serve ...
  • it then looks at the first argument to decide the mode:

/entrypoint.sh app   # default
/entrypoint.sh shell # drop into a shell
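
For illustration only, an entrypoint with this behaviour could look roughly like the sketch below; the actual script in the image may differ, and the Streamlit command and app path here are assumptions:

#!/usr/bin/env bash
set -e

# Always start Ollama in the background.
ollama serve &

# Dispatch on the first argument (CMD defaults it to "app").
case "${1:-app}" in
  app)
    # Streamlit invocation and app path are hypothetical, not the exact script in the image.
    exec streamlit run app.py --server.address 0.0.0.0 --server.port 8501
    ;;
  shell)
    exec /bin/bash
    ;;
  *)
    exec "$@"
    ;;
esac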

So, by default, when you run:

docker run ... lmmasters/llm_extractinator:latest

it uses CMD ["app"] → starts the Streamlit “extractinator”.

If you want to drop into the container and poke around (with the package already installed and Ollama running), just pass shell at the end:

Windows / PowerShell:

docker run --rm --gpus all `
  -p 127.0.0.1:8501:8501 `
  -p 11434:11434 `
  -v ${PWD}/data:/app/data `
  -v ${PWD}/examples:/app/examples `
  -v ${PWD}/tasks:/app/tasks `
  -v ${PWD}/output:/app/output `
  -v ${PWD}/ollama_models:/root/.ollama `
  lmmasters/llm_extractinator:latest shell

Linux / macOS:

docker run --rm --gpus all \
  -p 127.0.0.1:8501:8501 \
  -p 11434:11434 \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/examples:/app/examples \
  -v $(pwd)/tasks:/app/tasks \
  -v $(pwd)/output:/app/output \
  -v $(pwd)/ollama_models:/root/.ollama \
  lmmasters/llm_extractinator:latest shell

That will not start the Streamlit UI; instead you’ll get a bash shell inside the container with llm_extractinator installed.
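
Inside that shell you can, for example, check that Ollama is up and that the package imports (the Python module name llm_extractinator is assumed from the package name):

ollama list                                         # models currently pulled into Ollama
python -c "import llm_extractinator; print('ok')"   # module name assumed from the package name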


6. Notes

  • The image exposes two ports: 8501 (Streamlit) and 11434 (Ollama).
  • If you don’t have a GPU, you can try omitting --gpus all, but the image is CUDA-based, so GPU is the intended path.
  • If your Docker Desktop uses different volume mappings (e.g., Windows drive letters), adjust the -v paths accordingly (see the example below).
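
For example, with the project checked out on another drive, a mount in PowerShell might look like this (the D:\ path is purely illustrative):

-v D:\projects\llm_extractinator\data:/app/data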