Move Ollama Models to different location

Ollama LLM model files take a lot of space


After installing Ollama, it is better to reconfigure it to store models in the new location right away, so that the next model you pull does not get downloaded to the old location.


About Ollama

Ollama is a command-line frontend for running LLM AI models locally, and it also provides an API that can host them. For a broader comparison of Ollama with vLLM, Docker Model Runner, LocalAI and cloud providers—including cost and infrastructure trade-offs—see LLM Hosting: Local, Self-Hosted & Cloud Infrastructure Compared.

Install Ollama

Go to https://ollama.com/download

To install Ollama on Linux:

curl -fsSL https://ollama.com/install.sh | sh

Ollama for Windows is on this page: https://ollama.com/download/windows (it ships as OllamaSetup.exe). Ollama for macOS is there too: https://ollama.com/download/mac

Download, List and Remove Ollama models

To download Ollama models, go to the Ollama Library (https://ollama.com/library) and find the model you need; model tags and sizes are also listed there.

Then run:

ollama pull gemma2:latest

# Or get a slightly smarter one that still fits nicely into 16GB VRAM:
ollama pull gemma2:27b-instruct-q3_K_S

# Or: 
ollama pull llama3.1:latest
ollama pull llama3.1:8b-instruct-q8_0
ollama pull mistral-nemo:12b-instruct-2407-q6_K
ollama pull mistral-small:22b-instruct-2409-q4_0
ollama pull phi3:14b-medium-128k-instruct-q6_K
ollama pull qwen2.5:14b-instruct-q5_0
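If you pull several models regularly, a small loop saves typing. A minimal sketch, with a hypothetical model list and echo as a dry run:

```shell
# Dry-run batch pull; the model list is an example, edit to taste.
MODELS="gemma2:latest llama3.1:8b-instruct-q8_0 qwen2.5:14b-instruct-q5_0"
for m in $MODELS; do
  echo ollama pull "$m"   # remove "echo" to actually download
done
```

With the echo removed, each model is pulled one after another into the configured model directory.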

To list the models in Ollama's local repository:

ollama list

To remove an unneeded model:

ollama rm qwen2:7b-instruct-q8_0 # for example

Ollama Model location

By default the model files are stored in:

  • Windows: C:\Users\%username%\.ollama\models
  • Linux: /usr/share/ollama/.ollama/models
  • macOS: ~/.ollama/models
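In other words, OLLAMA_MODELS overrides the per-OS default when set. A minimal sketch of the lookup, assuming the Linux default path:

```shell
# Resolve the model directory: OLLAMA_MODELS wins when set,
# otherwise fall back to the Linux default location.
ollama_models_dir() {
  printf '%s\n' "${OLLAMA_MODELS:-/usr/share/ollama/.ollama/models}"
}
```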

The same storage concern shows up when Ollama runs in containers: you want models and server state on a named volume or bind mount (and optionally OLLAMA_MODELS when the layout should differ from the default /root/.ollama path in the official image). For a full Compose layout with GPU reservation, OLLAMA_HOST, upgrades, and rollback patterns, see Ollama in Docker Compose with GPU and Persistent Model Storage.

Configuring Ollama models path on Windows

To create an environment variable on Windows, follow these steps:

  • Open Windows Settings.
  • Go to System.
  • Select About.
  • Select Advanced System Settings.
  • Go to the Advanced tab.
  • Select Environment Variables….
  • Click New….
  • Create a variable called OLLAMA_MODELS pointing to where you want to store the models.

Move Ollama models on Linux

Edit the ollama systemd service parameters

sudo systemctl edit ollama.service

or

sudo xed /etc/systemd/system/ollama.service

This will open an editor (note that systemctl edit creates an override file rather than modifying the unit file directly).

For each environment variable, add an Environment= line under the [Service] section:

[Service]
Environment="OLLAMA_MODELS=/specialplace/ollama/models"

Save and exit.

There are also User and Group parameters in the service file; that user and group must have access to the new models folder.
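Before restarting, the already-downloaded files can be moved to the new location. A minimal sketch with a hypothetical migrate_models helper (stop the service first with sudo systemctl stop ollama, and chown -R the destination to the service user afterwards):

```shell
# Move an existing models directory to a new location.
# src and dst are examples, e.g. /usr/share/ollama/.ollama/models
# and /specialplace/ollama/models.
migrate_models() {
  src="$1"
  dst="$2"
  mkdir -p "$(dirname "$dst")" || return 1  # make sure the parent exists
  mv "$src" "$dst"
}
```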

Reload systemd and restart Ollama:

sudo systemctl daemon-reload
sudo systemctl restart ollama

If something went wrong, check the service status and logs:

systemctl status ollama.service
sudo journalctl -u ollama.service

Overhead of storing models on NTFS

Be aware that if you are running Linux and keep your models on an NTFS-formatted partition, they will load much slower: more than 20% slower.

(Screenshot: ntfs.mount CPU usage)
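To check which filesystem backs a directory before pointing OLLAMA_MODELS at it, a small helper can be sketched with df (the path you pass is up to you; NTFS mounted via ntfs-3g typically shows up as fuseblk):

```shell
# Print the filesystem type backing the given directory
# (watch for "fuseblk"/"ntfs" when models feel slow to load).
fs_type() {
  df -PT "$1" | awk 'NR==2 {print $2}'
}
```

Usage: fs_type /specialplace/ollama/models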

Install Ollama on Windows into a specific folder

To install Ollama into a specific folder, together with the models, pass the /DIR parameter to the installer:

.\OllamaSetup.exe /DIR=D:\OllamaDir

Expose Ollama API to internal network

Internal here means local network.

Add to service config:

[Service]
Environment="OLLAMA_HOST=0.0.0.0"

That exposes the plain HTTP API on your network. For automated HTTPS, correct streaming and WebSocket proxying through Caddy or Nginx, and edge hardening (timeouts, buffering, optional auth in front of the API), use Ollama behind a reverse proxy with Caddy or Nginx for HTTPS streaming.

To reach Ollama from remote devices without opening public ports, use a private overlay (Tailscale) or WireGuard; see Remote Ollama access via Tailscale or WireGuard, no public ports.

To see how Ollama fits with other local and cloud LLM options, check our LLM Hosting: Local, Self-Hosted & Cloud Infrastructure Compared guide.