AI/Ollama

From Chorke Wiki
<syntaxhighlight lang='bash'>
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gpt-oss:20b
ollama --version
ollama ls

curl -fsSL https://claude.ai/install.sh  | bash
ollama launch claude --model gpt-oss:20b
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''

export OLLAMA_NUM_CTX=32768
export OLLAMA_KEEP_ALIVE=5m
claude --model gpt-oss:20b
</syntaxhighlight>
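The three exports above persist for the whole shell session; a minimal sketch of a wrapper (the function name <code>claude_local</code> is hypothetical) that scopes them to a single invocation:
<syntaxhighlight lang='bash'>
# Hypothetical helper: scope the Anthropic-compatible endpoint
# settings to one Claude Code invocation instead of exporting
# them into the whole shell session.
claude_local() {
    ANTHROPIC_BASE_URL='http://localhost:11434' \
    ANTHROPIC_AUTH_TOKEN='ollama' \
    ANTHROPIC_API_KEY='' \
    claude --model "${1:-gpt-oss:20b}"
}
</syntaxhighlight>
Then <code>claude_local</code> uses the default model, while <code>claude_local qwen3-coder:30b</code> picks another per call.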
|-
|valign='top'|
<syntaxhighlight lang='bash'>
sudo usermod -aG render,video ${USER}
echo 'id -nG'|sudo -i -u ${USER} bash
newgrp render
</syntaxhighlight>
|valign='top'|
|-
|valign='top' colspan='2'|
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'|
[https://editor.plantuml.com/uml/PLH1Rjim4Bpp5GlraXmSE-dDXoA9ZQH1tDgmt085Ug6fjKs48WNkoea_q1VqXNr9LvJiA30FXYdEpixCKhebrf7hqYbTS_1rkSQewCNwIaTTmbgRbsqCjIycmOK8iZXgJtAAddl0AMvqxVWXUFwkIso0Nz7jaQtHFPWs76BE1vSGQUbNf51Y3xGGHTROIdjUwAAmVWitezxvVNWxFZ-kwv5IR5be9qxN1SA_FtzX5ensy92dpTmvNUhqzt6n6ippV7mpMYkL544snJswK67Cu6nfWWBtIgt36sGjQGSl9GHOJ3BuDBFR7T-x6bLJKRqxWAnJD0dDp-mfIuGcFJm7TvgjGpeGOvaru0qfz0L8_qx4SorZyiX8uhSMN8IBvyzqAK9-eNFXzPouyvzZt5k3MMihf8Vdu5v2sIc3KWouTo4IINiq-DAvWJ4iQq-GTTl189N0jk9184figZjBPHjA16F8ZTmfY5W5iZAE0nYvH_Z6gaC8LQz9aqwBy4mmX4VBiBFK5AeF_AqzgLZCynKCzzV34edqLcAxkFEyYw6o1kuMJvVA1qQ8pH05hGykw0AiDqwu2DfHG1L3MJ7yzbeoS6bGfXdgKQpg-gNUhNNfFMkeNh6X9g7a8FVl1_nStdrJ7IQvEsbzyXkKbyjWqy-8NhddLrTN9tO1zxB8EQAyRmKiaQhW2U5Y5LxGq-M79BfXZM5ggNBw89xghUJ0Kjbws6alUwLkPS3oEVWF Diagram]
|-
|valign='top'|
<syntaxhighlight lang='md'>
@startuml
autonumber
skinparam backgroundColor    transparent
skinparam DefaultFontName    Helvetica
skinparam actorStyle         awesome
skinparam ParticipantPadding 20
skinparam BoxPadding         10

title Claude ↔ Local FS ↔ Ollama ↔ GPT-OSS:20b

actor "Developer"                  as dev

box "Local Development PC" #LightBlue
    participant "Claude Code CLI"  as claude
    participant "Local Filesystem" as fs
end box

box "Kubernetes Cluster (K3s)" #Yellow
    participant "Ollama Service"   as ollama
    participant "GPT-OSS:20b"      as model
end box

dev      -> claude : Runs "claude --model gpt-oss:20b"
claude   -> fs     : Scans repository context
fs      --> claude : File contents / Git history

claude   -> ollama : POST /v1/messages (Anthropic API)
note right: Payload includes system prompt \nand local code context

ollama   -> model  : Load weights into GPU VRAM
model   --> ollama : Inference processing...

ollama -->> claude : Streamed Response (Tokens)
claude   -> dev    : Displays suggested code changes
@enduml
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'|
[https://editor.plantuml.com/uml/jPHHRzCm4CVV-odyW4INh0rCsWCyRRGTJ0oeMYh44tBiI-lLzaK-frgryjrnudHhko3nGY8bKkx_y_tFT_4LUUuySUrxj5Qsv8uRbdEnNZgih1oXHiUyuvP22EoXIaB1A-rlqFglt00pQ36e1CZUO-_n-w2N9acM9Apxsh4PecStIWERRyc3wQKrK97BD0gkirpP0tp79WyWAi_p07nGbZi5590KjUQ6TwVu5cDpS1jmh3_tt4gkqSBPSp9QXSAokCe-tpDP1udrPq1OEG5KWu4S7kixKZPWBKwL_0NoulBoxRiN_gPERRW7zXcs9z9XMUN7qYhNIh0dQP8ObA2fmyASRq2oBqsuqQQvnfoohXtw0MgvyiIIfByi_G29tb-yoS-I98A6MrK0-LDu5vTd1ZoNtFEudU15HaqlPVkExdPjXsNJ2IKE6pUa-EcaX5Q_3nPXJ6AE5Hh8NYtcazbnvJks2EBNn3u50Hih1yAZsxRjwfgeG-WlivKApIjvvEs97CNG2El7tUq_piKuRDQnk_yp7y5AzDvXvT1vXn-1jb3BuJsXFIMlfxTiCRixPslOqZah7Tu7a0N2muDlQ0CcBDGvQrDmf49VhMTH5OsQxQ1TwBGvFyC_5uwP3MWiJOqLg2Mu9i--lAuyhIqgGQWYR0g1zyekOqj7TzUByMGOXki5-B5I4dB4TKZacAXSVUBjg-pz3lSLM5aVbdy0 Structure]
|-
|valign='top'|
<syntaxhighlight lang='md'>
@startsalt
skinparam backgroundColor transparent
skinparam defaultFontName monospaced
{
{T-
+**/**                      | Root File System
++**/usr/local/bin/**        | Executable Binaries
+++ollama                    | Ollama Server (Standalone)
++**/usr/share/ollama/**    | Ollama Shares (Resources)
+++**.ollama/**              | Ollama Shares
++++id_ed25519              | Private Key
++++id_ed25519.pub          | Public  Key
++++**models/**              | Saved Models
+++++blobs/                  | Weights **(gpt-oss:20b)**
+++++manifests/              | Model metadata
++**/etc/systemd/system/**  | Systemd Services
+++ollama.service            | Systemd service file
++**/home/$USER/**          | User's Home Directory
+++**.local/bin/**          | User's Executable Binaries
++++claude                  | Claude Code CLI
+++**.ollama/**              | Ollama Data Directory
++++id_ed25519              | Private Key
++++id_ed25519.pub          | Public  Key
+++**.claude/**              | Claude Code Data Directory
++++config.json              | API URL, keys, project context
++++memory/                  | Persistent memory
+++**my-project/**          | Your development folder
++++.claude/                | Project specific settings
++++CLAUDE.md                | Guidebook for current project
}
}
@endsalt
</syntaxhighlight>
|}
|}
==Optimization==
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left'|
Optimization
|-
|valign='top'|
{|class='wikitable mw-collapsible'
!scope='col'| Variable
!scope='col'| Value
!scope='col'| Impact
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_FLASH_ATTENTION</code>
| <code>1</code>                          || Reduces memory usage and speeds up processing for long code files. Highly recommended for coding.
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_KV_CACHE_TYPE</code>
| <code>q8_0</code> or <code>q4_0</code>  || Compresses the '''KV cache''' (the model's short-term memory). <code>q8_0</code> saves space with almost no quality loss; <code>q4_0</code> saves even more space.
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_NUM_PARALLEL</code>
| <code>1</code>                          || '''Crucial for 32GB RAM.''' Limits Ollama to one task at a time to prevent '''Out of Memory''' crashes when using a 20B model.
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_KEEP_ALIVE</code>
| <code>30m</code>                        || Keeps the <code>20B</code> model in your RAM for <code>30</code> minutes after use so you don't have to wait <code>20</code> seconds for it to '''reload''' every time.
|-
!scope='row' style='text-align:left'      | <code>OLLAMA_NUM_CTX</code>
| <code>16384</code> to <code>32768</code> || '''The most important.''' Controls the '''context window''' (the model's working memory). <code>32k</code> is standard for Claude Code but uses <code>~3GB</code> more RAM than the default <code>4k</code>.
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_NUM_GPU</code>
| <code>999</code>                        || Forces Ollama to offload as many layers as possible to your Intel Arc iGPU instead of the slower CPU.
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_NUM_GPU</code>
| <code>0</code>                          || Forces Ollama CPU-only for all models
|-
!scope='row' style='text-align:left'        | <code>CUDA_VISIBLE_DEVICES</code>
| <code>-1</code>                          || Hides any GPU from the Ollama process
|-
!scope='row' style='text-align:left'        | <code>ROCR_VISIBLE_DEVICES</code>
| <code>-1</code>                          || Hides any GPU from the Ollama process
|-
!scope='row' style='text-align:left'        | <code>OLLAMA_INTEL_GPU</code>
| <code>1</code>                          || Forces Ollama to use Intel Arc GPU
|}
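Rather than rewriting the whole unit file, the variables above can be applied through a systemd drop-in; a sketch that stages the file locally first (the drop-in name <code>tuning.conf</code> and the chosen values here are assumptions, not Ollama defaults):
<syntaxhighlight lang='bash'>
# Stage a systemd drop-in carrying the tuning variables, then
# review it before installing (install/restart need sudo).
stage="$(mktemp)"
cat > "${stage}" <<'INI'
[Service]
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_KV_CACHE_TYPE=q8_0"
Environment="OLLAMA_NUM_PARALLEL=1"
Environment="OLLAMA_KEEP_ALIVE=30m"
INI
cat "${stage}"
# sudo install -D -m 0644 "${stage}" /etc/systemd/system/ollama.service.d/tuning.conf
# sudo systemctl daemon-reload && sudo systemctl restart ollama.service
</syntaxhighlight>
A drop-in survives reinstalls that overwrite <code>ollama.service</code> itself, which is why it is usually preferred over editing the unit in place.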
|}
==Installation==
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left'|
Installation
|-
|valign='top'|
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'|
Yoga Pro 7i (G9 + U7 155H + 32GB + 1TB)
|-
|valign='top'|
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Install
|-
|valign='top'|
<syntaxhighlight lang='bash' highlight='4,16'>
curl -fsSL https://ollama.com/install.sh | sh
:'
>>> Installing ollama to /usr/local
[sudo] password for shahed:
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
WARNING: No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode.
'
</syntaxhighlight>
<syntaxhighlight lang='bash'>
sudo systemctl daemon-reload
sudo systemctl restart  ollama.service
sudo systemctl status  ollama.service
cat /etc/systemd/system/ollama.service
setsid open http://127.0.0.1:11434 >/dev/null 2>&1 &
</syntaxhighlight>
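Once the service is up, the API can be probed without a browser; a small sketch using Ollama's native <code>/api/version</code> endpoint (the helper name <code>ollama_health</code> is made up):
<syntaxhighlight lang='bash'>
# Probe the Ollama HTTP API; prints the version JSON on success.
ollama_health() {
    local host="${1:-127.0.0.1:11434}"
    curl -fsS --max-time 5 "http://${host}/api/version" \
        && echo " <- Ollama API is up on ${host}"
}
</syntaxhighlight>
For example, <code>ollama_health</code> checks the local bind, while <code>ollama_health shahed-ap.local:11434</code> checks the same service across the LAN.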
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left' colspan='2'| Ollama » ArcGPU
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
sudo journalctl -u ollama --no-pager | grep -i "sycl\|intel\|gpu"
sudo apt update && sudo apt install intel-gpu-tools
sudo intel_gpu_top
</syntaxhighlight>
|valign='top' style='width:50%'|
|-
|valign='top' colspan='2'|
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get install -y apt-transport-https ca-certificates gnupg build-essential
apt-get install -y software-properties-common git curl file procps libfuse2
apt-get clean;sleep 5
EXE
</syntaxhighlight>
|-
!scope='col'| Intel
!scope='col'| Ubuntu
|-
|valign='top'|
<syntaxhighlight lang='bash'>
curl -fsSL https://repositories.intel.com/gpu/intel-graphics.key \
| sudo tee /etc/apt/keyrings/intel-graphics.asc >/dev/null
DISTRIBUTION=$(. /etc/os-release && echo "${VERSION_CODENAME}")
cat << SRC | sudo tee /etc/apt/sources.list.d/intel-graphics.list >/dev/null
deb [arch=$(dpkg --print-architecture)\
signed-by=/etc/apt/keyrings/intel-graphics.asc]\
https://repositories.intel.com/gpu/ubuntu ${DISTRIBUTION} unified
SRC
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-gpu-tools
apt-get install -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE
</syntaxhighlight>
|valign='top'|
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
rm /etc/apt/sources.list.d/intel-graphics.list
rm /etc/apt/keyrings/intel-graphics.asc
EXE
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get purge  -y intel-gpu-tools
apt-get purge  -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-opencl-icd libze-intel-gpu1 libze1
apt-get install -y intel-media-va-driver-non-free clinfo
apt-get install -y intel-gpu-tools
apt-get clean
EXE
</syntaxhighlight>
|-
|valign='top'|
<syntaxhighlight lang='bash'>
ollama ps
ls /dev/dri
</syntaxhighlight>
|valign='top'|
<syntaxhighlight lang='bash'>
sudo intel_gpu_top
clinfo | grep 'Device Name'
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Config
|-
|valign='top'|
<syntaxhighlight lang='ini'>
cat <<'INI' | sudo tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
User=ollama
Group=ollama
# Optimizations for Yoga Pro 7i (16GB Target)
Environment="OLLAMA_KV_CACHE_TYPE=q4_0"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_NUM_PARALLEL=1"
Environment="OLLAMA_NUM_CTX=16384"
Environment="OLLAMA_KEEP_ALIVE=5m"
Environment="OLLAMA_NUM_GPU=999"
# Intel optimizations if available
# Environment="OLLAMA_INTEL_GPU=1"
Environment="PATH=/home/shahed/.nvm/versions/node/v18.20.7/bin:/home/shahed/.pyenv/plugins/pyenv-virtualenv/shims:/home/shahed/.pyenv/shims:/home/shahed/.pyenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/home/shahed/.local/share/JetBrains/Toolbox/scripts"
ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3
[Install]
WantedBy=default.target
INI
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl restart ollama.service
systemctl status  ollama.service
EXE
</syntaxhighlight>
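To confirm which <code>OLLAMA_*</code> settings the installed unit file actually declares, a throwaway helper (the function name is hypothetical) can grep them out:
<syntaxhighlight lang='bash'>
# List the OLLAMA_* variables declared in a systemd unit file.
ollama_env_in_unit() {
    grep -o 'OLLAMA_[A-Z_]*=[^"]*' "${1:?unit file required}"
}
</syntaxhighlight>
Running <code>ollama_env_in_unit /etc/systemd/system/ollama.service</code> should echo the values set above.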
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Model
|-
|valign='top'|
<syntaxhighlight lang='bash'>
ollama pull gpt-oss:20b
:'
pulling manifest
pulling e7b273f96360: 100% ▕███████████████████████████████████████████▏  13 GB
pulling fa6710a93d78: 100% ▕███████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕███████████████████████████████████████████▏  11 KB
pulling d8ba2f9a17b3: 100% ▕███████████████████████████████████████████▏  18 B
pulling 776beb3adb23: 100% ▕███████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success
'
ollama ls
</syntaxhighlight>
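After the pull, the model can be smoke-tested without starting Claude; a sketch against Ollama's native <code>/api/generate</code> endpoint (the helper name <code>ollama_smoke</code> is made up):
<syntaxhighlight lang='bash'>
# One-shot, non-streaming generation request to a pulled model.
ollama_smoke() {
    curl -fsS http://127.0.0.1:11434/api/generate \
        -d "{\"model\":\"${1:?model required}\",\"prompt\":\"Say OK\",\"stream\":false}"
}
</syntaxhighlight>
For example, <code>ollama_smoke gpt-oss:20b</code> returns a JSON body whose <code>response</code> field holds the reply.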
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Claude » Install
|-
|valign='top'|
<syntaxhighlight lang='bash'>
curl -fsSL https://claude.ai/install.sh  | bash
:'
Setting up Claude Code...
✔ Claude Code successfully installed!
  Version: 2.1.63
  Location: ~/.local/bin/claude
  Next: Run claude --help to get started
⚠ Setup notes:
  • Native installation exists but ~/.local/bin is not in your PATH. Run:
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
✅ Installation complete!
'
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'ENV' | tee -a ${HOME}/.bashrc >/dev/null
# claude config
export PATH="$HOME/.local/bin:$PATH"
ENV
source ~/.bashrc && sleep 1
claude --version
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left' colspan='2'| Claude » Config
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://shahed-ap.local:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|-
|valign='top'|
<syntaxhighlight lang='bash'>
claude --model gpt-oss:20b
ollama stop    gpt-oss:20b
ollama ps
</syntaxhighlight>
|valign='top'|
|}
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'|
Legion 5 Pro (G6 + R7 5800H + 32GB + 1TB)
|-
|valign='top'|
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Install
|-
|valign='top'|
<syntaxhighlight lang='bash' highlight='4,16'>
curl -fsSL https://ollama.com/install.sh | sh
:'
>>> Installing ollama to /usr/local
[sudo] password for shahed:
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA GPU installed.
'
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl enable  ollama.service
systemctl restart ollama.service
systemctl status  ollama.service
EXE
cat /etc/systemd/system/ollama.service
setsid open http://127.0.0.1:11434 >/dev/null 2>&1 &
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left' colspan='2'| Ollama » Utility
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
sudo journalctl -u ollama --no-pager | grep -i "sycl\|intel\|gpu"
</syntaxhighlight>
|valign='top' style='width:50%'|
|-
|valign='top' colspan='2'|
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get install -y apt-transport-https ca-certificates gnupg build-essential
apt-get install -y software-properties-common git curl file procps libfuse2t64
apt-get clean;sleep 5
EXE
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Config
|-
|valign='top'|
<syntaxhighlight lang='ini'>
cat <<'INI' | sudo tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
User=ollama
Group=ollama
# Ollama listens on any host
Environment="OLLAMA_HOST=0.0.0.0"
Environment="PATH=/home/linuxbrew/.linuxbrew/Cellar/pyenv-virtualenv/1.2.3/shims:/home/shahed/.pyenv/shims:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/shahed/.rbenv/shims:/home/shahed/.rbenv/bin:/home/shahed/.cargo/bin:/home/shahed/.nvm/versions/node/v18.12.1/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/cli/sencha/cmd:/opt/cli/sencha/cmd:/home/shahed/.local/share/JetBrains/Toolbox/scripts:/home/shahed/.local/share/JetBrains/Toolbox/scripts"
ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3
[Install]
WantedBy=default.target
INI
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl restart ollama.service
systemctl status  ollama.service
EXE
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
ufw allow from 10.19.80.0/22  to any port 11434 proto tcp
ufw allow from 10.20.0.0/24    to any port 11434 proto tcp
ufw allow from 10.20.13.0/24  to any port 11434 proto tcp
ufw allow from 10.20.40.0/24  to any port 11434 proto tcp
ufw allow from 192.168.49.2/32 to any port 11434 proto tcp
EXE
telnet 10.20.40.16    11434
telnet 192.168.49.2    11434
telnet shahed-ac.local 11434
</syntaxhighlight>
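Where <code>telnet</code> is not installed, bash's built-in <code>/dev/tcp</code> redirection gives the same reachability check (the hosts are this wiki's examples; <code>port_open</code> is a made-up name):
<syntaxhighlight lang='bash'>
# Return success when a TCP connect to host:port succeeds.
port_open() { timeout 3 bash -c ">/dev/tcp/${1}/${2}" 2>/dev/null; }

for host in 10.20.40.16 192.168.49.2 shahed-ac.local; do
    port_open "${host}" 11434 && echo "${host}:11434 open" || echo "${host}:11434 closed"
done
</syntaxhighlight>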
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Model
|-
|valign='top'|
<syntaxhighlight lang='bash'>
ollama pull gpt-oss:20b
:'
pulling manifest
pulling e7b273f96360: 100% ▕███████████████████████████████████████████▏  13 GB
pulling fa6710a93d78: 100% ▕███████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕███████████████████████████████████████████▏  11 KB
pulling d8ba2f9a17b3: 100% ▕███████████████████████████████████████████▏  18 B
pulling 776beb3adb23: 100% ▕███████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success
'
ollama ls
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Claude » Install
|-
|valign='top'|
<syntaxhighlight lang='bash'>
curl -fsSL https://claude.ai/install.sh  | bash
:'
Setting up Claude Code...
✔ Claude Code successfully installed!
  Version: 2.1.71
  Location: ~/.local/bin/claude
  Next: Run claude --help to get started
⚠ Setup notes:
  • Native installation exists but ~/.local/bin is not in your PATH. Run:
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
✅ Installation complete!
'
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'ENV' | tee -a ${HOME}/.bashrc >/dev/null
# claude config
export PATH="$HOME/.local/bin:$PATH"
ENV
source ~/.bashrc && sleep 1
claude --version
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left' colspan='2'| Claude » Config
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://shahed-ac.local:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|-
|valign='top'|
<syntaxhighlight lang='bash'>
claude --model gpt-oss:20b
ollama stop    gpt-oss:20b
ollama ps
</syntaxhighlight>
|valign='top'|
|}
|}
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left'|
ASUS NUC 15 Pro (L10 + U7 255H + 64GB + 2TB)
|-
|valign='top'|
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Install
|-
|valign='top'|
<syntaxhighlight lang='bash' highlight='4,16'>
curl -fsSL https://ollama.com/install.sh | sh
:'
>>> Installing ollama to /usr/local
[sudo] password for shahed:
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
WARNING: systemd is not running
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
WARNING: No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
'
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl enable  ollama.service
systemctl restart ollama.service
systemctl status  ollama.service
EXE
cat /etc/systemd/system/ollama.service
setsid open http://127.0.0.1:11434 >/dev/null 2>&1 &
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left' colspan='2'| Ollama » ArcGPU
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
sudo journalctl -u ollama --no-pager | grep -i "sycl\|intel\|gpu"
sudo intel_gpu_top
</syntaxhighlight>
|valign='top' style='width:50%'|
|-
|valign='top' colspan='2'|
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get install -y apt-transport-https ca-certificates gnupg build-essential
apt-get install -y software-properties-common git curl file procps libfuse2t64
apt-get clean;sleep 5
EXE
</syntaxhighlight>
|-
!scope='col'| Intel
!scope='col'| Ubuntu
|-
|valign='top'|
<syntaxhighlight lang='bash'>
curl -fsSL https://repositories.intel.com/gpu/intel-graphics.key \
| sudo tee /etc/apt/keyrings/intel-graphics.asc >/dev/null
DISTRIBUTION=$(. /etc/os-release && echo "${VERSION_CODENAME}")
cat << SRC | sudo tee /etc/apt/sources.list.d/intel-graphics.list >/dev/null
deb [arch=$(dpkg --print-architecture)\
signed-by=/etc/apt/keyrings/intel-graphics.asc]\
https://repositories.intel.com/gpu/ubuntu ${DISTRIBUTION} unified
SRC
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-gpu-tools
apt-get install -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE
</syntaxhighlight>
|valign='top'|
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
rm /etc/apt/sources.list.d/intel-graphics.list
rm /etc/apt/keyrings/intel-graphics.asc
EXE
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get purge  -y intel-gpu-tools
apt-get purge  -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-opencl-icd libze-intel-gpu1 libze1
apt-get install -y intel-media-va-driver-non-free clinfo
apt-get install -y intel-gpu-tools
apt-get clean
EXE
</syntaxhighlight>
|-
|valign='top'|
<syntaxhighlight lang='bash'>
ollama ps
ls /dev/dri
</syntaxhighlight>
|valign='top'|
<syntaxhighlight lang='bash'>
sudo intel_gpu_top
clinfo | grep 'Device Name'
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Ollama » Config
|-
|valign='top'|
<syntaxhighlight lang='ini'>
cat <<'INI' | sudo tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
User=ollama
Group=ollama
# Optimizations for ASUS NUC 15 Pro (32GB Target)
Environment="OLLAMA_NUM_CTX=32768"
Environment="OLLAMA_NUM_GPU=999"
Environment="OLLAMA_VULKAN=1"
# Intel optimizations if available
Environment="OLLAMA_INTEL_GPU=1"
# Ollama listens on any host
Environment="OLLAMA_HOST=0.0.0.0"
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3
[Install]
WantedBy=default.target
INI
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl restart ollama.service
systemctl status  ollama.service
EXE
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'EXE' | sudo bash
ufw allow from 10.19.80.0/22  to any port 11434 proto tcp
ufw allow from 10.20.0.0/24    to any port 11434 proto tcp
ufw allow from 10.20.13.0/24  to any port 11434 proto tcp
ufw allow from 10.20.40.0/24  to any port 11434 proto tcp
ufw allow from 192.168.49.2/32 to any port 11434 proto tcp
EXE
telnet 10.20.40.16    11434
telnet 192.168.49.2    11434
telnet shahed-ap.local 11434
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left' colspan='2'| Ollama » Model
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
ollama pull gpt-oss:20b
:'
pulling manifest
pulling e7b273f96360: 100% ▕███████████████████████████████████████████▏  13 GB
pulling fa6710a93d78: 100% ▕███████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕███████████████████████████████████████████▏  11 KB
pulling d8ba2f9a17b3: 100% ▕███████████████████████████████████████████▏  18 B
pulling 776beb3adb23: 100% ▕███████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success
'
ollama ls
</syntaxhighlight>
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
ollama pull qwen3-coder:30b
:'
pulling manifest
pulling 1194192cf2a1: 100% ▕███████████████████████████████████████████▏  18 GB
pulling d18a5cc71b84: 100% ▕███████████████████████████████████████████▏  11 KB
pulling 69aa441ea44f: 100% ▕███████████████████████████████████████████▏  148 B
pulling 24a94682582c: 100% ▕███████████████████████████████████████████▏  542 B
verifying sha256 digest
writing manifest
success
'
ollama ls
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible mw-collapsed'
!scope='col' style='text-align:left'| Claude » Install
|-
|valign='top'|
<syntaxhighlight lang='bash'>
curl -fsSL https://claude.ai/install.sh  | bash
:'
Setting up Claude Code...
✔ Claude Code successfully installed!
  Version: 2.1.71
  Location: ~/.local/bin/claude
  Next: Run claude --help to get started
⚠ Setup notes:
  • Native installation exists but ~/.local/bin is not in your PATH. Run:
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
✅ Installation complete!
'
</syntaxhighlight>
<syntaxhighlight lang='bash'>
cat <<'ENV' | tee -a ${HOME}/.bashrc >/dev/null
# claude config
export PATH="$HOME/.local/bin:$PATH"
ENV
source ~/.bashrc && sleep 1
claude --version
</syntaxhighlight>
|}
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left' colspan='2'| Claude » Config
|-
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|valign='top' style='width:50%'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://shahed-ap.local:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|-
|valign='top'|
<syntaxhighlight lang='bash'>
export ANTHROPIC_BASE_URL=http://10.20.40.16:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
</syntaxhighlight>
|valign='top'|
|-
|valign='top'|
<syntaxhighlight lang='bash'>
claude --model qwen3-coder:30b
claude --model gpt-oss:20b
</syntaxhighlight>
|valign='top'|
<syntaxhighlight lang='bash'>
ollama stop qwen3-coder:30b
ollama stop gpt-oss:20b
</syntaxhighlight>
|}
|}
|}
|}
|}
* [https://docs.ollama.com/integrations/claude-code AI » Ollama » Claude Code]
* [https://ollama.com/search AI » Ollama » Search]
* [https://github.com/exo-explore/exo AI » Cluster » Exo]
* [https://developers.openai.com/codex/cli/ AI » Codex CLI]
* [https://ollama.com/ AI » Ollama]

|-
|valign='top'|
* [https://ollama.com/library/lfm2.5-thinking AI » Model » <code>lfm2.5-thinking</code>]
* [https://ollama.com/library/lfm2 AI » Model » <code>lfm2</code>]
* [https://ollama.com/library/translategemma AI » Model » <code>translategemma</code>]
* [https://ollama.com/library/minimax-m2.5 AI » Model » <code>minimax-m2.5</code>]
* [https://ollama.com/library/ministral-3 AI » Model » <code>ministral-3</code>]
* [https://ollama.com/library/qwen3.5 AI » Model » <code>qwen3.5</code>]
* [https://ollama.com/library/granite4 AI » Model » <code>granite4</code>]
* [https://ollama.com/library/gpt-oss AI » Model » <code>gpt-oss</code>]
* [https://ollama.com/library/glm-4.7 AI » Model » <code>glm-4.7</code>]
* [https://ollama.com/library/glm-ocr AI » Model » <code>glm-ocr</code>]
* [https://ollama.com/library/glm-5 AI » Model » <code>glm-5</code>]

|valign='top'|
* [https://ollama.com/library/qwen3-coder-next AI » Model » <code>qwen3-coder-next</code>]
* [https://ollama.com/library/qwen3-embedding AI » Model » <code>qwen3-embedding</code>]
* [https://ollama.com/library/glm-4.7-flash AI » Model » <code>glm-4.7-flash</code>]
* [https://ollama.com/library/qwen3-coder AI » Model » <code>qwen3-coder</code>]
* [https://ollama.com/library/qwen3-vl AI » Model » <code>qwen3-vl</code>]
* [https://ollama.com/library/gpt-oss:120b AI » Model » <code>gpt-oss:120b</code>]
* [https://ollama.com/library/gpt-oss:20b AI » Model » <code>gpt-oss:20b</code>]

|valign='top'|
|-
|valign='top'|

Latest revision as of 04:14, 9 March 2026

curl -fsSL https://ollama.com/install.sh | sh
ollama pull gpt-oss:20b
ollama --version
ollama ls

curl -fsSL https://claude.ai/install.sh  | bash
ollama launch claude --model gpt-oss:20b
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''

export OLLAMA_NUM_CTX=32768
export OLLAMA_KEEP_ALIVE=5m
claude --model gpt-oss:20b
sudo usermod -aG render,video ${USER}
echo 'id -nG'|sudo -i -u ${USER} bash
newgrp render

Diagram

@startuml
autonumber
skinparam backgroundColor    transparent
skinparam DefaultFontName    Helvetica
skinparam actorStyle         awesome
skinparam ParticipantPadding 20
skinparam BoxPadding         10

title Claude ↔ Local FS ↔ Ollama ↔ GPT-OSS:20b

actor "Developer"                  as dev

box "Local Development PC" #LightBlue
    participant "Claude Code CLI"  as claude
    participant "Local Filesystem" as fs
end box

box "Kubernetes Cluster (K3s)" #Yellow
    participant "Ollama Service"   as ollama
    participant "GPT-OSS:20b"      as model
end box

dev      -> claude : Runs "claude --model gpt-oss:20b"
claude   -> fs     : Scans repository context
fs      --> claude : File contents / Git history

claude   -> ollama : POST /v1/messages (Anthropic API)
note right: Payload includes system prompt \nand local code context

ollama   -> model  : Load weights into GPU VRAM
model   --> ollama : Inference processing...

ollama -->> claude : Streamed Response (Tokens)
claude   -> dev    : Displays suggested code changes

@enduml
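The request the diagram shows can be exercised directly from a shell. A minimal sketch, assuming Ollama's Anthropic-compatible <code>/v1/messages</code> endpoint referenced in the diagram; the <code>curl</code> line is left commented so the snippet also runs without a live server:

```shell
# Build the Anthropic-style request body the diagram describes
payload='{"model":"gpt-oss:20b","max_tokens":64,"messages":[{"role":"user","content":"hello"}]}'

# Sanity-check that the body is valid JSON before sending it
echo "$payload" | python3 -m json.tool >/dev/null && echo "payload ok"

# With the Ollama service running, send it (uncomment to use):
# curl -s http://localhost:11434/v1/messages \
#   -H 'content-type: application/json' \
#   -H 'x-api-key: ollama' \
#   -d "$payload"
```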

Structure

@startsalt
skinparam backgroundColor transparent
skinparam defaultFontName monospaced
{
{T-
+**/**                       | Root File System
++**/usr/local/bin/**        | Executable Binaries
+++ollama                    | Ollama Server (Standalone)
++**/usr/share/ollama/**     | Ollama Shares (Resources)
+++**.ollama/**              | Ollama Shares
++++id_ed25519               | Private Key
++++id_ed25519.pub           | Public  Key
++++**models/**              | Saved Models
+++++blobs/                  | Weights **(gpt-oss:20b)**
+++++manifests/              | Model metadata
++**/etc/systemd/system/**   | Systemd Services
+++ollama.service            | Systemd service file
++**/home/$USER/**           | User's Home Directory
+++**.local/bin/**           | User's Executable Binaries
++++claude                   | Claude Code CLI
+++**.ollama/**              | Ollama Data Directory
++++id_ed25519               | Private Key
++++id_ed25519.pub           | Public  Key
+++**.claude/**              | Claude Code Data Directory
++++config.json              | API URL, keys, project context
++++memory/                  | Persistent memory
+++**my-project/**           | Your development folder
++++.claude/                 | Project specific settings
++++CLAUDE.md                | Guidebook for current project
}
}
@endsalt
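The tree above can be spot-checked from a shell. A hedged sketch (paths vary per install, and <code>check_path</code> is a helper name of our own):

```shell
# Report whether each path from the tree above exists on this machine
check_path() { [ -e "$1" ] && echo "present: $1" || echo "missing: $1"; }

check_path /usr/local/bin/ollama              # Ollama server binary
check_path /usr/share/ollama/.ollama/models   # saved model blobs/manifests
check_path "$HOME/.local/bin/claude"          # Claude Code CLI
check_path "$HOME/.claude"                    # Claude Code data directory
```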

Optimization

{|class='wikitable'
!Variable!!Value!!Impact
|-
|OLLAMA_FLASH_ATTENTION||1||Reduces memory usage and speeds up processing for long code files. Highly recommended for coding.
|-
|OLLAMA_KV_CACHE_TYPE||q8_0 or q4_0||Compresses the short-term memory (KV) cache. q8_0 saves space with almost no quality loss; q4_0 saves even more.
|-
|OLLAMA_NUM_PARALLEL||1||Crucial for 32GB RAM. Limits Ollama to one request at a time to prevent out-of-memory crashes when using a 20B model.
|-
|OLLAMA_KEEP_ALIVE||30m||Keeps the 20B model in RAM for 30 minutes after use, so it does not need a ~20-second reload on every request.
|-
|OLLAMA_NUM_CTX||16384 to 32768||The most important setting: controls the context window. 32k is standard for Claude Code but uses ~3GB more RAM than the default 4k.
|-
|OLLAMA_NUM_GPU||999||Forces Ollama to offload as many layers as possible to the Intel Arc iGPU instead of the slower CPU.
|-
|OLLAMA_NUM_GPU||0||Forces CPU-only inference for all models.
|-
|CUDA_VISIBLE_DEVICES||-1||Hides any NVIDIA GPU from the Ollama process.
|-
|ROCR_VISIBLE_DEVICES||-1||Hides any AMD GPU from the Ollama process.
|-
|OLLAMA_INTEL_GPU||1||Forces Ollama to use the Intel Arc GPU.
|}
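For a quick experiment the same knobs can also be set per shell session before starting <code>ollama serve</code> by hand; the systemd unit's <code>Environment=</code> lines are what the managed service itself reads. A sketch using values from the table above:

```shell
# Session-scoped overrides, mirroring the table above
export OLLAMA_FLASH_ATTENTION=1
export OLLAMA_KV_CACHE_TYPE=q8_0
export OLLAMA_NUM_PARALLEL=1
export OLLAMA_KEEP_ALIVE=30m
export OLLAMA_NUM_CTX=32768

# Confirm what a hand-started server process would inherit
env | grep '^OLLAMA_' | sort
```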

Installation

Yoga Pro 7i (G9 + U7 155H + 32GB + 1TB)

Ollama » Install
curl -fsSL https://ollama.com/install.sh | sh
:'
>>> Installing ollama to /usr/local
[sudo] password for shahed: 
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
WARNING: No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode.
'
sudo systemctl daemon-reload 
sudo systemctl restart  ollama.service 
sudo systemctl status   ollama.service

cat /etc/systemd/system/ollama.service
setsid open http://127.0.0.1:11434 >/dev/null 2>&1 &
Ollama » ArcGPU
sudo journalctl -u ollama --no-pager | grep -i "sycl\|intel\|gpu"
sudo apt update && sudo apt install intel-gpu-tools
sudo intel_gpu_top
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get install -y apt-transport-https ca-certificates gnupg build-essential
apt-get install -y software-properties-common git curl file procps libfuse2
apt-get clean;sleep 5
EXE
Intel Ubuntu
curl -fsSL https://repositories.intel.com/gpu/intel-graphics.key \
 | sudo tee /etc/apt/keyrings/intel-graphics.asc >/dev/null



DISTRIBUTION=$(. /etc/os-release && echo "${VERSION_CODENAME}")
cat << SRC | sudo tee /etc/apt/sources.list.d/intel-graphics.list >/dev/null
deb [arch=$(dpkg --print-architecture)\
 signed-by=/etc/apt/keyrings/intel-graphics.asc]\
 https://repositories.intel.com/gpu/ubuntu ${DISTRIBUTION} unified
SRC



cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-gpu-tools
apt-get install -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE
cat <<'EXE' | sudo bash
rm /etc/apt/sources.list.d/intel-graphics.list
rm /etc/apt/keyrings/intel-graphics.asc
EXE

cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get purge   -y intel-gpu-tools
apt-get purge   -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE

cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-opencl-icd libze-intel-gpu1 libze1
apt-get install -y intel-media-va-driver-non-free clinfo
apt-get install -y intel-gpu-tools
apt-get clean
EXE
ollama ps
ls /dev/dri
sudo intel_gpu_top
clinfo | grep 'Device Name'
Ollama » Config
cat <<'INI' | sudo tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
User=ollama
Group=ollama

# Optimizations for Yoga Pro 7i (16GB Target)
Environment="OLLAMA_KV_CACHE_TYPE=q4_0"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_NUM_PARALLEL=1"
Environment="OLLAMA_NUM_CTX=16384"
Environment="OLLAMA_KEEP_ALIVE=5m"
Environment="OLLAMA_NUM_GPU=999"

# Intel optimizations if available
# Environment="OLLAMA_INTEL_GPU=1"

Environment="PATH=/home/shahed/.nvm/versions/node/v18.20.7/bin:/home/shahed/.pyenv/plugins/pyenv-virtualenv/shims:/home/shahed/.pyenv/shims:/home/shahed/.pyenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/home/shahed/.local/share/JetBrains/Toolbox/scripts"

ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
INI
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl restart ollama.service
systemctl status  ollama.service
EXE
Ollama » Model
ollama pull gpt-oss:20b
:'
pulling manifest 
pulling e7b273f96360: 100% ▕███████████████████████████████████████████▏  13 GB
pulling fa6710a93d78: 100% ▕███████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕███████████████████████████████████████████▏  11 KB
pulling d8ba2f9a17b3: 100% ▕███████████████████████████████████████████▏   18 B
pulling 776beb3adb23: 100% ▕███████████████████████████████████████████▏  489 B
verifying sha256 digest 
writing manifest 
success 
'
ollama ls
Claude » Install
curl -fsSL https://claude.ai/install.sh  | bash
:'
Setting up Claude Code...

✔ Claude Code successfully installed!
  Version: 2.1.63
  Location: ~/.local/bin/claude
  Next: Run claude --help to get started

⚠ Setup notes:
  • Native installation exists but ~/.local/bin is not in your PATH. Run:
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
✅ Installation complete!
'
cat <<'ENV' | tee -a ${HOME}/.bashrc >/dev/null

# claude config
export PATH="$HOME/.local/bin:$PATH"
ENV
source ~/.bashrc && sleep 1
claude --version
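The append above runs unconditionally, so repeating the install leaves duplicate lines in <code>~/.bashrc</code>. A guarded variant, demonstrated on a temp file so it is safe to try:

```shell
# Idempotent append: only add the PATH line if it is not already present
rc=$(mktemp)
line='export PATH="$HOME/.local/bin:$PATH"'
for run in 1 2; do
  grep -qxF "$line" "$rc" || echo "$line" >> "$rc"
done
grep -cxF "$line" "$rc"   # prints 1: the second run was a no-op
```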
Claude » Config
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
export ANTHROPIC_BASE_URL=http://shahed-ap.local:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
claude --model gpt-oss:20b
ollama stop    gpt-oss:20b
ollama ps
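The same three exports recur for every host, so they can be wrapped in a small function; <code>use_ollama</code> is a name of our own, not part of Claude Code or Ollama:

```shell
# Point Claude Code at a chosen Ollama host (defaults to localhost)
use_ollama() {
  export ANTHROPIC_BASE_URL="http://${1:-localhost}:11434"
  export ANTHROPIC_AUTH_TOKEN=ollama
  export ANTHROPIC_API_KEY=''
  echo "Claude Code -> ${ANTHROPIC_BASE_URL}"
}

use_ollama                   # local instance
use_ollama shahed-ap.local   # remote instance over the LAN
```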

Legion 5 Pro (G6 + R7 5800H + 32GB + 1TB)

Ollama » Install
curl -fsSL https://ollama.com/install.sh | sh
:'
>>> Installing ollama to /usr/local
[sudo] password for shahed: 
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA GPU installed.
'
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl enable  ollama.service
systemctl restart ollama.service
systemctl status  ollama.service
EXE

cat /etc/systemd/system/ollama.service
setsid open http://127.0.0.1:11434 >/dev/null 2>&1 &
Ollama » Utility
sudo journalctl -u ollama --no-pager | grep -i "sycl\|intel\|gpu"
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get install -y apt-transport-https ca-certificates gnupg build-essential
apt-get install -y software-properties-common git curl file procps libfuse2t64
apt-get clean;sleep 5
EXE
Ollama » Config
cat <<'INI' | sudo tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
User=ollama
Group=ollama

# Ollama listens on any host
Environment="OLLAMA_HOST=0.0.0.0"

Environment="PATH=/home/linuxbrew/.linuxbrew/Cellar/pyenv-virtualenv/1.2.3/shims:/home/shahed/.pyenv/shims:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/shahed/.rbenv/shims:/home/shahed/.rbenv/bin:/home/shahed/.cargo/bin:/home/shahed/.nvm/versions/node/v18.12.1/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/cli/sencha/cmd:/opt/cli/sencha/cmd:/home/shahed/.local/share/JetBrains/Toolbox/scripts:/home/shahed/.local/share/JetBrains/Toolbox/scripts"

ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
INI
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl restart ollama.service
systemctl status  ollama.service
EXE
cat <<'EXE' | sudo bash
ufw allow from 10.19.80.0/22   to any port 11434 proto tcp
ufw allow from 10.20.0.0/24    to any port 11434 proto tcp
ufw allow from 10.20.13.0/24   to any port 11434 proto tcp
ufw allow from 10.20.40.0/24   to any port 11434 proto tcp
ufw allow from 192.168.49.2/32 to any port 11434 proto tcp
EXE

telnet 10.20.40.16     11434
telnet 192.168.49.2    11434
telnet shahed-ac.local 11434
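Where <code>telnet</code> is not installed, bash's <code>/dev/tcp</code> gives an equivalent reachability check; <code>port_open</code> is a helper of our own:

```shell
# Succeed if a TCP connect to host:port completes within 2 seconds
port_open() { timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; }

port_open 127.0.0.1 11434 && echo "Ollama port open" || echo "Ollama port closed"
```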
Ollama » Model
ollama pull gpt-oss:20b
:'
pulling manifest
pulling e7b273f96360: 100% ▕███████████████████████████████████████████▏  13 GB
pulling fa6710a93d78: 100% ▕███████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕███████████████████████████████████████████▏  11 KB
pulling d8ba2f9a17b3: 100% ▕███████████████████████████████████████████▏   18 B
pulling 776beb3adb23: 100% ▕███████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success
'
ollama ls
Claude » Install
curl -fsSL https://claude.ai/install.sh  | bash
:'
Setting up Claude Code...
✔ Claude Code successfully installed!
  Version: 2.1.71
  Location: ~/.local/bin/claude
  Next: Run claude --help to get started
⚠ Setup notes:
  • Native installation exists but ~/.local/bin is not in your PATH. Run:
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
✅ Installation complete!
'
cat <<'ENV' | tee -a ${HOME}/.bashrc >/dev/null

# claude config
export PATH="$HOME/.local/bin:$PATH"
ENV
source ~/.bashrc && sleep 1
claude --version
Claude » Config
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
export ANTHROPIC_BASE_URL=http://shahed-ac.local:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
claude --model gpt-oss:20b
ollama stop    gpt-oss:20b
ollama ps

ASUS NUC 15 Pro (L10 + U7 255H + 64GB + 2TB)

Ollama » Install
curl -fsSL https://ollama.com/install.sh | sh
:'
>>> Installing ollama to /usr/local
[sudo] password for shahed: 
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
WARNING: systemd is not running
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
WARNING: No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
'
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl enable  ollama.service
systemctl restart ollama.service
systemctl status  ollama.service
EXE

cat /etc/systemd/system/ollama.service
setsid open http://127.0.0.1:11434 >/dev/null 2>&1 &
Ollama » ArcGPU
sudo journalctl -u ollama --no-pager | grep -i "sycl\|intel\|gpu"
sudo intel_gpu_top
cat <<'EXE' | sudo bash
apt-get update;echo
apt-get install -y apt-transport-https ca-certificates gnupg build-essential
apt-get install -y software-properties-common git curl file procps libfuse2t64
apt-get clean;sleep 5
EXE
Intel Ubuntu
curl -fsSL https://repositories.intel.com/gpu/intel-graphics.key \
 | sudo tee /etc/apt/keyrings/intel-graphics.asc >/dev/null



DISTRIBUTION=$(. /etc/os-release && echo "${VERSION_CODENAME}")
cat << SRC | sudo tee /etc/apt/sources.list.d/intel-graphics.list >/dev/null
deb [arch=$(dpkg --print-architecture)\
 signed-by=/etc/apt/keyrings/intel-graphics.asc]\
 https://repositories.intel.com/gpu/ubuntu ${DISTRIBUTION} unified
SRC



cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-gpu-tools
apt-get install -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE
cat <<'EXE' | sudo bash
rm /etc/apt/sources.list.d/intel-graphics.list
rm /etc/apt/keyrings/intel-graphics.asc
EXE

cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get purge   -y intel-gpu-tools
apt-get purge   -y intel-opencl-icd libze1 libze-intel-gpu1
apt-get install -y clinfo intel-media-va-driver-non-free
apt-get clean
EXE

cat <<'EXE' | sudo bash
apt-get update;echo
apt-get upgrade -y;echo
apt-get install -y intel-opencl-icd libze-intel-gpu1 libze1
apt-get install -y intel-media-va-driver-non-free clinfo
apt-get install -y intel-gpu-tools
apt-get clean
EXE
ollama ps
ls /dev/dri
sudo intel_gpu_top
clinfo | grep 'Device Name'
Ollama » Config
cat <<'INI' | sudo tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
User=ollama
Group=ollama

# Optimizations for ASUS NUC 15 Pro (32GB Target)
Environment="OLLAMA_NUM_CTX=32768"
Environment="OLLAMA_NUM_GPU=999"
Environment="OLLAMA_VULKAN=1"

# Intel optimizations if available
Environment="OLLAMA_INTEL_GPU=1"

# Ollama listens on any host
Environment="OLLAMA_HOST=0.0.0.0"

Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
INI
cat <<'EXE' | sudo bash
systemctl daemon-reload
systemctl restart ollama.service
systemctl status  ollama.service
EXE
cat <<'EXE' | sudo bash
ufw allow from 10.19.80.0/22   to any port 11434 proto tcp
ufw allow from 10.20.0.0/24    to any port 11434 proto tcp
ufw allow from 10.20.13.0/24   to any port 11434 proto tcp
ufw allow from 10.20.40.0/24   to any port 11434 proto tcp
ufw allow from 192.168.49.2/32 to any port 11434 proto tcp
EXE

telnet 10.20.40.16     11434
telnet 192.168.49.2    11434
telnet shahed-ap.local 11434
Ollama » Model
ollama pull gpt-oss:20b
:'
pulling manifest
pulling e7b273f96360: 100% ▕███████████████████████████████████████████▏  13 GB
pulling fa6710a93d78: 100% ▕███████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕███████████████████████████████████████████▏  11 KB
pulling d8ba2f9a17b3: 100% ▕███████████████████████████████████████████▏   18 B
pulling 776beb3adb23: 100% ▕███████████████████████████████████████████▏  489 B
verifying sha256 digest
writing manifest
success
'
ollama ls
ollama pull qwen3-coder:30b
:'
pulling manifest
pulling 1194192cf2a1: 100% ▕███████████████████████████████████████████▏  18 GB
pulling d18a5cc71b84: 100% ▕███████████████████████████████████████████▏  11 KB
pulling 69aa441ea44f: 100% ▕███████████████████████████████████████████▏  148 B
pulling 24a94682582c: 100% ▕███████████████████████████████████████████▏  542 B
verifying sha256 digest
writing manifest
success
'

ollama ls
Claude » Install
curl -fsSL https://claude.ai/install.sh  | bash
:'
Setting up Claude Code...
✔ Claude Code successfully installed!
  Version: 2.1.71
  Location: ~/.local/bin/claude
  Next: Run claude --help to get started
⚠ Setup notes:
  • Native installation exists but ~/.local/bin is not in your PATH. Run:
  echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
✅ Installation complete!
'
cat <<'ENV' | tee -a ${HOME}/.bashrc >/dev/null

# claude config
export PATH="$HOME/.local/bin:$PATH"
ENV
source ~/.bashrc && sleep 1
claude --version
Claude » Config
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
export ANTHROPIC_BASE_URL=http://shahed-ap.local:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
export ANTHROPIC_BASE_URL=http://10.20.40.16:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=''
claude --model qwen3-coder:30b
claude --model gpt-oss:20b
ollama stop qwen3-coder:30b
ollama stop gpt-oss:20b

References
