AI/Ollama

From Chorke Wiki


==Optimization==
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left'|
Optimization
|-
|valign='top'|
{|class='wikitable mw-collapsible'
!scope='col' style='text-align:left' colspan='3'|
Yoga Pro 7i
|-
!scope='col'| Variable
!scope='col'| Value
!scope='col'| Impact
|-
!scope='row' style='text-align:left'      | <code>OLLAMA_NUM_CTX</code>
| <code>6384</code> to <code>32768</code> || '''The most important.''' Controls the "brain capacity." 32k is standard for Claude Code but uses ~3GB more RAM than the default 4k.
|}
|}
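The ~3GB delta in the table can be sanity-checked with back-of-envelope arithmetic. The assumption here (not stated in the table) is that the extra memory is spread roughly evenly over the 32768 − 4096 additional context tokens:

```shell
# Rough check: ~3 GB spread over the extra context tokens (assumption only;
# actual KV-cache cost depends on model architecture and quantization).
awk 'BEGIN { printf "%.1f KiB per extra context token\n", 3 * 1024 * 1024 / (32768 - 4096) }'
```

That works out to roughly 110 KiB per token, which is in a plausible range for a 20B-class model's KV cache, though the real figure varies with quantization and attention layout.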



Revision as of 01:11, 1 March 2026

curl -fsSL https://ollama.com/install.sh | sh
ollama pull gpt-oss:20b
ollama --version
ollama ls
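Before wiring Claude Code to the server, it helps to confirm the Ollama API is actually reachable. A minimal probe, assuming the default port 11434:

```shell
# Probe Ollama's version endpoint (GET /api/version); fall back gracefully
# when the server is not running.
if curl -fsS --max-time 2 http://localhost:11434/api/version >/dev/null 2>&1; then
  status=up
else
  status=down
fi
echo "ollama is $status"
```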

curl -fsSL https://claude.ai/install.sh  | bash
ollama launch claude --model gpt-oss:20b
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export OLLAMA_NUM_CTX=32768
export OLLAMA_KEEP_ALIVE=5m

claude --model gpt-oss:20b
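The exports above apply to the whole shell session. A small wrapper function (the name `claude_local` is hypothetical; the values are the same ones set above) scopes them to a single invocation instead:

```shell
# Hypothetical wrapper: point Claude Code at the local Ollama endpoint
# only for the duration of one call, leaving the parent shell untouched.
claude_local() {
  ANTHROPIC_BASE_URL=http://localhost:11434 \
  ANTHROPIC_AUTH_TOKEN=ollama \
  ANTHROPIC_API_KEY="" \
  OLLAMA_NUM_CTX=32768 \
  OLLAMA_KEEP_ALIVE=5m \
  claude --model gpt-oss:20b "$@"
}
```

Usage: `claude_local` starts a session against the local model without polluting the environment for other tools.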

==Diagram==

@startuml
autonumber
skinparam backgroundColor    transparent
skinparam DefaultFontName    Helvetica
skinparam actorStyle         awesome
skinparam ParticipantPadding 20
skinparam BoxPadding         10

title Claude ↔ Local FS ↔ Ollama ↔ GPT-OSS:20b

actor "Developer"                  as dev

box "Local Development PC" #LightBlue
    participant "Claude Code CLI"  as claude
    participant "Local Filesystem" as fs
end box

box "Kubernetes Cluster (K3s)" #Yellow
    participant "Ollama Service"   as ollama
    participant "GPT-OSS:20b"      as model
end box

dev      -> claude : Runs "claude --model gpt-oss:20b"
claude   -> fs     : Scans repository context
fs      --> claude : File contents / Git history

claude   -> ollama : POST /v1/messages (Anthropic API)
note right: Payload includes system prompt \nand local code context

ollama   -> model  : Load weights into GPU VRAM
model   --> ollama : Inference processing...

ollama -->> claude : Streamed Response (Tokens)
claude   -> dev    : Displays suggested code changes

@enduml
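The `POST /v1/messages` hop in the sequence diagram can be exercised directly with curl. This is a hedged sketch: it assumes Ollama exposes the Anthropic-style route that Claude Code targets, as the diagram implies.

```shell
# Build the Anthropic-style payload the diagram describes, then send it
# to the local endpoint; print a note instead of failing if Ollama is down.
payload='{"model":"gpt-oss:20b","max_tokens":128,"messages":[{"role":"user","content":"Say hello"}]}'
curl -s http://localhost:11434/v1/messages \
     -H 'content-type: application/json' \
     -d "$payload" || echo "ollama not reachable"
```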

==Structure==

@startsalt
skinparam backgroundColor transparent
skinparam defaultFontName monospaced
{
{T-
+/                           | Root File System
++**/usr/local/bin/**        | Executable Binaries
+++ollama                    | Ollama Server (Standalone Binary)
+++claude                    | Claude Code CLI
++**/etc/systemd/system/**   | Services
+++ollama.service            | Systemd service file
++**/var/lib/claude-code/**  | Native installation files (Global)
++**/home/<user>/**          | User's Home Directory
+++**.ollama/**              | Ollama Data Directory
++++history                  | CLI Chat History
++++**models/**              | Saved Models
+++++blobs/                  | Weights **(gpt-oss:20b)**
+++++manifests/              | Model metadata
+++**.claude/**              | Claude Code Data Directory
++++config.json              | API URL, keys, project context
++++memory/                  | Persistent memory
+++**my-project/**           | Your development folder
++++.claude/                 | Project specific settings
++++CLAUDE.md                | Guidebook for current project
}
}
@endsalt
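The paths in the tree are the defaults for a standalone install; a quick loop reports which of them exist on a given machine (the path list is an assumption taken from the tree above and may differ on your install):

```shell
# Report which of the default install paths are actually present.
for p in /usr/local/bin/ollama "$HOME/.ollama/models" "$HOME/.claude"; do
  if [ -e "$p" ]; then
    echo "present: $p"
  else
    echo "absent:  $p"
  fi
done
```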


==References==