⚙ Personal Lab Notes

Ollama Guides

Local LLM inference and private RAG pipelines. Everything runs on the local network; nothing leaves it.

Ollama 0.18.x · Ubuntu 24.04 LTS · Open WebUI · Chroma
Guides

Local LLM + RAG Pipeline

✓ Working

Deploy Ollama on a Proxmox Ubuntu VM, expose the API to the local network, stand up Open WebUI and Chroma, and build a full RAG ingestion pipeline that indexes your files locally. Phase 1 is CPU-only; GPU passthrough follows in Phase 2. A minimal ingestion sketch follows the tag line below.

ollama · open-webui · chroma · rag · nomic-embed-text · docker · ubuntu-24.04
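
For orientation, a minimal sketch of the ingestion path this guide builds: embed a chunk with nomic-embed-text through Ollama's /api/embeddings endpoint and store it in Chroma via the Python client. The host addresses, collection name, and sample chunk are placeholder assumptions, not values from the guide.

```python
# Minimal RAG ingestion sketch: Ollama embeddings -> Chroma.
# Host addresses below are placeholders for the lab VM.
import requests
import chromadb

OLLAMA = "http://192.168.1.50:11434"   # placeholder Ollama VM address

def embed(text: str) -> list[float]:
    """Get an embedding vector from nomic-embed-text via the Ollama API."""
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return r.json()["embedding"]

client = chromadb.HttpClient(host="192.168.1.50", port=8000)  # Chroma server
docs = client.get_or_create_collection("lab-notes")           # placeholder name

chunk = "Ollama runs on an Ubuntu 24.04 VM on the Proxmox host."  # sample data
docs.add(ids=["note-0001"], documents=[chunk], embeddings=[embed(chunk)])

# Query the index with an embedded question.
hits = docs.query(query_embeddings=[embed("where does ollama run?")], n_results=3)
print(hits["documents"])
```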

Model Reference & Catalog

✓ Current

Curated reference for the models available through Ollama: the installed stack, RAM requirements, use cases, and privacy notes. Covers chat, code, reasoning, embedding, and vision models; a quick listing sketch follows below.

llama3.1:70b · mistral:7b · codellama · deepseek-coder · nomic-embed-text
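
As a companion to the catalog, a sketch that lists what is actually installed: Ollama's /api/tags endpoint returns each local model with its size on disk in bytes. The host address is a placeholder.

```python
# List locally installed Ollama models with their on-disk size.
# The host below is a placeholder for the lab VM's address.
import requests

r = requests.get("http://192.168.1.50:11434/api/tags")
r.raise_for_status()

for m in r.json()["models"]:
    size_gb = m["size"] / 1e9          # size is reported in bytes
    print(f'{m["name"]:<24} {size_gb:5.1f} GB')
```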

GPU-Accelerated Inference — Vega 56

✓ Complete

AMD Vega 56 GPU passthrough via Thunderbolt 4 eGPU. ROCm deprecation workaround, Vulkan backend for Ollama, and the full vendor-reset story. Documented in the Proxmox lab; a throughput-measurement sketch follows below.

gpu-passthrough · vega-56 · vulkan · rocm · thunderbolt-4
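
The tok/s figure quoted in the stack table below can be reproduced from Ollama's own timing fields: a non-streamed /api/generate response includes eval_count and eval_duration (in nanoseconds). A sketch, with placeholder host and prompt:

```python
# Measure decode throughput from Ollama's built-in timing fields.
# eval_count / eval_duration (ns) gives tokens per second on the active backend.
import requests

r = requests.post("http://192.168.1.50:11434/api/generate",  # placeholder host
                  json={"model": "mistral:7b",
                        "prompt": "Explain GPU passthrough in one paragraph.",
                        "stream": False})
r.raise_for_status()
stats = r.json()

tok_s = stats["eval_count"] / stats["eval_duration"] * 1e9
print(f"{tok_s:.1f} tok/s")
```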

Stack

Ollama: 0.18.x
Embedding: nomic-embed-text
Vector DB: Chroma (v2 API)
Frontend: Open WebUI
VM OS: Ubuntu 24.04.4 LTS
Inference: Vulkan · Vega 56 (85 tok/s)
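
A small liveness check ties the stack together with one GET per service. The addresses are placeholders, the Chroma heartbeat uses the v2 API route noted above, and the Open WebUI /health route is an assumption worth verifying against your deployed version.

```python
# Stack health check: one GET per service (placeholder lab addresses).
import requests

CHECKS = {
    "Ollama":     "http://192.168.1.50:11434/api/version",
    "Chroma":     "http://192.168.1.50:8000/api/v2/heartbeat",  # v2 API heartbeat
    "Open WebUI": "http://192.168.1.50:3000/health",            # assumed route
}

for name, url in CHECKS.items():
    try:
        ok = requests.get(url, timeout=3).ok
    except requests.RequestException:
        ok = False
    print(f'{name:<11} {"up" if ok else "DOWN"}')
```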