[Interactive architecture map · ZERO TRUST · WIREGUARD · E2E ENCRYPTED. Layers: Frontend Entry · Network · App Runtime · Orchestration · Knowledge · Storage / External APIs · Inference · Hardware · Observability · Nucleus. Stack: Traefik · Tailscale VPN · RBAC/Auth · Next.js 15 · PM2 · LangChain.js · n8n Workflows · Knowledge Yard · Distillery · NVIDIA NIM · pgvector · Kafka (Redpanda) · HuggingFace Hub · OpenRouter (free tier) · LMDeploy · ZeroClaw Agents · OpenVINO · NVIDIA 3090 24 GB · GPU Research Pipeline · tmux Dispatcher · Prometheus · Loki · Grafana · Tempo · Opencode Nucleus.]
Live · LMDeploy + LangChain.js + pgvector · Local-first

Your End-to-End
AI Stack

A platform for exploring the world of AI, with a living, evolving knowledge base that learns as you feed it.

LangChain RAG
87 tok/s · RTX 3090 inference
32k ctx · Context window
6+ models · Local models
100% private · Local-first, cloud-optional
Memory · pgvector knowledge

The Full Stack

Six layers. Local-first, cloud-optional.

🦜
LangChain Orchestration
Full agent pipelines with RAG, memory, and tool use. Every query routes through LangChain.js, connecting your local models to the Knowledge Yard.
RAG · Agents · Memory · Tools
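A minimal sketch of the context-stuffing step such a RAG pipeline performs before the query reaches the local model. In the live stack this is handled by LangChain.js; the function and field names here are illustrative.

```typescript
// Retrieved Knowledge Yard chunks are formatted into the prompt so the
// local model answers from deposited knowledge rather than memory alone.
interface Chunk {
  content: string;
  source: string;
}

function buildRagPrompt(chunks: Chunk[], question: string): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.content}`)
    .join("\n");
  return [
    "Answer using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildRagPrompt(
  [{ content: "LMDeploy serves ~87 tok/s on the RTX 3090.", source: "yard" }],
  "How fast is local inference?"
);
```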
🗑️
Knowledge Yard
A living pgvector archive where agents deposit everything they learn. Drag-and-drop any file. The Oracle synthesizes wisdom on demand.
pgvector · Semantic Search · Auto-deposit · Oracle
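A sketch of what an agent "deposit" looks like at the database level. pgvector accepts embeddings as a vector literal (`'[0.1,0.2,...]'`); the table and column names below are assumptions, not the actual schema.

```typescript
// pgvector stores embeddings in a vector column; the parameterized
// INSERT below is the shape of an auto-deposit from an agent.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

function buildDeposit(content: string, embedding: number[]) {
  return {
    text: "INSERT INTO knowledge_yard (content, embedding) VALUES ($1, $2)",
    values: [content, toVectorLiteral(embedding)],
  };
}

const deposit = buildDeposit("TIL: 32k context window", [0.1, 0.2, 0.3]);
// deposit.values[1] === "[0.1,0.2,0.3]"
```

Semantic search is then an `ORDER BY embedding <=> $1 LIMIT k` query over the same column.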
🖥
Local LLM Inference
LMDeploy on an NVIDIA RTX 3090 via Tailscale VPN. 87 tokens/sec. Local-first inference with optional cloud provider fallback.
LMDeploy · RTX 3090 · Private · Tailscale
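LMDeploy's `api_server` exposes an OpenAI-compatible `/v1/chat/completions` endpoint, so the frontend can reach the 3090 over the Tailscale VPN with a plain `fetch`. The hostname and model name below are illustrative placeholders, not the real deployment values.

```typescript
// Build an OpenAI-compatible chat request against the local LMDeploy
// server (23333 is LMDeploy's default api_server port).
const LMDEPLOY_URL = "http://omen.tailnet:23333/v1/chat/completions";

function chatRequest(userMessage: string) {
  return {
    url: LMDEPLOY_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "qwen2.5-14b", // whichever model LMDeploy is serving
        messages: [{ role: "user", content: userMessage }],
        max_tokens: 512,
      }),
    },
  };
}

// const res = await fetch(chatRequest("hi").url, chatRequest("hi").init);
```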
🦅
ZeroClaw Agents
Multi-provider agent routing: NVIDIA → OpenRouter → LMDeploy fallback chain. Agents browse, reason, and deposit findings automatically.
Multi-provider · Routing · Autonomous · OpenRouter
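The fallback chain can be sketched as a router that tries each provider in order and moves on when one fails. The provider functions here are stand-ins (one simulates an outage), not the real ZeroClaw clients.

```typescript
// Try providers in priority order; a throw (timeout, 429, outage)
// hands the prompt to the next provider in the chain.
type Provider = (prompt: string) => Promise<string>;

async function routeWithFallback(
  providers: [string, Provider][],
  prompt: string
): Promise<{ provider: string; answer: string }> {
  let lastError: unknown;
  for (const [name, call] of providers) {
    try {
      return { provider: name, answer: await call(prompt) };
    } catch (err) {
      lastError = err; // provider down or rate-limited; try the next one
    }
  }
  throw new Error(`all providers failed: ${lastError}`);
}

// NVIDIA NIM first, OpenRouter free tier second, local LMDeploy last.
const chain: [string, Provider][] = [
  ["nvidia", async () => { throw new Error("429"); }], // simulated outage
  ["openrouter", async (p) => `echo: ${p}`],
  ["lmdeploy", async (p) => `local: ${p}`],
];
```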
🧪
Autoresearch Pipeline
Automated experiment runner that dispatches training jobs to the OMEN GPU via SSH + tmux. Results auto-deposit into the Knowledge Yard.
tmux · GPU Dispatch · Auto-log · SSH
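The SSH + tmux dispatch pattern boils down to wrapping the training command in a detached tmux session so the job survives the SSH connection closing. Host, session name, and script path below are illustrative.

```typescript
// tmux new-session -d starts the job detached on the remote GPU box;
// the dispatcher can disconnect and poll the session later, and results
// are auto-deposited into the Knowledge Yard.
function dispatchCommand(host: string, session: string, job: string): string {
  return `ssh ${host} "tmux new-session -d -s ${session} '${job}'"`;
}

const cmd = dispatchCommand("omen", "exp-042", "python train.py --epochs 3");
// Then run it with execSync(cmd) from node:child_process (not run here).
```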
📊
Full Observability
Prometheus scrapes every service. Grafana dashboards show token rates, HF sync stats, experiment metrics, and infrastructure health.
Prometheus · Grafana · Metrics · Alerting
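For Prometheus to scrape a service, the service only has to serve plain-text lines in the exposition format on a `/metrics` endpoint. A sketch of emitting one such gauge (metric name and label are illustrative):

```typescript
// Render one gauge sample in Prometheus exposition format:
// name{label="value"} number
function gaugeLine(
  name: string,
  labels: Record<string, string>,
  value: number
): string {
  const labelStr = Object.entries(labels)
    .map(([k, v]) => `${k}="${v}"`)
    .join(",");
  return `${name}{${labelStr}} ${value}`;
}

const line = gaugeLine("llm_tokens_per_second", { gpu: "rtx3090" }, 87);
// → llm_tokens_per_second{gpu="rtx3090"} 87
```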

🔮 Ask the Oracle

Live query into the Knowledge Yard + LangChain synthesis. No login required.

How It Flows

💬
You ask
🦜
LangChain orchestrates
🗑️
Knowledge Yard RAG recall
🖥
LMDeploy local LLM
🧬
Response delivered + deposited
Scribe · LIVE

Your organic at-a-glance AI + tech intelligence feed. Drop any story into the Knowledge Yard — it's yours forever.

💰 AI SPEND TRACKER · 24H WINDOW
Cloud Cost · If using GPT-4o
Local Cost · RTX 3090
Savings
0 input tokens · 0 output tokens · 0 calls
Waiting for LLM activity...
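The savings figure can be sketched as cloud token pricing minus local electricity cost. All rates below are illustrative assumptions (GPT-4o-style per-million-token pricing, approximate 3090 draw, a guessed kWh price); the live tracker's numbers may differ.

```typescript
// Assumed rates, not the tracker's actual configuration:
const CLOUD_IN_PER_M = 2.5;   // $ per 1M input tokens
const CLOUD_OUT_PER_M = 10.0; // $ per 1M output tokens
const GPU_WATTS = 350;        // RTX 3090 under load (approx.)
const KWH_PRICE = 0.15;       // $ per kWh

function cloudCost(inputTokens: number, outputTokens: number): number {
  return (inputTokens * CLOUD_IN_PER_M + outputTokens * CLOUD_OUT_PER_M) / 1e6;
}

function localCost(hoursOfInference: number): number {
  return (GPU_WATTS / 1000) * hoursOfInference * KWH_PRICE;
}

// Example: 1M input + 200k output tokens over ~2 GPU-hours.
const saved = cloudCost(1_000_000, 200_000) - localCost(2);
```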

Enter the Ecosystem

Select a mission — or slip in as a visitor and explore freely.

AUTH · PORTAL
── or sign in ──
🧬 gentoofoo.com — local-first, cloud-optional
LangChain.js · Next.js 15 · pgvector · LMDeploy · ZeroClaw