LA Router
Welcome to the LA Router documentation — your guide to the intelligent LLM routing proxy for use with Private and Public AI models.
What is LA Router?
LA Router is a local-first AI proxy that intelligently routes your LLM requests to the best model for each task — whether that's a lightweight local private model running on llama.cpp or a powerful cloud private model.

┌──────────────────────────────────────────────┐
│ LA Router │
│ │
│ Your App ──→ /v1/chat/completions │
│ │ │
│ Hybrid Classifier │
│ (heuristic + AI fallback) │
│ │ │
│ ┌─────────────┼─────────────┐ │
│ ▼ ▼ ▼ │
│ Local Models Cloud APIs Escalation │
│ (llama.cpp) (Gemini/Claude) Pipeline │
│ │
│ SQLite Token Tracking · Multi-Tenant │
│ WebUI Dashboard · MCP Tools │
└──────────────────────────────────────────────┘
Key Features
| Feature | Description |
|---|---|
| Hybrid Routing | Fast heuristic classification with AI fallback for ambiguous requests |
| Local-First | Route simple tasks to Gemma 4 models via llama.cpp — zero API cost |
| Multi-Tenant | Per-project bearer tokens, budgets, and routing policies |
| Token Tracking | Accurate per-request billing with model-specific cost rates |
| WebUI Dashboard | React dashboard with live stats, charts, and model management |
| MCP Tools | delegate_to_expert tool for AI agent orchestration |
| OpenAI Compatible | Drop-in /v1/chat/completions proxy for any OpenAI client |
Dashboard Screens
| Models | Usage |
|---|---|
![]() | ![]() |
Documentation Index
| Section | Description |
|---|---|
| Overview | Architecture vision, routing tiers, and design philosophy |
| Architecture | System design, data flow, and component breakdown |
| API Reference | REST API endpoints for proxy, billing, and management |
| CLI | Command-line interface for administration and testing |
| MCP Tools | Model Context Protocol tool integration |
| Model Catalog | Gemma 4 model variants, specs, and download info |
| Configuration | Environment variables, config files, and customization |
Use the sidebar to navigate topics, or the search bar to find specific content.

