Cloud Models

When a task exceeds what local models can handle — complex reasoning, frontier-level code generation, or large-context analysis — LA Router seamlessly escalates to cloud model APIs. This happens transparently, with the same /v1/chat/completions interface.

How Cloud Routing Works

LA Router's classifier assigns each request a complexity tier. Tasks classified as Complex or Frontier are automatically routed to cloud APIs:

Your App → LA Router → Classifier
                          │
              ┌───────────┼───────────┐
              ▼                       ▼
          Complex                 Frontier
       (Cloud API)             (Cloud API)
              │                       │
              ▼                       ▼
        Gemini Pro               Claude Opus
        GPT-4o                   Gemini Ultra

Private Cloud Models

For organizations that require data sovereignty but need cloud-scale compute, LA Router supports private cloud deployments:

Self-Hosted LLM Servers

Route to models running on your own cloud infrastructure — private GPU clusters, VPCs, or on-premises data centers:

# Configure a private cloud endpoint in .env
PRIVATE_CLOUD_URL=https://llm.internal.yourcompany.com/v1
PRIVATE_CLOUD_API_KEY=your-internal-key

LA Router treats private cloud endpoints identically to public cloud APIs, with the same routing, token tracking, and billing features — but your data never leaves your infrastructure.

Key Use Cases for Private Cloud

Use Case	Description
Regulated industries	Healthcare, finance, and legal where data cannot leave corporate networks
Large-scale inference	Tasks requiring GPU clusters beyond what a single workstation provides
Fine-tuned cloud models	Organization-specific models deployed on private infrastructure
Geographic compliance	Data residency requirements (GDPR, HIPAA, SOC 2)

Public Cloud Models

For maximum capability on non-sensitive tasks, LA Router integrates with the leading public cloud LLM providers:

Supported Providers

Provider	Models	Best For
Google Gemini	Gemini Flash, Gemini Pro, Gemini Ultra	Fast general-purpose tasks, multimodal
Anthropic	Claude Sonnet, Claude Opus	Complex reasoning, long-context analysis
OpenAI	GPT-4o, GPT-4o-mini, o1	Code generation, structured output

Configuration

Each provider is configured via API keys in your .env file:

# Public cloud API keys
GOOGLE_API_KEY=AIza...
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

LA Router will automatically select the best provider based on the task classification and your configured routing preferences.

Routing Decision: Local vs Cloud

LA Router makes the local-vs-cloud decision based on several factors:

Cost Optimization

One of LA Router's core benefits is automatic cost optimization. By routing simple tasks to free local models, you can dramatically reduce your API spend:

Tier	Model	Cost per 1M Tokens
Heartbeat	Local 2B	$0.00
Simple	Local 4B	$0.00
Moderate	Local 26B	$0.00
Complex	Gemini Pro	~$1.25
Frontier	Claude Opus	~$15.00

Cost Savings

Organizations typically see 60–80% cost reduction by routing Heartbeat, Simple, and Moderate tasks to local models — which represent the majority of LLM calls in most applications.

Token Tracking

Regardless of whether a request goes to a local or cloud model, LA Router tracks all token usage with per-project, per-model granularity:

Input tokens and output tokens counted separately
Cost calculated using model-specific pricing
Per-project budgets with alerting and hard caps
Usage dashboard with charts and breakdowns

Usage Dashboard

Privacy Model Summary

Deployment	Data Leaves Network?	Cost	Capability
Local (Heartbeat/Simple)	❌ No	Free	Basic tasks
Local (Moderate)	❌ No	Free	Most business tasks
Private Cloud	❌ No (your infra)	Compute cost	Full capability
Public Cloud	⚠️ Yes (provider)	API pricing	Maximum capability

LA Router gives you full control over which tasks can be sent to external providers and which must stay local — ensuring your data privacy requirements are always met.

How Cloud Routing Works​

Private Cloud Models​

Self-Hosted LLM Servers​

Key Use Cases for Private Cloud​

Public Cloud Models​

Supported Providers​

Configuration​

Routing Decision: Local vs Cloud​

Cost Optimization​

Token Tracking​

Privacy Model Summary​