The infrastructure for agents that actually run in production.
Most agent failures aren't model failures. They're infra failures: a tool call that hung, a retry that never fired, a queue that backed up at 3am. Lattice is the four primitives you'd otherwise rebuild in-house, given to you in a shape that survives shipping.
┌───────────────────────────────────────────────────────────────┐
│                       your application                        │
│            ( anywhere — Vercel, AWS, your own K8s )           │
└───────────────────┬───────────────────────────┬───────────────┘
                    │ lattice.runs.create()     │ lattice.trace
                    ▼                           ▼
┌───────────────────────────────────┐   ┌───────────────────────┐
│      Lattice control plane        │   │    OpenTelemetry      │
│ ┌──────────┐  ┌────────────────┐  │   │   ───► Datadog        │
│ │ scheduler│  │   evaluator    │  │   │   ───► Honeycomb      │
│ │ + queue  │  │   sampler      │  │   │   ───► Grafana        │
│ └─────┬────┘  └─────┬──────────┘  │   └───────────────────────┘
│       ▼             ▼             │
│ ┌──────────────────────────────┐  │
│ │     durable run state        │  │     self-host or
│ │      (Postgres + S3)         │  │     managed cloud
│ └──────────┬───────────────────┘  │
└────────────┼──────────────────────┘
             ▼
┌──────────────────────────────────────────────────────────────┐
│                    model + tool providers                    │
│   Anthropic · OpenAI · Google · your fine-tune · MCP tools   │
└──────────────────────────────────────────────────────────────┘

Lattice sits between your app and your model providers. Your code calls lattice.runs.create instead of the model SDK directly; we own the retry loop, the queue, the trace, and the durable state. You keep model choice and tool definitions.
Durable execution
A run is a single agent invocation that survives crashes, restarts, and provider outages. Lattice persists every step to durable storage; if your worker dies mid-run, a fresh worker resumes from the last completed step. No retry loops you have to write yourself.
import { lattice } from "@lattice/sdk";

const run = await lattice.runs.create({
  agent: "research",
  input: { topic: "GLP-1 supply chain" },
  resumable: true,          // survives worker crash
  timeoutMs: 6 * 3600_000,  // 6h hard cap
});
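What resumption looks like from the caller's side: the run id stays stable across worker crashes, so any process can pick the run back up. A minimal sketch, assuming a runs.get read API and status/output/error fields on the run object; those names are illustrative, not confirmed SDK surface.

// Sketch only: runs.get, status, output, and error are assumed names.
import { lattice } from "@lattice/sdk";

async function awaitRun(runId: string): Promise<unknown> {
  for (;;) {
    const run = await lattice.runs.get(runId);      // assumed getter
    if (run.status === "completed") return run.output;
    if (run.status === "failed") throw new Error(run.error);
    await new Promise((r) => setTimeout(r, 2_000)); // poll, don't hammer
  }
}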
Smart scheduling
Per-provider rate limits, queue-aware concurrency, retry with jittered backoff, dead-letter handling, and priority lanes. The scheduler understands that Anthropic, OpenAI, and your in-house tool servers each have different limits — and won't trip yours.
lattice.schedule.create({
  agent: "summarize",
  rate: "60/m per:org",  // 60 runs / minute / org
  concurrency: 12,
  retry: { max: 4, backoff: "jittered-exp" },
  deadLetter: "failures.queue",
});
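The snippet above shows one lane. The prose also promises per-provider limits and priority lanes; here is a hedged guess at what that could look like in the same config shape. The priority and providers fields are hypothetical, not documented SDK surface.

// Hypothetical fields: `priority` and `providers` are guesses at the
// shape, not confirmed SDK surface.
lattice.schedule.create({
  agent: "summarize",
  rate: "60/m per:org",
  concurrency: 12,
  priority: "interactive",                 // drains ahead of batch lanes
  providers: {
    anthropic: { rate: "4000/m" },         // mirror your real provider quota
    "tools.internal": { concurrency: 8 },  // your own tool server's ceiling
  },
  retry: { max: 4, backoff: "jittered-exp" },
  deadLetter: "failures.queue",
});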
Real observability
Every run produces a complete trace tree — every prompt, tool call, retry, and intermediate artifact. Traces export as native OpenTelemetry to Datadog, Grafana, Honeycomb, or your stack of choice. Replay any historical run with the exact inputs, prompt, and tool responses.
// in-app: hover any step in the trace tree to see
// prompt, completion, latency, cost, model version.
// or pull it programmatically:
const trace = await lattice.trace.get("run_8a7f");
trace.steps.forEach(s => console.log(s.name, s.tokens, s.cost));
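Replay, mentioned above, would pair naturally with the trace API. A sketch of what swapping the model on a recorded run could look like; runs.replay and its options are assumptions, not confirmed SDK surface.

// Assumed API: runs.replay and its options are illustrative, not confirmed.
// Inputs, prompts, and recorded tool responses stay fixed; only the model swaps.
const replayed = await lattice.runs.replay("run_8a7f", {
  model: "your-candidate-model",
});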
Inline evals
Attach evaluators to any agent. Sample a percentage of production traffic, run it through your rubric, and surface regressions before customers do. Compare the same prompt across model variants on real production data, not on toy benchmarks.
lattice.evals.attach({
  agent: "support-triage",
  rubric: rubrics.faithfulness,  // your own or built-in
  sample: 0.1,                   // 10% of prod traffic
  alertThreshold: 0.85,          // page below this score
});
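For the cross-variant comparison described above, one plausible shape is a compare call that reuses the same rubric and sampling knobs. evals.compare and its fields are a sketch, not confirmed SDK surface.

// Sketch: evals.compare and `variants` are assumed names, not confirmed.
const report = await lattice.evals.compare({
  agent: "support-triage",
  variants: ["model-a", "model-b"],  // candidate model configs to score
  rubric: rubrics.faithfulness,      // same rubric as production evals
  sample: 0.05,                      // 5% of prod traffic per variant
});
console.log(report.scores);          // per-variant rubric scores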
We won't name names. You can guess the comparisons. The differences are real.
Four things on every other AI platform's homepage — and not ours, on purpose.
We don't ship a model. Use Anthropic, OpenAI, Google, your fine-tune, or your local server.
We don't ship a vector database. Use pgvector, Pinecone, Weaviate, or whatever you have.
We don't ship a UI builder. The traces are our UI; your app is your app.
We don't auto-generate agents. The library has helpers; the architecture is yours.
We're trying to do one thing: the four-primitive infrastructure for production agents. The other things on the AI-platform menu are real problems, but they're not our problem, and we'd be worse at them than the focused tools you should already use.