Agents forget.
lakecode compounds.

lakecode is the coding agent with a governed, compounding memory. It learns your systems once, gets cheaper and more correct every session — and your org can see and govern everything it knows.

Private beta — we onboard design-partner teams personally, one at a time.

terminal
$ lakecode chat "add retry logic to the payment client"
 
Priming from substrate...
Decision DEC-001: exponential backoff standard
Constraint INC-4471: mandatory outbound User-Agent
Failed path: naive retry on 429 (last quarter)
Code: src/payment/client.py loaded
 
Grounded: 3 claims, 1 failed path avoided
Applied the team backoff standard with the
required gateway headers. Ready for review.
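As a rough illustration of what that grounded change could look like, here is a minimal Python sketch. It is not lakecode output: `send_with_backoff`, `backoff_delays`, the `transport` callable, and the delay values are all hypothetical, and the header value is borrowed from the flywheel example further down this page. It also assumes the team standard means capped exponential backoff with jitter on 429 and 5xx responses, which this page does not spell out.

```python
import random
import time
from typing import Callable, Dict, Iterator, Tuple

# Header value borrowed from the flywheel example on this page; the rest is hypothetical.
REQUIRED_HEADERS: Dict[str, str] = {"User-Agent": "revy-fetch/4471"}

Response = Tuple[int, str]  # (status code, body)


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> Iterator[float]:
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def send_with_backoff(
    transport: Callable[[Dict[str, str]], Response],
    max_retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call transport with the required headers, retrying 429 and 5xx responses."""
    response = transport(dict(REQUIRED_HEADERS))
    for delay in backoff_delays(max_retries):
        status, _ = response
        if status != 429 and status < 500:
            return response
        # Full jitter: wait a random amount up to the current exponential delay.
        sleep(random.uniform(0, delay))
        response = transport(dict(REQUIRED_HEADERS))
    # Retries exhausted: hand back the last response, whatever its status.
    return response
```

The `sleep` parameter is injectable so the retry schedule can be tested without waiting.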

Building is easy. Maintaining is hard.

Every team can ship v1. The hard part is year two — when the original authors have moved on, the docs are stale, and the only record of why things work is scattered across closed PRs and Slack threads. Coding agents make this worse, not better.

Amnesiac

Every session starts cold and re-derives your codebase's quirks, internal APIs, and conventions. You pay the same exploration cost in tokens and time, forever.

Confidently wrong

Why a constant is 4471. Which IDs are valid. What failed last quarter. The knowledge that decides correctness was never written anywhere a model can see — no context window, fine-tune, or doc-RAG reaches it.

Ungovernable

What does the agent know? Where did it learn it? Who approved it? Black-box agent memory is a compliance non-starter — especially on a data platform.

One loop: work → capture → govern → ground

Every session's exploration becomes the next session's prior. That's the whole product.

1 · Work

lakecode is a full coding agent in your terminal. It auto-primes from the substrate at session start (your org's decisions, constraints, findings, and failed paths), then gets to work.

2 · Capture

What the session learns persists: findings, decisions, failed paths, and code-change rationale captured at commit time. Knowledge that never existed in any document.

3 · Govern

Nothing becomes org truth silently. Claims carry provenance and confidence, flow through a review queue, and climb a promotion ladder: local → workspace → org.

4 · Ground

The next session — yours or a teammate's — starts already knowing. Cheaper where you'd already be right, correct where you'd be wrong. The loop compounds.
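To make the loop concrete, here is a minimal sketch of a claim moving through review and into the next session's prior. It is an illustration under assumptions, not lakecode's data model: the `Claim` fields, `promote`, `prime`, and the confidence threshold are hypothetical. Only the three ladder rungs and the ideas of provenance, confidence, and review come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence

LADDER = ["local", "workspace", "org"]  # rungs named on this page


@dataclass
class Claim:
    text: str
    provenance: str      # where the claim came from, e.g. a session or commit id
    confidence: float    # 0.0 to 1.0
    scope: str = "local"
    approvals: List[str] = field(default_factory=list)


def promote(claim: Claim, reviewer: Optional[str]) -> Claim:
    """Move a claim one rung up the ladder, but only with a named reviewer."""
    if reviewer is None:
        raise PermissionError("promotion requires review")
    rung = LADDER.index(claim.scope)
    if rung == len(LADDER) - 1:
        return claim  # already org-wide, nothing to do
    claim.scope = LADDER[rung + 1]
    claim.approvals.append(reviewer)
    return claim


def prime(
    claims: Sequence[Claim],
    scopes: Sequence[str] = ("workspace", "org"),
    min_confidence: float = 0.5,
) -> List[Claim]:
    """Pick the claims a teammate's new session would start with."""
    return [c for c in claims if c.scope in scopes and c.confidence >= min_confidence]
```

In this sketch a finding stays invisible to teammates until someone approves it. That is the property the Govern step describes.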

Cheaper where you'd already be right.
Correct where you'd be wrong.

Memory does two different jobs. We measure them separately — and never blend the numbers.

Execution-bound · the answer is in the repo

~27% cheaper

~22–26% fewer tool calls, at identical correctness.

When exploration can find the answer, memory is an efficiency lever: the agent skips the wandering it already did last time. Correctness is unchanged by design — that's what defines this regime.

Auto-prime + write-back A/Bs · 2026-06-04 · claude-sonnet-4-6 both arms · 20 runs

Knowledge-bound · the answer lives in org knowledge

Claude Code ✗ ✗ ✗ 0/3
lakecode, cold ✗ ✗ ✗ 0/3
lakecode + substrate ✓ ✓ ✓ 3/3

When correctness depends on knowledge that was never in the repo, agents without it are confidently wrong — and no model upgrade fixes that. One substrate claim flips the result from 0% to 100%.

Grounded-coding bench · 2026-06-08 · claude-sonnet-4-6 everywhere · n=3 per arm · single knowledge-trap domain

Two regimes, two claims. Efficiency numbers are measured at identical correctness on execution-bound tasks; correctness numbers come from knowledge-bound tasks where the baselines fail. The two are never blended into one number.

Teach it once. It stays taught.

The causal validation, end to end: same prompt, same repo, same agent — the only variable is what the substrate holds.

1 · Cold · 0/2

A fresh session is asked to add the project's mandatory outbound HTTP headers. It queries the substrate, finds nothing — and declines to invent a value. No fabrication.

2 · Teach once

The convention ships once — User-Agent: revy-fetch/4471, a gateway allow-list rule from an incident. The commit hook persists it as a claim with provenance.

3 · Warm · 2/2

A fresh, independent session retrieves the claim and writes the exact header into generated code, citing the incident — no shared context window, no fine-tune, no RAG of the diff.
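The three steps above can be mimicked in a few lines. This is a toy sketch under stated assumptions, not the real substrate: `Substrate`, `persist`, `lookup`, `generate_headers`, and the topic key are hypothetical, and the incident id is borrowed from the terminal example at the top of the page. It shows one thing only: the sole state shared between "sessions" is the claim store.

```python
from typing import Dict, Optional


class Substrate:
    """Hypothetical in-memory stand-in for the shared claim store."""

    def __init__(self) -> None:
        self._claims: Dict[str, Dict[str, str]] = {}

    def persist(self, topic: str, value: str, provenance: str) -> None:
        """Teach once: store a claim together with where it came from."""
        self._claims[topic] = {"value": value, "provenance": provenance}

    def lookup(self, topic: str) -> Optional[Dict[str, str]]:
        return self._claims.get(topic)


def generate_headers(substrate: Substrate) -> str:
    """A fresh 'session': it shares nothing with earlier calls except the substrate."""
    claim = substrate.lookup("outbound-user-agent")
    if claim is None:
        # Cold: decline rather than invent a value.
        return "DECLINED: no known convention for outbound headers"
    value, source = claim["value"], claim["provenance"]
    # Warm: write the exact header and cite where it came from.
    return f'headers = {{"User-Agent": "{value}"}}  # per {source}'


substrate = Substrate()
cold = generate_headers(substrate)
substrate.persist("outbound-user-agent", "revy-fetch/4471", "INC-4471")
warm = generate_headers(substrate)
```

Here `cold` is the decline message and `warm` is the header line with its citation.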

Recorded demo of the full cold → teach → warm run: coming with the beta. Request access to see it on your own repo.

Flywheel causal validation · 2026-06-09 · claude-sonnet-4-6 · n=2 per arm

Dated, pinned, caveated

Every number comes from a pre-registered run with the model pinned and the caveats attached. If a claim isn't dated, we don't ship it.

6/6 = 6/6

Coding parity with Claude Code: both pass 6/6 tasks, and the bug fixes are byte-identical.

coding-bench v1 · 2026-06 · same model both arms · small generic fixture, n=1 per task

−27% cost

~22–26% fewer tool calls at identical correctness, warm vs cold substrate.

auto-prime + write-back A/Bs · 2026-06-04 · claude-sonnet-4-6 both arms · execution-bound tasks

0/3 → 3/3

On knowledge-bound tasks, one substrate claim flips correctness from 0% to 100%.

grounded-coding bench · 2026-06-08 · claude-sonnet-4-6 · n=3 per arm, single domain

0/2 → 2/2

Fresh sessions honor a convention a prior session persisted; cold runs fabricated nothing.

flywheel causal validation · 2026-06-09 · claude-sonnet-4-6 · n=2 per arm

Single-turn pipeline economics · Stage 1 · Databricks SDK · May 2026

11 questions × 3 replicates, graded by a blind LLM judge. Both systems answer with the same model (Claude Sonnet 4.6) — the gap is the compiled context, not the model.

lakecode pipeline

  • Mean cost / question: $0.033
  • Correct: 32 / 33 (+1 partial)
  • Wrong answers: 0

Claude Code

  • Mean cost / question: $0.113
  • Correct: 33 / 33
  • Wrong answers: 0

Cheaper on every question: 2.6× on common APIs, 3.4× on library internals, 5.4× on cross-cutting questions. Caveats: coverage is limited to repo internals, and agentic baselines move fast (the Claude Code baseline's cost itself dropped 30% in 17 days). That's why we publish absolute costs, not just ratios.
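Because the absolute costs are published, the overall gap can be rechecked by hand. A small sketch using only the figures above; the variable names are illustrative:

```python
QUESTIONS = 11
REPLICATES = 3

lakecode_mean = 0.033      # USD per question, from the results above
claude_code_mean = 0.113   # USD per question, from the results above

runs = QUESTIONS * REPLICATES                   # graded answers per system
ratio = claude_code_mean / lakecode_mean        # how many times cheaper, overall
saving = 1 - lakecode_mean / claude_code_mean   # fractional cost reduction

print(f"graded answers per system: {runs}")
print(f"overall cost ratio: {ratio:.1f}x")
print(f"cost reduction: {saving:.0%}")
```

This prints 33 graded answers per system, an overall ratio of 3.4x, and a 71% cost reduction. The per-category ratios cannot be rechecked the same way, since per-category costs are not listed here.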

Full results & methodology

An agent your org can audit

Every claim has provenance. Nothing is promoted silently. Two surfaces, one substrate.

TERMINAL (CLI)

  • A full coding agent: lakecode chat
  • Auto-primes from the substrate at session start
  • Persists findings, decisions, and failed paths
  • Captures code-change rationale at commit time
  • Works alongside git, your editor, your workflow

CONSOLE

  • Inspect and edit every claim, with provenance
  • Review queue for proposed knowledge
  • Promotion ladder: local → workspace → org
  • Sessions, sources, and model picker
  • The substrate is your data — exportable, always

What feeds the substrate

Structured org memory — claims with provenance and confidence — built from ingestion and from what your agents learn. Refreshes incrementally on commit.

YOUR CODE

An entity graph over the codebase — structure, interfaces, tests, configs — with claims extracted and linked to the entities they describe.

YOUR SESSIONS

Findings, root causes, failed paths. The agent's own write-back is gated — proposed knowledge goes through review before it becomes org truth.

GIT HISTORY

Code-change rationale captured at commit time. The why behind changes — knowledge that never existed in any document — persists beyond the merge.

YOUR DOCS

Architecture docs, runbooks, ADRs — extracted into structured claims and linked into the same graph as the code they describe.

DATABRICKS UC

Unity Catalog metadata, table and column lineage, notebook↔table references — the lakehouse edition grounds agents in your data platform itself.

YOUR TEAM

The promotion ladder turns one engineer's finding into the whole team's prior — after review. The second engineer never rediscovers the first one's fix.

Private beta — design partners

We onboard one team at a time: ingest your repos and docs, write a golden-question acceptance set with you, and run the flywheel demo on your own code — cold fail, teach once, warm pass.

Not more context. Better context.

Hybrid retrieval — vector, lexical, graph, rerank — selects only what matters for each task, measured on a frozen canonical bench.

"add caching to the user service"

  • → caching convention (Redis, 15m TTL)
  • → user service architecture decision
  • → existing cache utility code

"fix the flaky order test"

  • → known flaky-test finding (race condition)
  • → order service test patterns
  • → CI environment differences finding

"refactor billing to use the new API"

  • → API migration decision + rationale
  • → billing module dependency graph
  • → error handling convention
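One common way to combine several retrievers into a single shortlist is reciprocal rank fusion, sketched below. This is a stand-in technique chosen for illustration. The page does not say how lakecode fuses or reranks results, and the function names, the constant `k = 60`, and the sample rankings are assumptions. A learned reranker is not modeled.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each item scores the sum of 1 / (k + rank) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest score first; ties broken by name so the output is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


def select_context(rankings: List[List[str]], budget: int) -> List[str]:
    """Keep only the top `budget` fused items as context for the task."""
    return reciprocal_rank_fusion(rankings)[:budget]


# Hypothetical per-retriever results for "add caching to the user service".
vector = ["caching convention", "cache utility code", "billing dependency graph"]
lexical = ["cache utility code", "caching convention", "user service decision"]
graph = ["user service decision", "cache utility code", "caching convention"]

context = select_context([vector, lexical, graph], budget=3)
```

With these sample rankings the billing item, surfaced by only one retriever, falls outside the budget, while items that several retrievers agree on are kept.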

Meets developers where they already work

Terminal-native agent. And the substrate serves any MCP client — Claude Code, Cursor, your editor.

terminal
$ lakecode chat "why does the auth test fail on CI?"
 
Finding: auth refactor (3 days ago)
Finding: CI env differs from local
Test: auth.test.ts line 42 assertion
 
The token expiry was changed from 1h to 15m
in the auth refactor, but the CI test still
uses a mock with 1h expiry. Fix applied.

  • Runs in your terminal

    A full coding agent alongside git, your editor, and your existing workflow. No new windows.

  • Grounded before it starts

    Auto-primes from the substrate at session start, and persists what it learns — including rationale at commit time.

  • MCP: the substrate as a service

    Your org's memory grounds Claude Code, Cursor, and any MCP client — not just the lakecode CLI.

  • Any language or framework

    TypeScript, Python, Rust, Go, Java — if it lives in a repo, lakecode can learn it.

Owned, not rented

Your models are rented. Your memory is owned — and it compounds with use.

  • A compounding asset

    Every session's exploration becomes the next session's prior. The discount and the correctness edge both grow with usage.

  • Write access — the difference vs RAG

    Doc-RAG reads what was written. lakecode writes what was learned — incident rationale, failed paths, conventions that never existed in any document.

  • Your data, exportable

    The substrate is the customer's data — always exportable. The product is what makes it compound: retrieval tuning, promotion governance, platform projection.

The Databricks edition

Grounded in your lakehouse. Governed in Unity Catalog. lakecode ingests UC lineage and notebook graphs — and projects what your agents know back into UC, so your platform team governs agents with the tooling they already trust.

  • → Ingests UC metadata, lineage, notebooks
  • → Agent knowledge projected into UC lineage & audit
  • → Ships as an in-workspace Databricks App

The enterprise story

Stop re-teaching your agent

lakecode is in private beta with design-partner teams. Your repo, your golden questions, the flywheel demo on your own code.