Databricks work too often lives in the UI: chasing failed runs, checking logs, tweaking compute, hunting permissions, verifying outputs, and explaining spend. Lakecode turns those UI workflows into repeatable terminal workflows with plan → execute → evidence.
Lakecode is a terminal-first Databricks engineering agent that helps you debug failed runs, validate data, analyze spend, audit jobs, and deploy changes without leaving the terminal.
Lakecode isn't "just a model + tools." It's a workflow engine that turns Databricks work into repeatable runbooks.
On startup, lakecode generates a cached workspace profile (catalogs, schemas, table counts, warehouses, recent failures) so it can reason with the same context you'd normally hunt down in the UI.
Most steps are deterministic: SQL queries, API calls, log pulls, metadata inspection. These run through a step-based workflow engine with retries, abort handling, and progress updates.
Lakecode uses an LLM for interpreting error logs, ranking likely root causes, writing clean summaries, and turning raw outputs into actionable reports.
Lakecode applies plan/approval logic to complex or risky work, plus guardrails that prevent common mistakes (like treating function schemas as tables).
For workflows like /prove, lakecode produces an evidence pack (schema, metrics, samples, assessments) so the outcome is shareable and reproducible.
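The step engine described above can be sketched as a small runner that executes named steps in order, retries deterministic failures, and passes a shared context forward. This is a minimal illustration under stated assumptions; `run_workflow`, `StepError`, and the step shape are invented for this sketch and are not lakecode's actual API:

```python
import time


class StepError(Exception):
    """Raised when a step exhausts its retries."""


def run_workflow(steps, max_retries=2, delay=0.0):
    """Run (name, fn) steps in order, sharing a context dict.

    Each fn receives the context and may return a dict of new keys.
    Failed steps are retried up to max_retries before the run aborts.
    """
    ctx = {}
    for name, fn in steps:
        for attempt in range(max_retries + 1):
            try:
                result = fn(ctx)
                if result:
                    ctx.update(result)  # pass data to later steps
                break
            except Exception as exc:
                if attempt == max_retries:
                    raise StepError(
                        f"step '{name}' failed after {attempt + 1} attempts: {exc}"
                    )
                time.sleep(delay)  # back off before retrying
    return ctx


# Example: two deterministic steps passing data forward.
steps = [
    ("fetch_config", lambda ctx: {"job_id": 42}),
    ("fetch_runs",   lambda ctx: {"runs": [f"run-{ctx['job_id']}-1"]}),
]
```

In this shape, a deterministic step (SQL query, API call, log pull) and an LLM step (summarize, rank root causes) are both just functions over the shared context, which is what makes the chain repeatable.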
Opinionated runbooks for real engineer workflows.
/status job <id> → Job config + last 5 runs
/logs <run-id> → Run output + error analysis
/debug job <id> → End-to-end diagnosis: config → failed run → errors → query history → root-cause summary
/queries job <id> → SQL query history for the latest run
/queries run <run-id> → SQL query history for a specific run
/prove <table> → Evidence pack: schema, row counts, nulls, freshness, numeric stats, duplicates, samples + health assessment
/audit jobs → Scans jobs for SLA gaps, missing timeouts, runtime variability, failures; outputs risk-scored report
/cost top → Top spend drivers from billing system tables
/cost spike → Compare day vs baseline; identify likely drivers
/deploy <local> <ws-path> → Import file into workspace
/run job <id> → Trigger job run + poll completion
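The kind of metrics a /prove evidence pack contains can be illustrated in plain Python over a list of rows. This is a hedged sketch: `table_health` is an invented name, and in practice these numbers would come from SQL against the real table rather than in-memory rows:

```python
from collections import Counter


def table_health(rows, key):
    """Compute /prove-style evidence metrics over rows (list of dicts):
    row count, per-column null counts, and duplicate count on a key column."""
    columns = set().union(*(r.keys() for r in rows)) if rows else set()
    nulls = {c: sum(1 for r in rows if r.get(c) is None) for c in columns}
    key_counts = Counter(r.get(key) for r in rows)
    duplicates = sum(n - 1 for n in key_counts.values() if n > 1)
    return {
        "row_count": len(rows),
        "null_counts": nulls,
        "duplicate_keys": duplicates,
    }


sample = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 2, "amount": 5.0},   # duplicate key
]
```

The point of packaging results this way is that the same checks run identically next time, so a reviewer can compare two evidence packs rather than two screenshots.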
Lakecode ships opinionated runbooks for real engineer workflows: diagnose a failure, prove correctness, explain a cost spike, audit operational risk, deploy safely.
Lakecode generates evidence you can attach to a PR, incident, or audit trail — not just a success/fail status.
Instead of bouncing between Jobs UI, cluster logs, UC permissions, SQL editor, and dashboards, lakecode collects the context and turns it into a single actionable report.
Lakecode runs on your machine and connects to Databricks using your existing authentication (for example via the Databricks CLI profile).
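For example, a standard Databricks CLI profile in ~/.databrickscfg is sufficient; the values below are placeholders:

```ini
[DEFAULT]
host  = https://example-workspace.cloud.databricks.com
token = dapiXXXXXXXXXXXXXXXX
```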
Lakecode only sees what your Databricks identity can see. If you don't have access to a catalog, schema, or table, lakecode won't either.
Lakecode includes plan/approval logic for complex or risky operations, plus guardrails that prevent common high-cost or high-risk mistakes.
Workflows produce reproducible artifacts so teams can share results instead of relying on screenshots and Slack messages.
Install lakecode and start running workflows in minutes.
Is lakecode just a wrapper around the Databricks APIs?
No. Lakecode uses Databricks APIs and SQL to collect facts, but the value is the workflow engine: it chains steps, manages context, and produces a consistent report/evidence outcome.
Do I have to script the steps myself?
No. You describe the goal ("debug this job", "prove this table"), and lakecode executes the steps and explains the results.
Is lakecode read-only?
Some commands are read-only; others (like /deploy or /run job) perform actions. Lakecode is designed to be safe and explicit about what it's doing.
Which models does lakecode use?
Lakecode supports multiple model adapters (Anthropic and OpenAI today), and you can choose one based on your preferences and environment.
How does lakecode compare to Databricks Assistant or the AI Dev Kit?
Think of lakecode as terminal-native Databricks runbooks plus mission control, optimized for reproducibility and evidence, whereas Assistant and the AI Dev Kit are primarily tool/assistant building blocks.
What lakecode is and what problem it solves — turning UI-driven Databricks work into repeatable, auditable workflows.
The step-based workflow engine: deterministic steps, LLM steps, data passing, retries, and abort handling.
Complete reference for all available commands across debugging, data quality, platform ops, and deployment.
What evidence packs are, how /prove uses them today, and the roadmap for structured JSON outputs and replay plans.
Conventions files, skills injection, default catalogs/schemas, naming patterns, and team standards.
Plan/approval flow, guardrails, output truncation, and the roadmap for policy packs and org settings.
Lakecode is a terminal-native Databricks engineering agent. It turns UI-driven Databricks work — debugging runs, validating data, analyzing spend, auditing jobs — into repeatable workflows you can run from the terminal.
Databricks is powerful, but many critical workflows remain manual and UI-bound: chasing failed runs, checking logs, hunting permissions, verifying outputs, and explaining spend.
Lakecode makes those workflows explicit and repeatable, producing evidence you can share.
Lakecode is built around a step-based workflow engine. Each workflow can run deterministic steps (SQL queries, API calls, log pulls), run LLM steps (error interpretation, root-cause ranking, summaries), pass data between steps, retry failed steps, and abort safely.
Lakecode also maintains a lightweight workspace profile to reduce "where is that job/table/schema?" context switching.
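A cached profile like this can be as simple as a JSON snapshot with a freshness check. A minimal sketch, assuming a TTL-based cache; `load_profile` and the field names are illustrative, not lakecode's actual implementation:

```python
import json
import time
from pathlib import Path

PROFILE_TTL_S = 3600  # assumed refresh interval: one hour


def load_profile(path, refresh_fn, ttl=PROFILE_TTL_S, now=time.time):
    """Return the cached workspace profile, regenerating it when stale.

    refresh_fn would normally query the workspace (catalogs, schemas,
    warehouses, recent failures); here it is any callable returning a dict.
    """
    p = Path(path)
    if p.exists():
        cached = json.loads(p.read_text())
        if now() - cached["generated_at"] < ttl:
            return cached  # still fresh: no workspace round-trips
    profile = refresh_fn()
    profile["generated_at"] = now()
    p.write_text(json.dumps(profile))
    return profile
```

The payoff is that repeated questions ("which schema was that table in?") are answered from the snapshot instead of re-querying the workspace each time.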
An evidence pack is a set of outputs (metrics, schema snapshots, samples, summaries) produced by a workflow so the result can be shared and reviewed.
/prove produces a table health evidence pack.
Lakecode will standardize evidence packs across every workflow (debug, cost, governance, deploy), including structured JSON outputs and replay plans.
Lakecode loads conventions from:
~/.lakecode/conventions.md (user-level)
.lakecode/conventions.md (project-level)

Use conventions to teach lakecode default catalogs and schemas, naming patterns, and team standards.
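Layered conventions files like these are typically concatenated with the project-level file last, so it can override earlier guidance. A sketch under that assumption; `load_conventions` is invented here and is not lakecode's actual loader:

```python
from pathlib import Path


def load_conventions(home=Path.home(), project=Path(".")):
    """Concatenate user-level and project-level conventions.

    Project text comes last, so project standards take precedence
    over user-level defaults when the two disagree.
    """
    parts = []
    for path in (home / ".lakecode" / "conventions.md",
                 project / ".lakecode" / "conventions.md"):
        if path.exists():
            parts.append(path.read_text())
    return "\n\n".join(parts)
```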
Lakecode can load relevant Databricks skills based on keywords in your requests. Skills are treated as guidance; workflows remain deterministic where possible.
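Keyword-triggered skill selection can be sketched very simply. The skill names and trigger words below are invented for illustration and do not reflect lakecode's actual skill registry:

```python
# Hypothetical registry: skill name -> trigger keywords.
SKILLS = {
    "jobs-debugging": {"job", "run", "failed", "logs"},
    "cost-analysis":  {"cost", "spend", "billing", "spike"},
}


def select_skills(request, registry=SKILLS):
    """Return skills whose trigger keywords appear in the request text."""
    words = set(request.lower().split())
    return sorted(name for name, keys in registry.items() if words & keys)
```

Because the selected skills only steer guidance, the deterministic steps of a workflow run the same way whether or not a skill was injected.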
Lakecode is designed to be safe by default: plan/approval for complex or risky operations, guardrails against common high-cost or high-risk mistakes, and output truncation.
As more write-capable workflows are added, policy packs and org-managed settings will become first-class.
Stop chasing UI workflows. Start producing evidence.