85% fewer input tokens. ~84% cheaper per session.

The local brain between your editor and your LLM — a typed code graph, the docs that go with it, the git history, the chat memory, and a small local orchestrator that prunes context before any frontier call.

TypeScript · JavaScript · TSX / JSX · Python

What it is

Codegraph parses your repo into a typed graph with embeddings, then layers in the meaning (docs, features), the history (git, coverage, production errors), and your memory (past decisions, chat distillations).
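To make "typed graph" concrete, here is a minimal sketch of the idea, not Codegraph's actual parser or schema. It uses Python's standard-library ast module as a stand-in for Tree-sitter, and the Node and Edge shapes, the build_graph helper, and the sample module are all hypothetical names chosen for illustration.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str    # qualified name, e.g. "shop.format_price"
    kind: str  # only "function" in this sketch


@dataclass(frozen=True)
class Edge:
    src: str   # caller
    dst: str   # callee
    kind: str  # only "calls" in this sketch


def build_graph(module_name: str, source: str) -> tuple[list[Node], list[Edge]]:
    """Turn one module's top-level functions into typed nodes and call edges."""
    tree = ast.parse(source)
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    known = {f.name for f in functions}

    nodes = [Node(f"{module_name}.{f.name}", "function") for f in functions]
    edges = []
    for f in functions:
        for item in ast.walk(f):
            # Keep only direct calls to functions defined in this module.
            if (
                isinstance(item, ast.Call)
                and isinstance(item.func, ast.Name)
                and item.func.id in known
            ):
                edges.append(
                    Edge(f"{module_name}.{f.name}", f"{module_name}.{item.func.id}", "calls")
                )
    return nodes, edges


# Hypothetical sample module, not taken from any real repo.
SAMPLE = '''
def format_price(cents):
    return f"${cents / 100:.2f}"

def render_cart(items):
    return [format_price(i) for i in items]

def checkout(items):
    return render_cart(items)
'''

nodes, edges = build_graph("shop", SAMPLE)
for edge in edges:
    print(edge)
```

A real index would add more node and edge kinds (docs, commits, features) and attach embeddings; the point here is only that questions are answered from typed rows like these rather than from raw file text.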

A local small language model (SLM) orchestrator plans and routes; ~80% of questions are answered on-machine without touching a frontier model. Your editor asks questions over the Model Context Protocol (MCP) and gets typed graph answers, so only the real generation work hits a paid model.

Get started

Install
npm i -g @leanlabsinnov/codegraph

Requires Node 20+. One command gets you from zero to serving.

Run — setup, index, and serve in one command
codegraph run .

Prompts for an LLM provider if unconfigured, persists keys to ~/.codegraph/.env, runs a quick self-test, incrementally indexes the repo, and boots the MCP server with a live spinner. Add --watch to auto re-index on file changes.
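How the incremental index decides what to re-process is not documented here. One common approach, shown purely as an assumption, is to compare content hashes against the previous run and re-index only what changed. The plan_reindex helper and the file paths below are hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content hash used to detect changes between index runs."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(previous: dict[str, str], files: dict[str, bytes]):
    """Compare current file contents with the last run's hashes.

    Returns (changed_or_new_paths, removed_paths, new_hash_state).
    """
    current = {path: digest(data) for path, data in files.items()}
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed, current


# First run: nothing is indexed yet, so every file counts as changed.
files = {"src/cart.ts": b"v1", "src/price.ts": b"v1", "src/legacy.ts": b"v1"}
changed1, removed1, state = plan_reindex({}, files)

# Second run: one file edited, one deleted, one untouched.
files["src/price.ts"] = b"v2"
del files["src/legacy.ts"]
changed2, removed2, state = plan_reindex(state, files)

print(changed1, removed1)
print(changed2, removed2)
```

A watch mode along these lines would rerun the same comparison whenever the file system reports a change, so an edit to one file costs one re-parse rather than a full re-index.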

Or use the interactive wizard
codegraph init

Step-by-step guided setup: LLM provider, live credential test, optional index, then codegraph serve in a second terminal and a copy-paste MCP config for Cursor, Claude Code, or Windsurf. Post-install: MCP check, Cursor & Claude rules →

How it works

Question: your editor asks "What's the blast radius of renaming formatPrice?"

Codegraph solves it here:
1 · Parse: Tree-sitter over JS/TS/Python
2 · Graph: Symbols, features, docs, commits, memory
3 · Embed: Vectors for semantic search + recall
4 · MCP tools: 26 typed queries · refactor · recall
5 · Local SLM: Plans tool calls on-machine · routes ~80% without a frontier call · validates output

Structured answer: the editor gets graph rows. Typed results, not file dumps, fed to the frontier model only if it still needs to generate code.

Codegraph sits between your editor and your LLM, turning vague questions into typed graph answers so the model burns far less context.
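As an illustration of why a graph answer is smaller than a file dump, the blast-radius question can be sketched as a breadth-first walk over reversed call edges. This is a toy under stated assumptions, not one of Codegraph's MCP tools: the CALLS edge list, the blast_radius function, and the row fields are all hypothetical.

```python
from collections import deque

# Hypothetical (caller, callee) edges; not output from Codegraph.
CALLS = [
    ("renderCart", "formatPrice"),
    ("renderInvoice", "formatPrice"),
    ("checkoutPage", "renderCart"),
    ("emailReceipt", "renderInvoice"),
    ("healthCheck", "pingDb"),
]


def blast_radius(symbol: str, calls: list[tuple[str, str]]) -> list[dict]:
    """Return every direct or transitive caller of `symbol` as typed rows."""
    # Reverse adjacency: callee -> list of callers.
    callers: dict[str, list[str]] = {}
    for src, dst in calls:
        callers.setdefault(dst, []).append(src)

    rows = []
    seen = {symbol}
    queue = deque([(symbol, 0)])
    while queue:
        current, depth = queue.popleft()
        for caller in callers.get(current, []):
            if caller not in seen:
                seen.add(caller)
                rows.append({"symbol": caller, "via": current, "depth": depth + 1})
                queue.append((caller, depth + 1))
    return rows


rows = blast_radius("formatPrice", CALLS)
for row in rows:
    print(row)
```

Four short rows describe everything affected by the rename, and unrelated symbols such as healthCheck never enter the context.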

Performance

Benchmark: five real developer questions on the antilist repo (66 files, ~50k source tokens), each answered twice: once with raw file context and once with Codegraph MCP tools.

85% fewer input tokens
84% cost reduction
3.4 avg tool calls per question
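| Question | Direct (input tokens) | MCP (input tokens) | Tool calls | Saved |
|----------|-----------------------|--------------------|------------|-------|
| Q1       | 50,322                | 8,978              | 6          | 82%   |
| Q2       | 50,327                | 5,533              | 3          | 89%   |
| Q3       | 50,324                | 9,766              | 2          | 81%   |
| Q4       | 50,323                | 2,229              | 1          | 96%   |
| Q5       | 50,324                | 11,912             | 5          | 76%   |
| Total    | 251,620               | 38,418             | 17         | 85%   |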

Cost on gpt-4o-mini ($0.15 / M in, $0.60 / M out): $0.0387 → $0.0062 across all five questions — and 38% faster end-to-end. The same run on local llama3.1:8b via Ollama saves 63% of prompt tokens at $0 marginal cost. See full breakdown →
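The headline figures can be rechecked from the benchmark's per-question input-token counts. The sketch below recomputes the savings percentages and prices the input side at the quoted $0.15 per million tokens. Output-token counts are not listed here, so the input-only costs it prints come out lower than the quoted totals (which presumably include output); the variable and function names are illustrative.

```python
# (direct input tokens, MCP input tokens) per question, from the benchmark.
RUNS = {
    "Q1": (50_322, 8_978),
    "Q2": (50_327, 5_533),
    "Q3": (50_324, 9_766),
    "Q4": (50_323, 2_229),
    "Q5": (50_324, 11_912),
}
TOOL_CALLS = [6, 3, 2, 1, 5]
PRICE_IN_PER_M = 0.15  # USD per million input tokens (gpt-4o-mini, as quoted)


def saved_pct(direct: int, mcp: int) -> int:
    """Percentage of input tokens saved, rounded to a whole percent."""
    return round(100 * (1 - mcp / direct))


per_question = {q: saved_pct(d, m) for q, (d, m) in RUNS.items()}
total_direct = sum(d for d, _ in RUNS.values())
total_mcp = sum(m for _, m in RUNS.values())
total_saved = saved_pct(total_direct, total_mcp)
avg_tool_calls = sum(TOOL_CALLS) / len(TOOL_CALLS)

# Input-side cost only; output tokens are not part of the listed data.
direct_input_cost = total_direct * PRICE_IN_PER_M / 1_000_000
mcp_input_cost = total_mcp * PRICE_IN_PER_M / 1_000_000

# Cost reduction implied by the quoted totals ($0.0387 -> $0.0062).
quoted_cost_reduction = round(100 * (1 - 0.0062 / 0.0387))

print(per_question)
print(total_direct, total_mcp, total_saved, avg_tool_calls)
print(f"{direct_input_cost:.4f} {mcp_input_cost:.4f} {quoted_cost_reduction}")
```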

Setup guide

The fastest path is codegraph run . — it handles config, keys, indexing, and serving in one go. Add --watch to auto re-index when files change. If you used codegraph init instead, keep codegraph serve running, confirm MCP shows 26 tools, then add project rules so your assistant reaches for the graph before dumping files into context.

Post-install checklist, Cursor & Claude Code rules, and advanced setup →