CodeGraph
85% fewer input tokens. ~84% cheaper per session.
The local brain between your editor and your LLM — a typed code graph, the docs that go with it, the git history, the chat memory, and a small local orchestrator that prunes context before any frontier call.
Codegraph parses your repo into a typed graph with embeddings, then layers in the meaning (docs, features), the history (git, coverage, production errors), and your memory (past decisions, chat distillations).
A local SLM orchestrator plans and routes; ~80% of questions are answered on-machine without touching a frontier model. Your editor asks questions over MCP and gets typed graph answers — and only the real generation work hits a paid model.
npm i -g @leanlabsinnov/codegraph
Requires Node 20+. One command takes you from zero to serving:
codegraph run .
Prompts for an LLM provider if unconfigured, persists keys to ~/.codegraph/.env, runs a quick self-test, incrementally indexes the repo, and boots the MCP server with a live spinner. Add --watch to auto re-index on file changes.
codegraph init
Step-by-step guided setup: LLM provider, live credential test, optional index, then codegraph serve in a second terminal and a copy-paste MCP config for Cursor, Claude Code, or Windsurf. Post-install: MCP check, Cursor & Claude rules →
Layered sources: .md docs; git log; lcov/Cobertura; Sentry.
Codegraph sits between your editor and your LLM, turning vague questions into typed graph answers so the model burns far less context.
Five real developer questions on the antilist repo (66 files, ~50k source tokens) answered with raw file context vs. Codegraph MCP tools.
| Question | Direct (in) | MCP (in) | Tools | Saved |
|---|---|---|---|---|
| Q1 | 50,322 | 8,978 | 6 | 82% |
| Q2 | 50,327 | 5,533 | 3 | 89% |
| Q3 | 50,324 | 9,766 | 2 | 81% |
| Q4 | 50,323 | 2,229 | 1 | 96% |
| Q5 | 50,324 | 11,912 | 5 | 76% |
| Total | 251,620 | 38,418 | 17 | 85% |
Cost on gpt-4o-mini ($0.15 / M in, $0.60 / M out): $0.0387 → $0.0062 across all five questions — and 38% faster end-to-end. The same run on local llama3.1:8b via Ollama saves 63% of prompt tokens at $0 marginal cost. See full breakdown →
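The percentages in the table can be re-derived from the token counts alone. The sketch below does that arithmetic and prices the input side at the quoted gpt-4o-mini rate. The function and variable names are illustrative and not part of Codegraph. Output tokens are not listed in the table, so the sketch covers input cost only.

```python
# Input-token counts copied from the benchmark table: (direct, MCP).
ROWS = {
    "Q1": (50_322, 8_978),
    "Q2": (50_327, 5_533),
    "Q3": (50_324, 9_766),
    "Q4": (50_323, 2_229),
    "Q5": (50_324, 11_912),
}
PRICE_PER_M_INPUT = 0.15  # USD per million input tokens, as quoted for gpt-4o-mini


def saved_pct(direct: int, mcp: int) -> float:
    """Percentage of input tokens avoided by the MCP run."""
    return 100.0 * (1 - mcp / direct)


def input_cost(tokens: int) -> float:
    """Input-side cost in USD; output tokens are deliberately excluded."""
    return tokens * PRICE_PER_M_INPUT / 1_000_000


total_direct = sum(direct for direct, _ in ROWS.values())
total_mcp = sum(mcp for _, mcp in ROWS.values())

for name, (direct, mcp) in ROWS.items():
    print(f"{name}: {saved_pct(direct, mcp):.0f}% saved")
print(f"Total: {saved_pct(total_direct, total_mcp):.0f}% saved")
print(f"Input cost: ${input_cost(total_direct):.4f} vs ${input_cost(total_mcp):.4f}")
```

The input side alone comes to roughly $0.0377 versus $0.0058. The gap to the quoted session totals is presumably output tokens, which this sketch does not model.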
- … useAuth and what's the blast radius of renaming it?
- … services/auth.py and what's changed there in the last 30 days?
- … format_price?
- … packages/payments.
- … formatPrice to formatCurrency everywhere, deterministically.

The fastest path is codegraph run . (config, keys, indexing, and serving in one go). Add --watch to auto re-index when files change. If you used codegraph init instead, keep codegraph serve running, confirm MCP shows 26 tools, then add project rules so your assistant reaches for the graph before dumping files into context.
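To make the blast-radius question above concrete, here is a minimal sketch of a typed graph with a reverse-reachability query. It is not Codegraph's actual schema or API: the `Node` and `CodeGraph` classes, the edge kinds, and the example node names are all hypothetical, chosen only to show why a graph lookup needs far less context than reading every file.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str    # hypothetical identifier, e.g. "useAuth"
    kind: str  # hypothetical node type: "function", "component", ...


class CodeGraph:
    """Toy typed graph: nodes carry a kind, edges carry a kind."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}
        # Reverse index: target id -> set of (edge kind, source id).
        self._incoming: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def add_node(self, node_id: str, kind: str) -> None:
        self.nodes[node_id] = Node(node_id, kind)

    def add_edge(self, src: str, kind: str, dst: str) -> None:
        self._incoming[dst].add((kind, src))

    def blast_radius(self, target: str, kinds=("calls", "imports")) -> dict[str, int]:
        """Breadth-first walk over incoming edges.

        Returns every node that directly or transitively depends on
        `target`, mapped to its hop distance from `target`.
        """
        distance = {target: 0}
        queue = deque([target])
        while queue:
            current = queue.popleft()
            for kind, src in self._incoming.get(current, ()):
                if kind in kinds and src not in distance:
                    distance[src] = distance[current] + 1
                    queue.append(src)
        del distance[target]  # the target itself is not part of its own radius
        return distance


graph = CodeGraph()
for node_id, kind in [("useAuth", "function"), ("LoginForm", "component"),
                      ("ProfilePage", "component"), ("App", "component")]:
    graph.add_node(node_id, kind)
graph.add_edge("LoginForm", "calls", "useAuth")
graph.add_edge("ProfilePage", "calls", "useAuth")
graph.add_edge("App", "imports", "LoginForm")

radius = graph.blast_radius("useAuth")
print(sorted(radius.items(), key=lambda item: (item[1], item[0])))
```

In this toy example, renaming `useAuth` touches two direct callers and one component one hop further out, and the answer is a handful of identifiers, not whole source files.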
Post-install checklist, Cursor & Claude Code rules, and advanced setup →