Active · 2025

Atlas / Code Context MCP

MCP server that indexes codebases via Tree-sitter AST and serves semantically relevant context to Claude Code, cutting token usage by 85–90%.

  • 85–90% token reduction
  • <100ms query latency
  • 15+ MCP tools
  • ~1k lines/sec indexing speed

Architecture

  • SQLite storage layer for symbol index, relationship graph, and error KB
  • Tree-sitter AST parsing for Python, TypeScript, Go, Java
  • FAISS vector search with text fallback for semantic queries
  • Code knowledge graph with multi-granularity traversal (Method → Class → Module)
  • Token-aware chunking (max 500 tokens/chunk, 4000 token budget per query)
  • Pluggable LLM layer with content-hash caching (optional, gated by env flag)
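The token-aware chunking above can be sketched as a greedy packer: split any snippet that exceeds the per-chunk cap, then fill the per-query budget in order. This is a minimal illustration (the whitespace-based `rough_token_count` is a stand-in for a real tokenizer), not Atlas's actual implementation.

```python
import re

MAX_CHUNK_TOKENS = 500     # per-chunk cap (from the project description)
QUERY_TOKEN_BUDGET = 4000  # total context budget per query

def rough_token_count(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(re.findall(r"\S+", text))

def pack_chunks(snippets: list[str]) -> list[str]:
    """Greedily pack snippets into the query budget, splitting any
    snippet that exceeds the per-chunk cap."""
    selected, used = [], 0
    for snippet in snippets:
        words = re.findall(r"\S+", snippet)
        # Split oversized snippets at the per-chunk boundary.
        pieces = [" ".join(words[i:i + MAX_CHUNK_TOKENS])
                  for i in range(0, len(words), MAX_CHUNK_TOKENS)]
        for piece in pieces:
            cost = rough_token_count(piece)
            if used + cost > QUERY_TOKEN_BUDGET:
                return selected  # budget exhausted; stop packing
            selected.append(piece)
            used += cost
    return selected
```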

Key Decisions

SQLite over PostgreSQL

Why: Zero-config, runs anywhere, no server process — matches developer tooling UX expectations

Tradeoff: No concurrent write support, but MCP servers are single-user
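The zero-config argument is concrete in code: Python's stdlib `sqlite3` needs no server process at all. The schema below is a hypothetical minimal version of a symbol index and relationship graph; table and column names are illustrative, not Atlas's actual schema.

```python
import sqlite3

# Zero-config storage: a single file (here in-memory) with no server process.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symbols (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    kind      TEXT NOT NULL,          -- 'function' | 'class' | 'module'
    file_path TEXT NOT NULL,
    line      INTEGER
);
CREATE TABLE relationships (
    src_id  INTEGER REFERENCES symbols(id),
    dst_id  INTEGER REFERENCES symbols(id),
    kind    TEXT NOT NULL             -- 'calls' | 'imports' | 'contains'
);
CREATE INDEX idx_symbols_name ON symbols(name);
""")

# Index one symbol and look it up by name.
conn.execute(
    "INSERT INTO symbols (name, kind, file_path, line) VALUES (?, ?, ?, ?)",
    ("parse_file", "function", "atlas/indexer.py", 42),
)
row = conn.execute(
    "SELECT kind, file_path FROM symbols WHERE name = ?", ("parse_file",)
).fetchone()
```

The single-writer limitation this decision accepts is a non-issue here: one MCP server process owns the database file.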

LLM features optional (ATLAS_LLM_ENABLED flag)

Why: Core value is AST-based context extraction, not LLM summaries

Tradeoff: Some features (code summarization) are degraded without LLM
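A sketch of how the env-flag gate and content-hash cache might fit together, assuming the flag name from the project; `call_llm` is a hypothetical placeholder for the pluggable model backend, and the first-line fallback is an invented stand-in for the degraded path.

```python
import hashlib
import os

def llm_enabled() -> bool:
    # LLM features are opt-in via an environment flag.
    return os.environ.get("ATLAS_LLM_ENABLED", "").lower() in ("1", "true", "yes")

def call_llm(code: str) -> str:
    # Hypothetical placeholder for the pluggable model backend.
    raise NotImplementedError("wire up a real model here")

_summary_cache: dict[str, str] = {}

def summarize(code: str) -> str:
    """Return an LLM summary when enabled, caching by content hash so an
    unchanged file never triggers a second model call."""
    if not llm_enabled():
        # Degraded path: cheap heuristic summary (first non-empty line).
        return code.strip().splitlines()[0] if code.strip() else ""
    key = hashlib.sha256(code.encode()).hexdigest()
    if key not in _summary_cache:
        _summary_cache[key] = call_llm(code)
    return _summary_cache[key]
```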

Technologies

Python · MCP · FAISS · Tree-sitter · SQLite

What I Learned

  • Tree-sitter AST parsing handles edge cases (decorators, nested classes, multiline signatures) that regex misses entirely.
  • Graph-aware retrieval (traversing call chains and imports) surfaces more relevant context than pure vector similarity.
  • An error knowledge base that logs fixes alongside errors turned out to be surprisingly useful for recurring issues.
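The graph-aware retrieval point can be made concrete with a breadth-first walk over a call graph: the context for a symbol is everything reachable within a few hops, which pure vector similarity would miss. The toy graph and symbol names below are illustrative, not from the real index.

```python
from collections import deque

# Toy call graph: each symbol maps to the symbols it calls or imports.
CALL_GRAPH = {
    "handle_request": ["validate", "query_db"],
    "query_db": ["open_conn"],
    "validate": [],
    "open_conn": [],
}

def related_symbols(start: str, max_hops: int = 2) -> list[str]:
    """Breadth-first traversal: context for `start` includes every symbol
    reachable within `max_hops` call/import edges, in discovery order."""
    seen, order = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth >= max_hops:
            continue  # hop limit reached; don't expand further
        for callee in CALL_GRAPH.get(node, []):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append((callee, depth + 1))
    return order
```

Starting from `handle_request`, the walk surfaces `open_conn` even though it is two hops away and may share no tokens with the query.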