Active · 2025 – Present
SecondBrain
Full-stack AI knowledge assistant with a dual-brain architecture: a personal knowledge base plus project working memory. Built on a Neo4j knowledge graph, pgvector similarity search, and RAG chat with SSE streaming.
- Pipeline steps: 9
- Vector dimension: 384 (BGE)
- Chunk size: 800 tokens
- Storage: PostgreSQL + Neo4j
Architecture
- FastAPI with PostgreSQL + pgvector (384-dim BGE vectors) + Neo4j
- 9-step document pipeline: load → semantic chunk → embed → LLM entity extraction → hash dedup → entity consolidation → graph sync → relationship detection → visualization API
- RAG chat with SSE streaming and hybrid retrieval — pgvector for similarity, Neo4j for relationship traversal
- Dual brain: Layer 1 (personal KB — concepts, experiences, learnings) and Layer 2 (project memory — active projects, tasks, brainstorms)
- Knowledge graduation: completed project learnings auto-promote to personal KB
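The hybrid retrieval above returns two ranked lists: chunks from pgvector similarity search and chunks reached by Neo4j relationship traversal. The project doesn't specify how the two are merged; one common approach is reciprocal rank fusion (RRF), sketched below with stubbed result lists (the function and chunk ids are invented for illustration):

```python
# Hypothetical sketch: merging pgvector similarity hits with Neo4j
# traversal hits via reciprocal rank fusion (RRF). The real project's
# merge strategy isn't specified; names and ids here are invented.

def rrf_merge(vector_hits, graph_hits, k=60):
    """Combine two ranked lists of chunk ids into one ranking."""
    scores = {}
    for ranked in (vector_hits, graph_hits):
        for rank, chunk_id in enumerate(ranked):
            # Each list contributes 1/(k + rank + 1) to a chunk's score,
            # so chunks that appear high in both lists float to the top.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Stub: ids a pgvector ORDER BY embedding <=> :query LIMIT n might return
vector_hits = ["c12", "c07", "c33"]
# Stub: ids reached by a Cypher traversal from entities in the query
graph_hits = ["c07", "c18", "c12"]

print(rrf_merge(vector_hits, graph_hits))  # c07 and c12 rank highest
```

RRF needs no score calibration between the two stores, which matters here because cosine distances and graph-hop counts aren't comparable.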
Key Decisions
Hybrid storage (pgvector + Neo4j) over single DB
Why: Similarity search and relationship traversal serve genuinely different query patterns
Tradeoff: Operational complexity of two databases, Cypher queries to maintain
LLM boundary detection for semantic chunking
Why: Better retrieval quality than fixed-size chunking
Tradeoff: 3–5x slower ingestion, but it runs as a background task, so the latency is acceptable
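Semantic chunking with LLM boundary detection might look like the sketch below. The real prompt and model aren't described, so `propose_boundaries` is a stand-in for the LLM call, and the token counting is a crude whitespace estimate rather than the project's actual tokenizer:

```python
# Hypothetical sketch of semantic chunking with LLM boundary detection.
# propose_boundaries stands in for an LLM call that returns sentence
# indices where the topic shifts; the heuristic here is illustrative only.

def propose_boundaries(sentences):
    # Stand-in for the LLM: break whenever the opening word changes.
    # A real implementation would prompt a model with the sentence list.
    return [i for i in range(1, len(sentences))
            if sentences[i].split()[0] != sentences[i - 1].split()[0]]

def semantic_chunks(sentences, max_tokens=800):
    boundaries = set(propose_boundaries(sentences))
    chunks, current, budget = [], [], 0
    for i, sentence in enumerate(sentences):
        n_tokens = len(sentence.split())  # crude token estimate
        # Close the current chunk at an LLM-proposed boundary or when
        # the 800-token budget from the stat card would be exceeded.
        if current and (i in boundaries or budget + n_tokens > max_tokens):
            chunks.append(" ".join(current))
            current, budget = [], 0
        current.append(sentence)
        budget += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks

docs = ["Graphs model relationships.", "Graphs have nodes and edges.",
        "Vectors capture similarity.", "Vectors live in embedding space."]
print(semantic_chunks(docs))  # two topic-aligned chunks
```

The 3–5x slowdown comes from the per-document LLM calls; fixed-size chunking needs none.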
Technologies
FastAPI · PostgreSQL · pgvector · Neo4j · SSE Streaming
What I Learned
- Hybrid storage (pgvector + Neo4j) is worth the complexity when you need both similarity search and relationship traversal.
- Semantic chunking with LLM boundary detection produces better retrieval than fixed-size chunking, but it's 3–5x slower.
- Entity consolidation (merging 'React', 'React.js', 'ReactJS' into one canonical entity) is harder than expected and still isn't perfect.
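The entity-consolidation difficulty above can be seen even in the easy, string-level part of the problem. The sketch below folds alias variants onto one canonical entity by name normalization; the rules and function names are invented, and as the lesson says, normalization alone doesn't catch everything (e.g. genuinely different names for the same thing):

```python
# Hypothetical sketch of the string-level part of entity consolidation:
# fold alias variants onto one canonical entity. The real pipeline uses
# LLM extraction; rules here are illustrative and deliberately narrow.
import re
from collections import defaultdict

def normalize(name):
    # Lowercase and drop punctuation, then strip a trailing "js" so
    # "React", "React.js", and "ReactJS" all collide on the same key.
    key = re.sub(r"[^a-z0-9]", "", name.lower())
    return re.sub(r"js$", "", key)

def consolidate(mentions):
    groups = defaultdict(list)
    for mention in mentions:
        groups[normalize(mention)].append(mention)
    # Pick the most frequent surface form in each group as canonical.
    return {max(group, key=group.count): group for group in groups.values()}

print(consolidate(["React", "React.js", "ReactJS", "React", "Neo4j"]))
# {'React': ['React', 'React.js', 'ReactJS', 'React'], 'Neo4j': ['Neo4j']}
```

Rules like the `js$` strip are exactly where this gets hard: the same rule that merges "ReactJS" would wrongly merge unrelated names ending in "js", which is why the step "still isn't perfect".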