Active · 2025 – Present
SecondBrain
Full-stack AI knowledge assistant with a dual-brain architecture: a personal knowledge base plus project working memory. Built on a Neo4j knowledge graph, pgvector similarity search, and RAG chat with SSE streaming.
- Pipeline steps: 9
- Vector dimension: 384 (BGE)
- Chunk size: 800 tokens
- Storage: PostgreSQL + Neo4j
Architecture
- FastAPI with PostgreSQL + pgvector (384-dim BGE vectors) + Neo4j
- 9-step document pipeline: load → semantic chunk → embed → LLM entity extraction → hash dedup → entity consolidation → graph sync → relationship detection → visualization API
- RAG chat with SSE streaming and hybrid retrieval — pgvector for similarity, Neo4j for relationship traversal
- Dual brain: Layer 1 (personal KB — concepts, experiences, learnings) and Layer 2 (project memory — active projects, tasks, brainstorms)
- Knowledge graduation: completed project learnings auto-promote to personal KB
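The hybrid retrieval above returns two ranked lists: chunks from pgvector similarity search and chunks reached by Neo4j relationship traversal. The project doesn't specify how the two are merged; one common approach is reciprocal rank fusion (RRF), sketched below with stubbed result lists (the function and chunk ids are invented for illustration):

```python
# Hypothetical sketch: merging pgvector similarity hits with Neo4j
# traversal hits via reciprocal rank fusion (RRF). The real project's
# merge strategy isn't specified; names and ids here are invented.

def rrf_merge(vector_hits, graph_hits, k=60):
    """Combine two ranked lists of chunk ids into one ranking."""
    scores = {}
    for ranked in (vector_hits, graph_hits):
        for rank, chunk_id in enumerate(ranked):
            # Each list contributes 1/(k + rank + 1) to a chunk's score,
            # so chunks that appear high in both lists float to the top.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Stub: ids a pgvector ORDER BY embedding <=> :query LIMIT n might return
vector_hits = ["c12", "c07", "c33"]
# Stub: ids reached by a Cypher traversal from entities in the query
graph_hits = ["c07", "c18", "c12"]

print(rrf_merge(vector_hits, graph_hits))  # c07 and c12 rank highest
```

RRF needs no score calibration between the two stores, which matters here because cosine distances and graph-hop counts aren't comparable.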
Key Decisions
Hybrid storage (pgvector + Neo4j) over single DB
Why: Similarity search and relationship traversal serve genuinely different query patterns
Tradeoff: Operational complexity of two databases, Cypher queries to maintain
LLM boundary detection for semantic chunking
Why: Better retrieval quality than fixed-size chunking
Tradeoff: 3–5x slower ingestion, but it runs as a background task, so the latency is acceptable
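Semantic chunking with LLM boundary detection might look like the sketch below. The real prompt and model aren't described, so `propose_boundaries` is a stand-in for the LLM call, and the token counting is a crude whitespace estimate rather than the project's actual tokenizer:

```python
# Hypothetical sketch of semantic chunking with LLM boundary detection.
# propose_boundaries stands in for an LLM call that returns sentence
# indices where the topic shifts; the heuristic here is illustrative only.

def propose_boundaries(sentences):
    # Stand-in for the LLM: break whenever the opening word changes.
    # A real implementation would prompt a model with the sentence list.
    return [i for i in range(1, len(sentences))
            if sentences[i].split()[0] != sentences[i - 1].split()[0]]

def semantic_chunks(sentences, max_tokens=800):
    boundaries = set(propose_boundaries(sentences))
    chunks, current, budget = [], [], 0
    for i, sentence in enumerate(sentences):
        n_tokens = len(sentence.split())  # crude token estimate
        # Close the current chunk at an LLM-proposed boundary or when
        # the 800-token budget from the stat card would be exceeded.
        if current and (i in boundaries or budget + n_tokens > max_tokens):
            chunks.append(" ".join(current))
            current, budget = [], 0
        current.append(sentence)
        budget += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks

docs = ["Graphs model relationships.", "Graphs have nodes and edges.",
        "Vectors capture similarity.", "Vectors live in embedding space."]
print(semantic_chunks(docs))  # two topic-aligned chunks
```

The 3–5x slowdown comes from the per-document LLM calls; fixed-size chunking needs none.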
Technologies
FastAPI · PostgreSQL · pgvector · Neo4j · SSE Streaming
What I Learned
- Hybrid storage (pgvector + Neo4j) is worth the complexity when you need both similarity search and relationship traversal.
- Semantic chunking with LLM boundary detection produces better retrieval than fixed-size chunking, but it's 3–5x slower.
- Entity consolidation (merging 'React', 'React.js', 'ReactJS' into one canonical entity) is harder than expected and still isn't perfect.
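The entity-consolidation difficulty above can be seen even in the easy, string-level part of the problem. The sketch below folds alias variants onto one canonical entity by name normalization; the rules and function names are invented, and as the lesson says, normalization alone doesn't catch everything (e.g. genuinely different names for the same thing):

```python
# Hypothetical sketch of the string-level part of entity consolidation:
# fold alias variants onto one canonical entity. The real pipeline uses
# LLM extraction; rules here are illustrative and deliberately narrow.
import re
from collections import defaultdict

def normalize(name):
    # Lowercase and drop punctuation, then strip a trailing "js" so
    # "React", "React.js", and "ReactJS" all collide on the same key.
    key = re.sub(r"[^a-z0-9]", "", name.lower())
    return re.sub(r"js$", "", key)

def consolidate(mentions):
    groups = defaultdict(list)
    for mention in mentions:
        groups[normalize(mention)].append(mention)
    # Pick the most frequent surface form in each group as canonical.
    return {max(group, key=group.count): group for group in groups.values()}

print(consolidate(["React", "React.js", "ReactJS", "React", "Neo4j"]))
# {'React': ['React', 'React.js', 'ReactJS', 'React'], 'Neo4j': ['Neo4j']}
```

Rules like the `js$` strip are exactly where this gets hard: the same rule that merges "ReactJS" would wrongly merge unrelated names ending in "js", which is why the step "still isn't perfect".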