amoOS
AI-powered personal operating system with RAG-based semantic search, distributed microservices on Cloudflare edge, and a Telegram bot for natural-language knowledge capture.
Architecture
- Cloudflare Pages (Next.js 16 + React 19) frontend, Cloudflare Workers gateway (Hono.js) with KV sessions, Railway FastAPI backend
- Three-layer memory model: personal KB (Layer 1), project working memory (Layer 2), episodic activity context (Layer 3)
- PostgreSQL + pgvector (384-dim BGE vectors) for similarity search, Neo4j for relationship traversal
- 10-step document ingestion: upload → load → semantic chunk → embed → LLM entity extraction → hash dedup → entity consolidation → graph sync → relationship detection → 3D visualization
- Tiered LLM strategy: GPT-4o for PRDs, gpt-4o-mini for chat, Groq llama-3.3-70b for planning, local embeddings
- Content-hash caching on LLM calls reduced costs ~60% by skipping unchanged documents
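The content-hash caching above can be sketched as a hash-keyed lookup in front of the LLM call. This is a minimal illustration, not the production code: the in-memory dict and the `llm_call` parameter are hypothetical stand-ins for the real cache store and entity-extraction call.

```python
import hashlib

# Hypothetical in-memory cache; production would use a persistent store.
_cache: dict[str, list[str]] = {}

def content_hash(text: str) -> str:
    """Stable key for a document's exact contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def extract_entities_cached(text: str, llm_call) -> list[str]:
    """Skip the LLM call entirely when an identical document was seen before."""
    key = content_hash(text)
    if key in _cache:
        return _cache[key]      # cache hit: zero LLM cost
    result = llm_call(text)     # cache miss: pay for one LLM call
    _cache[key] = result
    return result
```

Because the key is derived from the document bytes, any edit produces a new hash and forces re-extraction, while re-ingesting unchanged documents is free.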
Key Decisions
Cloudflare free tier + Railway (~$10/month) over AWS/GCP

Why: Keep infrastructure under $15/month for a personal tool
Tradeoff: Limited compute on free tier, but sufficient for single-user workload
Local BGE-small embeddings over OpenAI ada-002
Why: Eliminate per-request embedding costs entirely
Tradeoff: 384-dim vs 1536-dim — lower dimensionality, but retrieval quality is adequate for personal KB size
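At query time, retrieval over these embeddings reduces to a cosine-similarity scan. A toy sketch, assuming 4-dim stand-ins for real 384-dim BGE-small vectors (in production pgvector performs this comparison server-side with an index):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], docs: list[dict], k: int = 2) -> list[dict]:
    """Rank stored chunk vectors by similarity to the query vector."""
    return sorted(docs, key=lambda d: cosine(query, d["vec"]), reverse=True)[:k]
```

Lower dimensionality shrinks both index size and scan cost, which is part of why 384-dim vectors stay fast at personal-KB scale.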
Semantic chunking (800 tokens, 150 overlap) with LLM boundary detection
Why: Produces better retrieval results than naive fixed-size chunking
Tradeoff: 3–5x slower ingestion, but ingestion is a background task
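The window-and-overlap mechanics behind this chunking can be sketched as a sliding token window. This is the naive fallback only, under two stated assumptions: list items stand in for real tokenizer tokens, and the LLM boundary-detection step (which shifts chunk edges to semantic breaks) is omitted.

```python
def chunk_tokens(tokens: list[str], size: int = 800, overlap: int = 150) -> list[list[str]]:
    """Sliding window: each chunk starts `size - overlap` tokens after the previous one,
    so the last 150 tokens of one chunk reappear at the start of the next."""
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # final chunk reached the end of the document
    return chunks
```

The overlap keeps context that straddles a chunk boundary retrievable from either side; semantic boundary detection then trades ingestion speed for cleaner breaks.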
What I Learned
- SSE fits one-way LLM token streaming better than WebSocket: it passes through CDNs and proxies, the browser's EventSource reconnects automatically, and it is simpler to implement.
- pgvector is sufficient for under 100K chunks; Neo4j adds value for relationship traversal but not similarity search.
- SQLAlchemy PostgreSQL enums are case-sensitive, which caused subtle bugs during migration.