Convert unstructured data into structured knowledge. Zero data egress. 100% data sovereignty.
Process PDF, DOCX, PPTX, HTML, TXT, JSON and more through a single, robust AI worker pipeline.
Auto-sync Google Drive folders with incremental updates. Uses OAuth for secure access.
Hybrid retrieval that combines dense vectors with sparse SPLADE vectors via Qdrant for better accuracy.
100% self-hosted via Docker. You own your infrastructure. No external API calls for data processing.
Built with TypeScript and Python. Includes structured logging, Prometheus metrics, and rate limiting.
Automated analysis flags poor-quality content. An auto-fix pipeline merges short chunks and splits long ones.
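
To illustrate the merge-and-split idea behind the auto-fix pipeline, here is a minimal Python sketch. It is not the project's actual implementation: the function names (auto_fix, split_long), the min_len and max_len thresholds, and the use of character counts as the size measure are all assumptions made for this example. A real pipeline may well measure chunks in tokens and split on sentence boundaries instead.

```python
from typing import List


def split_long(text: str, max_len: int) -> List[str]:
    """Split text into pieces of at most max_len characters.

    Prefers to cut at the last space that fits; falls back to a hard
    cut when a single run of characters is longer than max_len.
    """
    pieces: List[str] = []
    rest = text.strip()
    while len(rest) > max_len:
        cut = rest.rfind(" ", 0, max_len + 1)
        if cut <= 0:
            cut = max_len  # no usable space: hard cut
        pieces.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        pieces.append(rest)
    return pieces


def auto_fix(chunks: List[str], min_len: int = 20, max_len: int = 80) -> List[str]:
    """Merge chunks shorter than min_len, then split chunks longer than max_len.

    Hypothetical thresholds, measured in characters. The final piece of
    a split may still be shorter than min_len; this sketch accepts that.
    """
    merged: List[str] = []
    buffer = ""
    for chunk in chunks:
        chunk = chunk.strip()
        if not chunk:
            continue  # drop empty chunks
        # Accumulate text until the buffer reaches the minimum length.
        buffer = f"{buffer} {chunk}" if buffer else chunk
        if len(buffer) >= min_len:
            merged.append(buffer)
            buffer = ""
    if buffer:
        # A short leftover is attached to the previous chunk if there is one.
        if merged:
            merged[-1] = f"{merged[-1]} {buffer}"
        else:
            merged.append(buffer)

    fixed: List[str] = []
    for chunk in merged:
        fixed.extend(split_long(chunk, max_len))
    return fixed


if __name__ == "__main__":
    sample = ["Hi.", "Short one.", "x" * 200]
    for piece in auto_fix(sample, min_len=10, max_len=80):
        print(len(piece), piece[:30])
```

Merging runs before splitting so that a short chunk absorbed into a neighbor is still subject to the maximum length check afterwards.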