# Knowledge Builder MVP — Product Spec
*Defined by D J — 2026-02-15 | Revised 2026-02-15 (v3: AgentZero self-processing + NotebookLM UI)*
## What It Is
A deployment tool that spins up pre-configured AgentZero containers and hands them source material to self-process. The Builder does NOT process data itself — it launches the expert and tells it what to learn. Each deployed agent includes a NotebookLM-style frontend for ongoing knowledge evolution.
## Architecture
### Layer 1: Builder UI (the "factory") — Lightweight Launcher
Our onboarding application. Collects config and deploys containers. Does NOT download or process any data.
**Onboarding Workflow:**
1. Create a new project (name, description, persona/domain)
2. Add data sources — paste YouTube URLs, upload files, add web pages
3. Configure AI backend: Claude API key, OpenAI key, or local llama.cpp/Ollama endpoint
4. Configure agent persona: name, system prompt, behavior guidelines
5. **Deploy:** Spins up an AgentZero container, passes source list + config
6. AgentZero takes over — downloads, processes, and learns from the sources autonomously
7. User gets a running expert agent with a link to access it
**Data sources supported (passed to AgentZero, NOT processed by Builder):**
- YouTube videos (single URL)
- YouTube channels (bulk)
- PDFs (uploaded files copied into container)
- Text files
- Web pages / URLs
- Any other parseable format
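
A minimal sketch of what "passed to AgentZero, NOT processed by Builder" could look like in practice: the Builder only labels each source and serializes the list. The field names (`kind`, `location`), the `classify_source` helper, and the classification rules are illustrative assumptions, not part of this spec.

```python
import json
from dataclasses import dataclass, asdict
from urllib.parse import urlparse

# Hypothetical identifiers; the spec names the categories but not these strings.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


@dataclass
class Source:
    kind: str      # "youtube_video", "youtube_channel", "pdf", "text", "web_page", "other"
    location: str  # URL, or file path as it will appear inside the container


def classify_source(location: str) -> Source:
    """Guess a source kind from a URL or file name without fetching anything."""
    parsed = urlparse(location)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            is_video = (
                host == "youtu.be"
                or parsed.path == "/watch"
                or parsed.path.startswith("/shorts/")
            )
            # Anything else on YouTube is treated as a bulk (channel-style) source here.
            return Source("youtube_video" if is_video else "youtube_channel", location)
        return Source("web_page", location)
    lowered = location.lower()
    if lowered.endswith(".pdf"):
        return Source("pdf", location)
    if lowered.endswith((".txt", ".md")):
        return Source("text", location)
    return Source("other", location)


def build_manifest(project: str, locations: list[str]) -> str:
    """Serialize the source list as JSON; nothing is downloaded or parsed here."""
    sources = [asdict(classify_source(loc)) for loc in locations]
    return json.dumps({"project": project, "sources": sources}, indent=2)
```

The resulting JSON is the only thing the Builder would need to hand to the container; all fetching and parsing stays on the AgentZero side.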
### Layer 2: AgentZero Container (the "expert") — Self-Processing
Each deployed container:
- **Receives source list from Builder** at deploy time
- **Downloads and processes its own data** — YouTube (yt-dlp), PDFs, web scraping, transcription
- **Builds its own knowledge base** — chunks, embeds, stores in its local RAG
- **Has full agent capabilities** — code execution, terminal, web search, tool creation
- **Knows how to consume all data formats** — Builder ensures this via system prompt + pre-installed tools
- **Self-contained** — runs independently once deployed
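
The "chunks, embeds, stores" step is not specified further here. As one possible reading, the sketch below shows fixed-size word chunking with overlap; the `chunk_words` name and the default sizes are assumptions, and embedding plus storage in the vector DB are left out.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of up to chunk_size words; neighbours share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.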
### Layer 3: NotebookLM-Style Frontend (inside container)
A separate web UI inside each container for ongoing knowledge management:
- **Source Manager** — add new sources (YouTube URLs, upload PDFs, paste text, web URLs)
- **Knowledge Browser** — see what the agent knows, browse indexed content, view source citations
- **Chat Interface** — ask questions, get answers grounded in the knowledge base
- **Audio Overview** — generate podcast-style summaries (stretch goal)
- **Processing Status** — see what's being ingested, progress, errors
- **Refinement Tools** — correct the agent, add context, mark important sections
This is how users continuously evolve their expert after initial deployment.
## Deployment Flow
```
1. Builder UI collects: source list + LLM config + persona
2. Builder spins up AgentZero container (docker run)
3. Builder injects into container:
   a. Source manifest (URLs, file paths), NOT processed data
   b. System prompt with persona + data processing instructions
   c. LLM config (API keys, endpoints)
   d. Pre-installed tools for: yt-dlp, PDF parsing, web scraping, Whisper transcription
   e. NotebookLM frontend (served on a separate port or path)
4. AgentZero boot sequence:
   a. Reads source manifest
   b. Downloads/processes each source autonomously
   c. Chunks and embeds into local RAG
   d. Reports status via NotebookLM frontend
5. Container is ready; user accesses:
   - AgentZero chat UI (port 50001)
   - NotebookLM knowledge manager (port 50002 or /notebook path)
```
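
Steps 2 and 3 of the flow above could be driven by a small helper that assembles the `docker run` arguments. This is a sketch under assumptions: the container-side ports (80 and 3000), the `/config/sources.json` mount point, the environment variable names, and the image tag in the usage note are all hypothetical. Only the host ports 50001 and 50002 come from this spec.

```python
import shlex


def build_docker_run_args(
    name: str,
    image: str,
    manifest_path: str,
    env: dict[str, str],
    chat_port: int = 50001,
    notebook_port: int = 50002,
) -> list[str]:
    """Assemble a `docker run` argument list without executing it."""
    args = ["docker", "run", "-d", "--name", name]
    # Host ports follow the deployment flow; container ports 80 and 3000 are assumptions.
    args += ["-p", f"{chat_port}:80", "-p", f"{notebook_port}:3000"]
    # Mount the source manifest read-only; the target path is a hypothetical convention.
    args += ["-v", f"{manifest_path}:/config/sources.json:ro"]
    # LLM config and persona settings travel as environment variables in this sketch.
    for key, value in sorted(env.items()):
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args


def render_command(args: list[str]) -> str:
    """Return a shell-safe string for logging or a dry run."""
    return shlex.join(args)
```

A call such as `build_docker_run_args("expert-1", "example/agent-zero:latest", "/srv/p1/sources.json", {"LLM_PROVIDER": "ollama"})` yields a list that can be passed to `subprocess.run(args, check=True)`. Passing API keys with `-e` puts them on the command line, where they can show up in shell history and process listings; `--env-file` is one alternative.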
## What Builder Must Ensure at Deploy Time
The Builder is responsible for making sure each AgentZero instance can consume all data formats:
- **yt-dlp** installed in container for YouTube downloads
- **Whisper** available (local model, or a remote endpoint on the GPU box)
- **PDF parser** installed (pdftotext, pdf-parse, or similar)
- **Web scraper** available (requests + beautifulsoup or similar)
- **Embedding model** configured (Ollama nomic-embed-text or similar)
- **Vector DB** running inside container (ChromaDB or similar)
- **System prompt** includes instructions for processing the source manifest on first boot
- **Processing tools** installed as AgentZero custom tools in `python/tools/`
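
One way to verify part of this checklist is a preflight check that runs inside the container before the first boot. The sketch below covers only binaries and Python modules; the specific lists are assumptions drawn from the "or similar" options above, and checks for the embedding model, vector DB, and system prompt are not shown.

```python
import importlib.util
import shutil

# Assumed choices from the checklist; swap in whatever the image actually ships.
REQUIRED_BINARIES = ["yt-dlp", "pdftotext"]
REQUIRED_MODULES = ["requests", "bs4", "chromadb"]  # bs4 is beautifulsoup's import name


def missing_dependencies(binaries: list[str], modules: list[str]) -> list[str]:
    """Return labels for every binary not on PATH and every module that cannot be found."""
    missing = [f"binary:{name}" for name in binaries if shutil.which(name) is None]
    missing += [
        f"module:{name}"
        for name in modules
        if importlib.util.find_spec(name) is None  # top-level module names only
    ]
    return missing


if __name__ == "__main__":
    problems = missing_dependencies(REQUIRED_BINARIES, REQUIRED_MODULES)
    if problems:
        raise SystemExit("Container is not ready: " + ", ".join(problems))
    print("All checked dependencies are present.")
```

Failing fast here would surface a broken image at deploy time rather than midway through ingesting a source list.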
## NotebookLM Frontend Requirements
**Tech Stack:** Next.js 15 + Tailwind v4 + ShadCN UI + Lucide + TypeScript (matches Builder)
**Pages/Features:**
1. **Dashboard** — overview of knowledge base (source count, chunk count, last updated)
2. **Sources** — list all sources, add new ones, see processing status, remove sources
3. **Chat** — conversational interface with source citations in responses
4. **Browse** — explore indexed content, search within knowledge base
5. **Settings** — LLM config, embedding model, persona settings
**Key Behaviors:**
- Adding a new source triggers AgentZero to process it (same pipeline as initial deploy)
- Chat responses include citations linking back to source chunks
- Processing is async — user sees real-time status updates
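
The async behavior above implies some per-source status that the frontend can poll or stream. A minimal sketch, assuming a four-state lifecycle and an in-memory dict; the `Status` values, `process_sources`, and the `handler` callback are hypothetical names, and a real implementation would persist status and push updates to the UI.

```python
import asyncio
from enum import Enum
from typing import Awaitable, Callable


class Status(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    DONE = "done"
    FAILED = "failed"


async def process_sources(
    sources: list[str],
    handler: Callable[[str], Awaitable[None]],
    statuses: dict[str, Status],
) -> None:
    """Run handler on every source concurrently, recording status transitions."""
    for source in sources:
        statuses[source] = Status.QUEUED

    async def run_one(source: str) -> None:
        statuses[source] = Status.PROCESSING
        try:
            await handler(source)
        except Exception:
            # One bad source should not stop the rest of the batch.
            statuses[source] = Status.FAILED
        else:
            statuses[source] = Status.DONE

    await asyncio.gather(*(run_one(source) for source in sources))
```

Because the same function can serve both the initial manifest and sources added later, it fits the "same pipeline as initial deploy" behavior; the `statuses` dict is what a status page would read from.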
## Tech Stack
- **Builder UI:** Next.js 15 + Tailwind v4 + Framer Motion + ShadCN UI + Lucide + TypeScript
- **NotebookLM Frontend:** Same stack (bundled into container)
- **Generated Containers:** AgentZero (Python, Docker)
- **Transcription:** Faster Whisper (inside container or remote GPU endpoint)
- **Embeddings:** nomic-embed-text via Ollama (inside container or remote)
- **Vector DB:** ChromaDB (inside container)
- **Container Runtime:** Docker
## Infrastructure
- Builder runs on VM (192.168.86.45:3001)
- Docker on VM for spinning up AgentZero containers
- GPU box (192.168.86.40) available as remote Whisper endpoint
- Generated containers are portable — run anywhere with Docker
## What We Already Have
- ✅ Builder UI scaffolded (Next.js 16, running at :3001, systemd service)
- ✅ Onboarding flow (project create, source input)
- ✅ Docker installed on VM
- ✅ AgentZero Docker image available
- ✅ yt-dlp working
- ✅ Faster Whisper on GPU box
- ✅ ChromaDB infrastructure
- ✅ Ollama + nomic-embed-text
## What Needs Building
- [ ] **Refactor Builder** — remove all processing logic, make it a pure launcher
- [ ] **Container deployment engine** — spin up AgentZero, inject config + source manifest
- [ ] **AgentZero boot processor** — custom tool that reads source manifest and processes on startup
- [ ] **Data format tools for AgentZero** — YouTube downloader, PDF parser, web scraper, transcription
- [ ] **NotebookLM frontend** — full knowledge management UI (sources, chat, browse, settings)
- [ ] **Bundle NotebookLM frontend into container** — served alongside AgentZero
- [ ] **Container management UI in Builder** — list running agents, start/stop, access links
- [ ] **Unit tests** — pipeline tools, API routes, deployment engine
- [ ] **E2E tests** — full flow: create project → add sources → deploy → verify agent processes data → chat works
## Pricing Target
| Tier | Price | Includes |
|------|-------|----------|
| Free | $0 | 1 agent, 50 docs, local LLM only |
| Personal | $12/mo | 5 agents, 500 docs, cloud LLM |
| Pro | $25/mo | Unlimited agents, API access, priority processing |
| Self-Hosted | Free (open core) | Full feature set, bring your own LLM (BYOLLM) |
| Enterprise | Custom | SSO, audit logs, dedicated support |