Knowledge Builder MVP — Product Spec
Defined by D J — 2026-02-15 | Revised 2026-02-15 (v3: AgentZero self-processing + NotebookLM UI)
What It Is
A deployment tool that spins up pre-configured AgentZero containers and hands them source material to self-process. The Builder does NOT process data itself — it launches the expert and tells it what to learn. Each deployed agent includes a NotebookLM-style frontend for ongoing knowledge evolution.
Architecture
Layer 1: Builder UI (the "factory") — Lightweight Launcher
Our onboarding application. Collects config and deploys containers. Does NOT download or process any data.
Onboarding Workflow:
- Create a new project (name, description, persona/domain)
- Add data sources — paste YouTube URLs, upload files, add web pages
- Configure AI backend: Claude API key, OpenAI key, or local llama.cpp/Ollama endpoint
- Configure agent persona: name, system prompt, behavior guidelines
- Deploy: Spins up an AgentZero container, passes source list + config
- AgentZero takes over — downloads, processes, and learns from the sources autonomously
- User gets a running expert agent with a link to access it
Data sources supported (passed to AgentZero, NOT processed by Builder):
- YouTube videos (single URL)
- YouTube channels (bulk)
- PDFs (uploaded files copied into container)
- Text files
- Web pages / URLs
- Any other parseable format
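Since the Builder only passes sources along, its main job here is to label each entry and write a manifest. A minimal sketch of that step, assuming a JSON manifest with a "sources" list of kind/location pairs; the kind labels, field names, and the `classify_source` and `build_manifest` helpers are hypothetical and not defined by this spec:

```python
import json
from dataclasses import asdict, dataclass
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hostnames treated as YouTube. This set is an assumption for illustration.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


@dataclass
class Source:
    kind: str      # hypothetical labels: youtube_video, youtube_channel, pdf, text, web, other
    location: str  # URL, or path of the uploaded file inside the container


def classify_source(location: str) -> Source:
    """Label a URL or file path without fetching or opening it."""
    parsed = urlparse(location)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            # Path-prefix heuristic for channel pages; anything else is a single video.
            is_channel = parsed.path.startswith(("/@", "/channel/", "/c/", "/user/"))
            return Source("youtube_channel" if is_channel else "youtube_video", location)
        if parsed.path.lower().endswith(".pdf"):
            return Source("pdf", location)
        return Source("web", location)
    # Not an http(s) URL: treat it as an uploaded file and go by extension.
    suffix = PurePosixPath(location).suffix.lower()
    if suffix == ".pdf":
        return Source("pdf", location)
    if suffix in (".txt", ".md"):
        return Source("text", location)
    return Source("other", location)


def build_manifest(project: str, locations: list[str]) -> str:
    """Serialize the labeled sources; no downloading or parsing happens here."""
    sources = [asdict(classify_source(loc)) for loc in locations]
    return json.dumps({"project": project, "sources": sources}, indent=2)
```

Calling `build_manifest("demo", ["https://youtu.be/abc", "/uploads/manual.pdf"])` would yield a manifest with one youtube_video entry and one pdf entry. Playlist URLs are not handled specially in this sketch.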
Layer 2: AgentZero Container (the "expert") — Self-Processing
Each deployed container:
- Receives source list from Builder at deploy time
- Downloads and processes its own data — YouTube (yt-dlp), PDFs, web scraping, transcription
- Builds its own knowledge base — chunks, embeds, stores in its local RAG
- Has full agent capabilities — code execution, terminal, web search, tool creation
- Knows how to consume all data formats — Builder ensures this via system prompt + pre-installed tools
- Self-contained — runs independently once deployed
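The self-processing behavior above could look roughly like the loop below. This is a sketch only, assuming the manifest is JSON with a "sources" list whose entries carry "kind" and "location" fields; the `process_manifest` name, the handler mapping, and the status strings are hypothetical. Real handlers would wrap yt-dlp, a PDF parser, a scraper, or Whisper.

```python
import json
from typing import Callable

# A handler takes a source location and returns extracted plain text.
Handler = Callable[[str], str]


def process_manifest(manifest_json: str, handlers: dict[str, Handler]) -> list[dict]:
    """Run each source through the handler for its kind and record a per-source status."""
    manifest = json.loads(manifest_json)
    results = []
    for source in manifest["sources"]:
        kind, location = source["kind"], source["location"]
        handler = handlers.get(kind)
        if handler is None:
            results.append({"location": location, "status": "skipped",
                            "detail": f"no handler for kind {kind!r}"})
            continue
        try:
            text = handler(location)
        except Exception as exc:  # one failing source should not stop the others
            results.append({"location": location, "status": "error", "detail": str(exc)})
            continue
        results.append({"location": location, "status": "done",
                        "detail": f"{len(text)} characters extracted"})
    return results
```

Keeping the handlers in a plain mapping means a new format can be supported by registering one more function, and the returned list is the kind of data a status view could display.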
Layer 3: NotebookLM-Style Frontend (inside container)
A separate web UI inside each container for ongoing knowledge management:
- Source Manager — add new sources (YouTube URLs, upload PDFs, paste text, web URLs)
- Knowledge Browser — see what the agent knows, browse indexed content, view source citations
- Chat Interface — ask questions, get answers grounded in the knowledge base
- Audio Overview — generate podcast-style summaries (stretch goal)
- Processing Status — see what's being ingested, progress, errors
- Refinement Tools — correct the agent, add context, mark important sections
This is how users continuously evolve their expert after initial deployment.
Deployment Flow
1. Builder UI collects: source list + LLM config + persona
2. Builder spins up AgentZero container (docker run)
3. Builder injects into container:
   a. Source manifest (URLs, file paths), NOT processed data
   b. System prompt with persona + data processing instructions
   c. LLM config (API keys, endpoints)
   d. Pre-installed tools: yt-dlp, PDF parsing, web scraping, Whisper transcription
   e. NotebookLM frontend (served on a separate port or path)
4. AgentZero boot sequence:
   a. Reads source manifest
   b. Downloads/processes each source autonomously
   c. Chunks and embeds into local RAG
   d. Reports status via NotebookLM frontend
5. Container is ready. User accesses:
   - AgentZero chat UI (port 50001)
   - NotebookLM knowledge manager (port 50002 or /notebook path)
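As an illustration of steps 2 and 3, the launch command could be assembled as an argument list and handed to `subprocess.run`. This is a sketch under several assumptions: the image name, the in-container ports (80 and 3000), the `/config/manifest.json` mount path, and the `build_docker_run` helper are all hypothetical. Only the host ports 50001 and 50002 come from this spec. The `docker run` flags used are standard ones.

```python
def build_docker_run(
    name: str,
    image: str,
    env_file: str,
    manifest_path: str,
    chat_port: int = 50001,             # host port for the AgentZero chat UI (from the spec)
    notebook_port: int = 50002,         # host port for the NotebookLM frontend (from the spec)
    container_chat_port: int = 80,      # assumption: port the chat UI listens on inside the container
    container_notebook_port: int = 3000,  # assumption: port the frontend listens on inside the container
) -> list[str]:
    """Return the argv for `docker run`; nothing is executed here."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--publish", f"{chat_port}:{container_chat_port}",
        "--publish", f"{notebook_port}:{container_notebook_port}",
        # LLM keys and endpoints go in an env file so they stay off the command line.
        "--env-file", env_file,
        # Bind-mount the manifest read-only; the host path must be absolute.
        "--volume", f"{manifest_path}:/config/manifest.json:ro",
        image,
    ]
```

Returning a list rather than a shell string avoids quoting problems with user-supplied project names. A caller would run it with `subprocess.run(cmd, check=True)`.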
What Builder Must Ensure at Deploy Time
The Builder is responsible for making sure each AgentZero instance can consume all data formats:
- yt-dlp installed in container for YouTube downloads
- Whisper available (local model or remote endpoint on the GPU box)
- PDF parser installed (pdftotext, pdf-parse, or similar)
- Web scraper available (requests + beautifulsoup or similar)
- Embedding model configured (Ollama nomic-embed-text or similar)
- Vector DB running inside container (ChromaDB or similar)
- System prompt includes instructions for processing the source manifest on first boot
- Processing tools installed as AgentZero custom tools in python/tools/
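One way to verify part of this checklist is a preflight check run inside the container before first boot. The sketch below only confirms that executables are on PATH and that Python modules can be found; the `preflight` helper is hypothetical, and the names in the usage note are the usual command and import names for the tools listed above, which may differ from what the image actually ships.

```python
import importlib.util
import shutil


def preflight(executables: list[str], modules: list[str]) -> list[str]:
    """Return a list of missing dependencies; an empty list means all were found."""
    missing = []
    for exe in executables:
        # shutil.which returns None when the command is not on PATH.
        if shutil.which(exe) is None:
            missing.append(f"executable: {exe}")
    for mod in modules:
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(mod) is None:
            missing.append(f"module: {mod}")
    return missing
```

A call such as `preflight(["yt-dlp", "pdftotext"], ["requests", "bs4", "chromadb", "faster_whisper"])` would cover the tools named in the list. It does not check that the embedding model or vector DB is actually reachable, which would need a separate health check.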
NotebookLM Frontend Requirements
Tech Stack: Next.js 15 + Tailwind v4 + ShadCN UI + Lucide + TypeScript (matches Builder)
Pages/Features:
- Dashboard — overview of knowledge base (source count, chunk count, last updated)
- Sources — list all sources, add new ones, see processing status, remove sources
- Chat — conversational interface with source citations in responses
- Browse — explore indexed content, search within knowledge base
- Settings — LLM config, embedding model, persona settings
Key Behaviors:
- Adding a new source triggers AgentZero to process it (same pipeline as initial deploy)
- Chat responses include citations linking back to source chunks
- Processing is async — user sees real-time status updates
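For chat citations to link back to source chunks, each chunk needs to remember where it came from. A minimal sketch, assuming fixed-size character windows with overlap; the `Chunk` fields, the default sizes, and the citation format are hypothetical choices, not something this spec prescribes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str  # identifier of the originating source, e.g. a file name or URL
    index: int   # position of the chunk within that source
    start: int   # character offset of the chunk in the extracted text
    text: str


def chunk_text(source: str, text: str, size: int = 800, overlap: int = 100) -> list[Chunk]:
    """Split text into overlapping windows, keeping the offset of each one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(source, index, start, text[start:start + size]))
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def cite(chunk: Chunk) -> str:
    """Render a short citation that the chat UI could turn into a link."""
    end = chunk.start + len(chunk.text)
    return f"[{chunk.source} #{chunk.index}, chars {chunk.start}-{end}]"
```

Storing `source`, `index`, and `start` as metadata next to each embedding would let the chat view resolve a retrieved chunk back to its exact place in the original material.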
Tech Stack
- Builder UI: Next.js 15 + Tailwind v4 + Framer Motion + ShadCN UI + Lucide + TypeScript
- NotebookLM Frontend: Same stack (bundled into container)
- Generated Containers: AgentZero (Python, Docker)
- Transcription: Faster Whisper (inside container or remote GPU endpoint)
- Embeddings: nomic-embed-text via Ollama (inside container or remote)
- Vector DB: ChromaDB (inside container)
- Container Runtime: Docker
Infrastructure
- Builder runs on VM (192.168.86.45:3001)
- Docker on VM for spinning up AgentZero containers
- GPU box (192.168.86.40) available as remote Whisper endpoint
- Generated containers are portable — run anywhere with Docker
What We Already Have
- ✅ Builder UI scaffolded (Next.js 16, running at :3001, systemd service)
- ✅ Onboarding flow (project create, source input)
- ✅ Docker installed on VM
- ✅ AgentZero Docker image available
- ✅ yt-dlp working
- ✅ Faster Whisper on GPU box
- ✅ ChromaDB infrastructure
- ✅ Ollama + nomic-embed-text
What Needs Building
- Refactor Builder — remove all processing logic, make it a pure launcher
- Container deployment engine — spin up AgentZero, inject config + source manifest
- AgentZero boot processor — custom tool that reads source manifest and processes on startup
- Data format tools for AgentZero — YouTube downloader, PDF parser, web scraper, transcription
- NotebookLM frontend — full knowledge management UI (sources, chat, browse, settings)
- Bundle NotebookLM frontend into container — served alongside AgentZero
- Container management UI in Builder — list running agents, start/stop, access links
- Unit tests — pipeline tools, API routes, deployment engine
- E2E tests — full flow: create project → add sources → deploy → verify agent processes data → chat works
Pricing Target
| Tier | Price | Includes |
|---|---|---|
| Free | $0 | 1 agent, 50 docs, local LLM only |
| Personal | $12/mo | 5 agents, 500 docs, cloud LLM |
| Pro | $25/mo | Unlimited agents, API access, priority processing |
| Self-Hosted | Free (open core) | Full feature set, BYOLLM |
| Enterprise | Custom | SSO, audit logs, dedicated support |