Knowledge Builder MVP — Product Spec

Defined by D J — 2026-02-15 | Revised 2026-02-15 (v3: AgentZero self-processing + NotebookLM UI)

What It Is

A deployment tool that spins up pre-configured AgentZero containers and hands them source material to self-process. The Builder does NOT process data itself — it launches the expert and tells it what to learn. Each deployed agent includes a NotebookLM-style frontend for ongoing knowledge evolution.

Architecture

Layer 1: Builder UI (the "factory") — Lightweight Launcher

Our onboarding application. Collects config and deploys containers. Does NOT download or process any data.

Onboarding Workflow:

  1. Create a new project (name, description, persona/domain)
  2. Add data sources — paste YouTube URLs, upload files, add web pages
  3. Configure AI backend: Claude API key, OpenAI key, or local llama.cpp/Ollama endpoint
  4. Configure agent persona: name, system prompt, behavior guidelines
  5. Deploy: Spins up an AgentZero container, passes source list + config
  6. AgentZero takes over — downloads, processes, and learns from the sources autonomously
  7. User gets a running expert agent with a link to access it

Data sources supported (passed to AgentZero, NOT processed by Builder):

  • YouTube videos (single URL)
  • YouTube channels (bulk)
  • PDFs (uploaded files copied into container)
  • Text files
  • Web pages / URLs
  • Any other format the agent's pre-installed tools can parse
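
The source list above is handed to the container as a manifest rather than processed data. A sketch of what the Builder might write at deploy time; the field names and paths are illustrative assumptions, not a defined schema:

```python
import json

# Hypothetical manifest the Builder injects into the container at deploy time.
# Field names, paths, and the endpoint are illustrative; no schema is fixed yet.
manifest = {
    "project": "homelab-expert",
    "persona": {"name": "Homelab Guide", "system_prompt": "You are ..."},
    "llm": {"provider": "ollama", "endpoint": "http://192.168.86.45:11434"},
    "sources": [
        {"type": "youtube", "url": "https://www.youtube.com/watch?v=..."},
        {"type": "youtube_channel", "url": "https://www.youtube.com/@..."},
        {"type": "pdf", "path": "/a0/uploads/manual.pdf"},
        {"type": "web", "url": "https://example.com/docs"},
    ],
}
print(json.dumps(manifest, indent=2))
```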

Layer 2: AgentZero Container (the "expert") — Self-Processing

Each deployed container:

  • Receives source list from Builder at deploy time
  • Downloads and processes its own data — YouTube (yt-dlp), PDFs, web scraping, transcription
  • Builds its own knowledge base — chunks, embeds, stores in its local RAG
  • Has full agent capabilities — code execution, terminal, web search, tool creation
  • Knows how to consume all data formats — Builder ensures this via system prompt + pre-installed tools
  • Self-contained — runs independently once deployed
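
The self-processing behavior above amounts to a small dispatcher: read the manifest, map each entry to one of the pre-installed CLIs, run it. A sketch under stated assumptions; the manifest path, field names, and output locations are hypothetical, and only the fetch step is shown (chunking/embedding follows separately):

```python
import json
import subprocess
from pathlib import Path

MANIFEST = Path("/a0/source_manifest.json")  # assumed injection path

def fetch_command(source: dict) -> list[str]:
    """Map one manifest entry to the shell command that fetches it."""
    kind = source["type"]
    if kind == "youtube":
        # Extract audio so Whisper can transcribe it downstream
        return ["yt-dlp", "-x", "--audio-format", "mp3", source["url"]]
    if kind == "pdf":
        # pdftotext writes a plain-text sibling next to the uploaded file
        return ["pdftotext", source["path"], source["path"] + ".txt"]
    if kind == "web":
        # Fetch raw HTML; a scraper pass (beautifulsoup or similar) cleans it later
        return ["curl", "-L", "-o", "page.html", source["url"]]
    raise ValueError(f"unsupported source type: {kind}")

def boot() -> None:
    """First-boot processor: fetch every source, then hand off to RAG ingestion."""
    for source in json.loads(MANIFEST.read_text())["sources"]:
        subprocess.run(fetch_command(source), check=True)
```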

Layer 3: NotebookLM-Style Frontend (inside container)

A separate web UI inside each container for ongoing knowledge management:

  • Source Manager — add new sources (YouTube URLs, upload PDFs, paste text, web URLs)
  • Knowledge Browser — see what the agent knows, browse indexed content, view source citations
  • Chat Interface — ask questions, get answers grounded in the knowledge base
  • Audio Overview — generate podcast-style summaries (stretch goal)
  • Processing Status — see what's being ingested, progress, errors
  • Refinement Tools — correct the agent, add context, mark important sections

This is how users continuously evolve their expert after initial deployment.

Deployment Flow

1. Builder UI collects: source list + LLM config + persona
2. Builder spins up AgentZero container (docker run)
3. Builder injects into container:
   a. Source manifest (URLs, file paths) — NOT processed data
   b. System prompt with persona + data processing instructions
   c. LLM config (API keys, endpoints)
   d. Pre-installed tools for: yt-dlp, PDF parsing, web scraping, Whisper transcription
   e. NotebookLM frontend (served on a separate port or path)
4. AgentZero boot sequence:
   a. Reads source manifest
   b. Downloads/processes each source autonomously
   c. Chunks and embeds into local RAG
   d. Reports status via NotebookLM frontend
5. Container is ready — user accesses:
   - AgentZero chat UI (port 50001)
   - NotebookLM knowledge manager (port 50002 or /notebook path)
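
Steps 2 and 3 of the flow can be sketched as a `docker run` builder. The internal container ports and mount path are assumptions (the spec only fixes the host-side ports 50001/50002), and the env-var names are whatever LLM config step 3c collects:

```python
def docker_run_cmd(name: str, image: str, manifest_dir: str, env: dict[str, str]) -> list[str]:
    """Build the `docker run` invocation for one expert container (steps 2-3)."""
    cmd = [
        "docker", "run", "-d", "--name", name,
        "-p", "50001:80",    # AgentZero chat UI (internal port assumed)
        "-p", "50002:3000",  # NotebookLM frontend (internal port assumed)
        "-v", f"{manifest_dir}:/a0/manifest:ro",  # source manifest, never processed data
    ]
    for key, value in env.items():  # LLM keys and endpoints from step 3c
        cmd += ["-e", f"{key}={value}"]
    return cmd + [image]
```

The Builder would then run this via `subprocess.run(docker_run_cmd(...), check=True)` and hand the user the two access links from step 5.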

What Builder Must Ensure at Deploy Time

The Builder is responsible for making sure each AgentZero instance can consume all data formats:

  • yt-dlp installed in container for YouTube downloads
  • Whisper available (local model or remote endpoint to GPU box)
  • PDF parser installed (pdftotext, pdf-parse, or similar)
  • Web scraper available (requests + beautifulsoup or similar)
  • Embedding model configured (Ollama nomic-embed-text or similar)
  • Vector DB running inside container (ChromaDB or similar)
  • System prompt includes instructions for processing the source manifest on first boot
  • Processing tools as AgentZero custom tools in python/tools/
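
One way to enforce this checklist is a deploy-time preflight that probes the container for each CLI before handing over the manifest. A sketch with an abridged tool list; Whisper, the embedding model, and the vector DB would need their own service-level probes:

```python
import subprocess

REQUIRED = ("yt-dlp", "pdftotext", "curl")  # abridged; not the full checklist above

def check_cmd(container: str, tool: str) -> list[str]:
    # `command -v` exits non-zero when the tool is absent from PATH
    return ["docker", "exec", container, "sh", "-lc", f"command -v {tool}"]

def missing_tools(container: str) -> list[str]:
    """Return the tools the container lacks; empty list means preflight passed."""
    return [
        tool for tool in REQUIRED
        if subprocess.run(check_cmd(container, tool), capture_output=True).returncode != 0
    ]
```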

NotebookLM Frontend Requirements

Tech Stack: Next.js 15 + Tailwind v4 + ShadCN UI + Lucide + TypeScript (matches Builder)

Pages/Features:

  1. Dashboard — overview of knowledge base (source count, chunk count, last updated)
  2. Sources — list all sources, add new ones, see processing status, remove sources
  3. Chat — conversational interface with source citations in responses
  4. Browse — explore indexed content, search within knowledge base
  5. Settings — LLM config, embedding model, persona settings

Key Behaviors:

  • Adding a new source triggers AgentZero to process it (same pipeline as initial deploy)
  • Chat responses include citations linking back to source chunks
  • Processing is async — user sees real-time status updates
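
The citation behavior implies every chunk must carry its source identity into the vector store. A stdlib-only sketch of that chunk-and-tag step; the sizes and ID scheme are assumptions, and actual storage would go through the container's ChromaDB collection:

```python
def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Fixed-size character chunks with overlap so retrieval keeps local context."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def to_records(source_id: str, text: str) -> list[dict]:
    """Attach source metadata so chat answers can cite the originating source."""
    return [
        {"id": f"{source_id}:{i}", "text": c, "metadata": {"source": source_id, "chunk": i}}
        for i, c in enumerate(chunk(text))
    ]
```

Because each record keeps `metadata["source"]`, a chat response can link its citations straight back to the Source Manager entry that produced the chunk.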

Tech Stack

  • Builder UI: Next.js 15 + Tailwind v4 + Framer Motion + ShadCN UI + Lucide + TypeScript
  • NotebookLM Frontend: Same stack (bundled into container)
  • Generated Containers: AgentZero (Python, Docker)
  • Transcription: Faster Whisper (inside container or remote GPU endpoint)
  • Embeddings: nomic-embed-text via Ollama (inside container or remote)
  • Vector DB: ChromaDB (inside container)
  • Container Runtime: Docker
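
For the embeddings entry, Ollama exposes nomic-embed-text over a REST endpoint. A stdlib sketch assuming Ollama's default port 11434; the host is an assumption and could equally be the remote GPU box:

```python
import json
import urllib.request

OLLAMA = "http://localhost:11434"  # assumed; inside the container or remote

def embed_request(text: str) -> urllib.request.Request:
    """Build the POST to Ollama's /api/embeddings endpoint for nomic-embed-text."""
    payload = json.dumps({"model": "nomic-embed-text", "prompt": text}).encode()
    return urllib.request.Request(
        f"{OLLAMA}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def embed(text: str) -> list[float]:
    """Return the embedding vector; store it in ChromaDB alongside the chunk."""
    with urllib.request.urlopen(embed_request(text)) as resp:
        return json.loads(resp.read())["embedding"]
```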

Infrastructure

  • Builder runs on VM (192.168.86.45:3001)
  • Docker on VM for spinning up AgentZero containers
  • GPU box (192.168.86.40) available as remote Whisper endpoint
  • Generated containers are portable — run anywhere with Docker

What We Already Have

  • Builder UI scaffolded (Next.js 16, running at :3001, systemd service)
  • Onboarding flow (project create, source input)
  • Docker installed on VM
  • AgentZero Docker image available
  • yt-dlp working
  • Faster Whisper on GPU box
  • ChromaDB infrastructure
  • Ollama + nomic-embed-text

What Needs Building

  • Refactor Builder — remove all processing logic, make it a pure launcher
  • Container deployment engine — spin up AgentZero, inject config + source manifest
  • AgentZero boot processor — custom tool that reads source manifest and processes on startup
  • Data format tools for AgentZero — YouTube downloader, PDF parser, web scraper, transcription
  • NotebookLM frontend — full knowledge management UI (sources, chat, browse, settings)
  • Bundle NotebookLM frontend into container — served alongside AgentZero
  • Container management UI in Builder — list running agents, start/stop, access links
  • Unit tests — pipeline tools, API routes, deployment engine
  • E2E tests — full flow: create project → add sources → deploy → verify agent processes data → chat works

Pricing Target

| Tier | Price | Includes |
| --- | --- | --- |
| Free | $0 | 1 agent, 50 docs, local LLM only |
| Personal | $12/mo | 5 agents, 500 docs, cloud LLM |
| Pro | $25/mo | Unlimited agents, API access, priority processing |
| Self-Hosted | Free (open core) | Full feature set, BYOLLM |
| Enterprise | Custom | SSO, audit logs, dedicated support |