Full sync - all projects, memory, configs
143
data/knowledge-builder-mvp.md
Normal file
# Knowledge Builder MVP: Product Spec

*Defined by D J, 2026-02-15 | Revised 2026-02-15 (v3: AgentZero self-processing + NotebookLM UI)*

## What It Is

A deployment tool that spins up pre-configured AgentZero containers and hands them source material to self-process. The Builder does NOT process data itself: it launches the expert and tells it what to learn. Each deployed agent includes a NotebookLM-style frontend for ongoing knowledge evolution.

## Architecture

### Layer 1: Builder UI (the "factory") - Lightweight Launcher
Our onboarding application. It collects config and deploys containers. It does NOT download or process any data.

**Onboarding Workflow:**
1. Create a new project (name, description, persona/domain)
2. Add data sources: paste YouTube URLs, upload files, add web pages
3. Configure AI backend: Claude API key, OpenAI key, or local llama.cpp/Ollama endpoint
4. Configure agent persona: name, system prompt, behavior guidelines
5. **Deploy:** the Builder spins up an AgentZero container and passes it the source list + config
6. AgentZero takes over: it downloads, processes, and learns from the sources autonomously
7. User gets a running expert agent with a link to access it

**Data sources supported (passed to AgentZero, NOT processed by Builder):**
- YouTube videos (single URL)
- YouTube channels (bulk)
- PDFs (uploaded files copied into container)
- Text files
- Web pages / URLs
- Any other parseable format

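
As a rough illustration of what the Builder hands over at deploy time, the sketch below models a source manifest as plain data. The spec does not define a manifest schema, so the field names (`kind`, `location`), the `SourceKind` labels, the example paths, and the JSON layout are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from enum import Enum


class SourceKind(str, Enum):
    # Hypothetical labels for the source types listed above.
    YOUTUBE_VIDEO = "youtube_video"
    YOUTUBE_CHANNEL = "youtube_channel"
    PDF = "pdf"
    TEXT = "text"
    WEB_PAGE = "web_page"


@dataclass
class Source:
    kind: SourceKind
    location: str  # a URL, or a file path inside the container (assumed layout)


@dataclass
class SourceManifest:
    project: str
    persona: str
    sources: list[Source] = field(default_factory=list)

    def to_json(self) -> str:
        # The manifest carries references only; no downloaded or processed data.
        return json.dumps(asdict(self), indent=2)


manifest = SourceManifest(
    project="demo-project",
    persona="Example domain expert",
    sources=[
        Source(SourceKind.YOUTUBE_VIDEO, "https://www.youtube.com/watch?v=EXAMPLE"),
        Source(SourceKind.PDF, "/data/uploads/handbook.pdf"),
    ],
)
print(manifest.to_json())
```

Keeping the manifest to references only matches the rule that the Builder never processes data itself; anything heavier belongs inside the container.
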
### Layer 2: AgentZero Container (the "expert") - Self-Processing
Each deployed container:
- **Receives source list from Builder** at deploy time
- **Downloads and processes its own data:** YouTube (yt-dlp), PDFs, web scraping, transcription
- **Builds its own knowledge base:** chunks, embeds, and stores content in its local RAG
- **Has full agent capabilities:** code execution, terminal, web search, tool creation
- **Knows how to consume all data formats:** the Builder ensures this via system prompt + pre-installed tools
- **Self-contained:** runs independently once deployed

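
The "chunks" step of building the knowledge base could look roughly like the sketch below. The spec does not say how text is split, so fixed-size character windows with overlap are an assumption, as are the function name and the default sizes. Embedding and storage are left out.

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows (assumed strategy)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")

    chunks: list[str] = []
    step = max_chars - overlap  # how far each window advances
    for start in range(0, len(text), step):
        piece = text[start:start + max_chars]
        if piece.strip():  # skip windows that are only whitespace
            chunks.append(piece)
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", max_chars=4, overlap=1))
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of some duplicated text in the index.
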
### Layer 3: NotebookLM-Style Frontend (inside container)
A separate web UI inside each container for ongoing knowledge management:
- **Source Manager:** add new sources (YouTube URLs, upload PDFs, paste text, web URLs)
- **Knowledge Browser:** see what the agent knows, browse indexed content, view source citations
- **Chat Interface:** ask questions, get answers grounded in the knowledge base
- **Audio Overview:** generate podcast-style summaries (stretch goal)
- **Processing Status:** see what's being ingested, progress, errors
- **Refinement Tools:** correct the agent, add context, mark important sections

This is how users continuously evolve their expert after initial deployment.

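
One possible shape for the data behind the Processing Status view is a small per-source state machine. The state names, the allowed transitions, and the retry rule below are assumptions; the spec only says the view shows what is being ingested, progress, and errors.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    DONE = "done"
    FAILED = "failed"


# Hypothetical transition table; FAILED -> QUEUED models a manual retry.
_ALLOWED = {
    Status.QUEUED: {Status.PROCESSING},
    Status.PROCESSING: {Status.DONE, Status.FAILED},
    Status.DONE: set(),
    Status.FAILED: {Status.QUEUED},
}


@dataclass
class SourceStatus:
    location: str
    status: Status = Status.QUEUED
    error: Optional[str] = None

    def advance(self, new_status: Status, error: Optional[str] = None) -> None:
        if new_status not in _ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        # Keep an error message only while the source is in the FAILED state.
        self.error = error if new_status is Status.FAILED else None


def summarize(items: list[SourceStatus]) -> dict[str, int]:
    """Count sources per state, e.g. for a dashboard header."""
    counts = {s.value: 0 for s in Status}
    for item in items:
        counts[item.status.value] += 1
    return counts
```
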
## Deployment Flow

```
1. Builder UI collects: source list + LLM config + persona
2. Builder spins up AgentZero container (docker run)
3. Builder injects into container:
   a. Source manifest (URLs, file paths), NOT processed data
   b. System prompt with persona + data processing instructions
   c. LLM config (API keys, endpoints)
   d. Pre-installed tools for: yt-dlp, PDF parsing, web scraping, Whisper transcription
   e. NotebookLM frontend (served on a separate port or path)
4. AgentZero boot sequence:
   a. Reads source manifest
   b. Downloads/processes each source autonomously
   c. Chunks and embeds into local RAG
   d. Reports status via NotebookLM frontend
5. Container is ready; user accesses:
   - AgentZero chat UI (port 50001)
   - NotebookLM knowledge manager (port 50002 or /notebook path)
```

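
Step 2 of the flow above can be sketched as a function that assembles the `docker run` argument list. The host ports 50001 and 50002 come from the flow; the image name, the container-side ports, the manifest mount path, and the environment variable names are placeholders assumed for illustration, not values the spec defines.

```python
def build_docker_run_args(
    name: str,
    image: str,
    env: dict[str, str],
    manifest_host_path: str,
    chat_port: int = 50001,
    notebook_port: int = 50002,
    container_chat_port: int = 80,         # assumption: depends on the image
    container_notebook_port: int = 3000,   # assumption: depends on the frontend
    manifest_container_path: str = "/config/manifest.json",  # hypothetical path
) -> list[str]:
    """Return an argv list for `docker run`; nothing is executed here."""
    args = ["docker", "run", "--detach", "--name", name]
    args += ["--publish", f"{chat_port}:{container_chat_port}"]
    args += ["--publish", f"{notebook_port}:{container_notebook_port}"]
    # Mount the manifest read-only so the container can read it on first boot.
    args += ["--volume", f"{manifest_host_path}:{manifest_container_path}:ro"]
    for key in sorted(env):  # sorted for a reproducible command line
        args += ["--env", f"{key}={env[key]}"]
    args.append(image)
    return args


args = build_docker_run_args(
    name="expert-1",
    image="example/agentzero-expert:latest",  # placeholder image name
    env={"LLM_PROVIDER": "ollama", "LLM_ENDPOINT": "http://localhost:11434"},
    manifest_host_path="/tmp/expert-1/manifest.json",
)
print(" ".join(args))
```

The list could then be passed to `subprocess.run(args, check=True)`. Passing API keys through `--env` leaves them visible in the process list and in `docker inspect`, so `--env-file` or a mounted secret may be a better fit for step 3c.
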
## What Builder Must Ensure at Deploy Time

The Builder is responsible for making sure each AgentZero instance can consume all data formats:
- **yt-dlp** installed in container for YouTube downloads
- **Whisper** available (local model or remote endpoint to GPU box)
- **PDF parser** installed (pdftotext, pdf-parse, or similar)
- **Web scraper** available (requests + beautifulsoup or similar)
- **Embedding model** configured (Ollama nomic-embed-text or similar)
- **Vector DB** running inside container (ChromaDB or similar)
- **System prompt** includes instructions for processing the source manifest on first boot
- **Processing tools** as AgentZero custom tools in python/tools/

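
Part of the checklist above could be verified with a small preflight check run inside the container. The sketch below only covers command-line binaries and assumes `yt-dlp` and `pdftotext` are the chosen tools; Whisper, the embedding model, and the vector DB are services or libraries and would need their own checks, which are not shown.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed tool choices, taken from the options named in the checklist.
REQUIRED_BINARIES = ("yt-dlp", "pdftotext")


def missing_binaries(
    required: Iterable[str] = REQUIRED_BINARIES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names in `required` that cannot be found on PATH."""
    return [name for name in required if which(name) is None]


missing = missing_binaries()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required binaries found")
```

The `which` parameter exists so the lookup can be replaced in tests without touching the real PATH.
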
## NotebookLM Frontend Requirements

**Tech Stack:** Next.js 15 + Tailwind v4 + ShadCN UI + Lucide + TypeScript (matches Builder)

**Pages/Features:**
1. **Dashboard:** overview of knowledge base (source count, chunk count, last updated)
2. **Sources:** list all sources, add new ones, see processing status, remove sources
3. **Chat:** conversational interface with source citations in responses
4. **Browse:** explore indexed content, search within knowledge base
5. **Settings:** LLM config, embedding model, persona settings

**Key Behaviors:**
- Adding a new source triggers AgentZero to process it (same pipeline as initial deploy)
- Chat responses include citations linking back to source chunks
- Processing is async, so the user sees real-time status updates

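
To make the citation behavior concrete, here is a minimal in-memory sketch of retrieving the closest chunks for a query and formatting citations back to their sources. In the real system the vector DB would perform the search; the `Chunk` fields, the cosine ranking, and the citation format are assumptions for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str            # where the chunk came from (URL or file name)
    index: int             # position of the chunk within its source
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_embedding: list[float], chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    """Return the k chunks most similar to the query, best first."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine(query_embedding, c.embedding),
        reverse=True,
    )
    return ranked[:k]


def format_citations(chunks: list[Chunk]) -> list[str]:
    return [f"[{n}] {c.source} (chunk {c.index})" for n, c in enumerate(chunks, start=1)]


# Toy two-dimensional embeddings, chosen by hand for the example.
chunks = [
    Chunk("handbook.pdf", 0, "Intro", [1.0, 0.0]),
    Chunk("handbook.pdf", 1, "Setup", [0.6, 0.8]),
    Chunk("notes.txt", 0, "Misc", [0.0, 1.0]),
]
hits = top_k([1.0, 0.0], chunks, k=2)
print(format_citations(hits))
```

Storing the source and chunk index next to each embedding is what lets a chat answer link back to the exact passage it drew on.
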
## Tech Stack
- **Builder UI:** Next.js 15 + Tailwind v4 + Framer Motion + ShadCN UI + Lucide + TypeScript
- **NotebookLM Frontend:** Same stack (bundled into container)
- **Generated Containers:** AgentZero (Python, Docker)
- **Transcription:** Faster Whisper (inside container or remote GPU endpoint)
- **Embeddings:** nomic-embed-text via Ollama (inside container or remote)
- **Vector DB:** ChromaDB (inside container)
- **Container Runtime:** Docker

## Infrastructure
- Builder runs on VM (192.168.86.45:3001)
- Docker on VM for spinning up AgentZero containers
- GPU box (192.168.86.40) available as remote Whisper endpoint
- Generated containers are portable: they run anywhere with Docker

## What We Already Have
- ✅ Builder UI scaffolded (Next.js 16, running at :3001, systemd service)
- ✅ Onboarding flow (project create, source input)
- ✅ Docker installed on VM
- ✅ AgentZero Docker image available
- ✅ yt-dlp working
- ✅ Faster Whisper on GPU box
- ✅ ChromaDB infrastructure
- ✅ Ollama + nomic-embed-text

## What Needs Building
- [ ] **Refactor Builder:** remove all processing logic, make it a pure launcher
- [ ] **Container deployment engine:** spin up AgentZero, inject config + source manifest
- [ ] **AgentZero boot processor:** custom tool that reads source manifest and processes on startup
- [ ] **Data format tools for AgentZero:** YouTube downloader, PDF parser, web scraper, transcription
- [ ] **NotebookLM frontend:** full knowledge management UI (sources, chat, browse, settings)
- [ ] **Bundle NotebookLM frontend into container:** served alongside AgentZero
- [ ] **Container management UI in Builder:** list running agents, start/stop, access links
- [ ] **Unit tests:** pipeline tools, API routes, deployment engine
- [ ] **E2E tests:** full flow: create project → add sources → deploy → verify agent processes data → chat works

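
The boot processor item could start from something like the sketch below: read the manifest, route each source to a handler by type, and record a per-source result. It assumes a JSON manifest with a `sources` list of `kind`/`location` entries (a hypothetical schema), and the handlers are stubs standing in for the real download, parsing, and transcription tools.

```python
import json
from typing import Callable

# A handler takes a source location and returns the extracted text.
Handler = Callable[[str], str]


def process_manifest(manifest_json: str, handlers: dict[str, Handler]) -> list[dict]:
    """Run each source through its handler and collect per-source results."""
    manifest = json.loads(manifest_json)
    results: list[dict] = []
    for source in manifest.get("sources", []):
        kind, location = source["kind"], source["location"]
        handler = handlers.get(kind)
        if handler is None:
            results.append({"location": location, "status": "failed",
                            "error": f"no handler for kind '{kind}'"})
            continue
        try:
            text = handler(location)
            results.append({"location": location, "status": "done",
                            "chars": len(text)})
        except Exception as exc:  # one bad source must not stop the others
            results.append({"location": location, "status": "failed",
                            "error": str(exc)})
    return results


def fake_pdf_handler(location: str) -> str:
    # Stub: a real tool would extract text from the PDF at `location`.
    return "extracted text"


example = json.dumps({"sources": [
    {"kind": "pdf", "location": "/data/uploads/handbook.pdf"},
    {"kind": "audio", "location": "/data/uploads/talk.mp3"},
]})
for result in process_manifest(example, {"pdf": fake_pdf_handler}):
    print(result)
```

Recording failures instead of raising keeps a single unreachable URL from blocking the rest of the ingest, and gives the status view something to display.
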
## Pricing Target
| Tier | Price | Includes |
|------|-------|----------|
| Free | $0 | 1 agent, 50 docs, local LLM only |
| Personal | $12/mo | 5 agents, 500 docs, cloud LLM |
| Pro | $25/mo | Unlimited agents, API access, priority processing |
| Self-Hosted | Free (open core) | Full feature set, BYOLLM |
| Enterprise | Custom | SSO, audit logs, dedicated support |