Feed Hunter: deep scraper skill, pipeline, simulator, first investigation

- Built deep-scraper skill (CDP-based X feed extraction)
- Three-stage pipeline: scrape → triage → investigate
- Paper trading simulator with position tracking
- First live investigation: verified kch123 Polymarket profile ($9.3M P&L)
- Opened first paper position: Seahawks Super Bowl @ 68c
- Telegram alerts with inline action buttons
- Portal build in progress (night shift)
commit 8638500190 (parent b93228ddc2)
2026-02-07 23:58:40 -06:00
31 changed files with 7752 additions and 40 deletions

# 2026-02-07 — Server Recovery + Feed Hunter
## Server Recovery
- Server was down for 7 days (01-31 to 02-07); D J brought it back up and we recovered cleanly
- Time capsule from 01-31 opened on schedule
## Updates Applied
- OpenClaw updated to v2026.2.6-3
- Fixed Proxmox noVNC: disabled Wayland, switched to X11
- Enabled auto-login for wdjones in GDM
## New Infrastructure
- ChromaDB running on LXC at 192.168.86.25:8000; collection `openclaw-memory` uses cosine distance
- chromadb-memory plugin live with auto-recall; 9 documents indexed for semantic memory search
- Ollama at 192.168.86.137:11434 (qwen3:8b, qwen3:30b, glm-4.7-flash, nomic-embed-text)
- Google Chrome installed; headless browser pipeline verified
- Sub-agent spawning tested
## Browser Capability
- Installed Google Chrome for headless screenshots
- OpenClaw browser tool configured and working
- Can open URLs, take screenshots, and analyze them with vision
- D J wants this used to visually verify web projects before delivery
## Feed Hunter Project (NEW)
Built a full X/Twitter feed intelligence pipeline:
### Architecture
1. **Scrape** — CDP-based DOM extraction (not screenshots)
   - Chrome launched with `--remote-debugging-port=9222 --remote-allow-origins=*`
   - Must use a copied profile (chrome-debug) — Chrome refuses the debug port on the default profile path
   - Extracts: author, text, timestamp, metrics, links, media, cards, repost info
2. **Triage** — Pattern matching for verifiable claims
   - Performance claims, copy trading, arbitrage, prediction markets, price targets, airdrops
   - Priority scoring, investigation task generation
3. **Investigate** — Agent follows links, verifies claims
   - Uses browser tool to pull real data from Polymarket, exchanges, etc.
   - Generates verdicts: ACTIONABLE / EXPIRED / EXAGGERATED / SCAM / UNVERIFIABLE
4. **Alert** — Telegram notifications with inline action buttons
   - Simulate This / Backtest First / Skip
5. **Simulate** — Paper trading system
   - Virtual bankroll ($1000 default)
   - Position tracking, P&L, stop losses, take profits
   - Performance stats: win rate, ROI, by-strategy breakdown
## Key Decisions
- D J wants a local LLM (Qwen) as a Claude fallback for cost and insurance
- Ollama setup for Qwen still pending (model routing config)
- Browser visual QA is the standard workflow going forward
## X Feed Analysis Project
- D J wants automated analysis of X/Twitter posts about money-making (crypto, trading, Polymarket, arbitrage)
- Built x-feed-scraper.sh — scrolls the X feed via xdotool, takes screenshots with ImageMagick
- Pipeline: scrape → screenshot → vision analysis → categorize → verdict (valid/expired/spam/sensationalized)
- Sub-agents run analysis in parallel (2 batches of 4 pages)
- Test run found 2 relevant posts out of ~15: one sensationalized crypto hype, one paid stock promo
- Chrome must be launched with `--no-sandbox` on this VM
- X cookies are encrypted at rest — browser automation is the reliable free path
- D J's X account is logged in via desktop Chrome on the VM
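The triage stage is described only at the bullet level; a minimal sketch of pattern-matching triage with priority scoring follows. The patterns, weights, and threshold here are illustrative assumptions, not the pipeline's actual `config.json` values:

```python
import re

# Illustrative claim patterns -> (category, priority weight); the real
# triage rules live in the pipeline config and are not shown in these notes.
CLAIM_PATTERNS = [
    (re.compile(r"\$[\d,.]+[kKmM]?\s*(profit|p&l|pnl)", re.I), "performance_claim", 3),
    (re.compile(r"copy\s*trad", re.I), "copy_trading", 2),
    (re.compile(r"arbitrage|\barb\b", re.I), "arbitrage", 2),
    (re.compile(r"polymarket|prediction market", re.I), "prediction_market", 3),
    (re.compile(r"price target|\bPT\b", re.I), "price_target", 1),
    (re.compile(r"airdrop", re.I), "airdrop", 1),
]

def triage(post_text: str) -> dict:
    """Return matched claim categories and a priority score for one post."""
    categories, score = [], 0
    for pattern, category, weight in CLAIM_PATTERNS:
        if pattern.search(post_text):
            categories.append(category)
            score += weight
    # Threshold for spawning an investigation task is an assumption here.
    return {"categories": categories, "priority": score, "investigate": score >= 3}
```

A post like "kch123 made $9,300,000 profit on Polymarket" would match both the performance-claim and prediction-market patterns and clear the investigation threshold.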
## Infrastructure Notes
- `pkill chrome` kills the OpenClaw headless browser too — be careful, it causes gateway disruption
- Desktop Chrome and OpenClaw headless Chrome are separate instances
- xdotool installed for keyboard/mouse automation
- ImageMagick `import` works for full-screen captures
- Chrome user data dir: /home/wdjones/.config/google-chrome
### Files
- `skills/deep-scraper/` — scraping skill (SKILL.md + scripts)
- `projects/feed-hunter/` — project home
  - `run-pipeline.sh` — full pipeline orchestrator
  - `simulator.py` — paper trading CLI
  - `investigate.py` — investigation task generator
  - `config.json` — pipeline settings
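simulator.py itself isn't reproduced in these notes; below is a minimal sketch of the paper-trading bookkeeping described under step 5 (virtual bankroll, stop loss / take profit, win rate and ROI). Class names, the $1000 default, and the mark-to-close rule are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Position:
    market: str
    entry_price: float   # e.g. 0.68 = 68c per share
    shares: float
    stop_loss: float
    take_profit: float

@dataclass
class PaperBook:
    bankroll: float = 1000.0                    # virtual bankroll, $1000 default
    open_positions: list = field(default_factory=list)
    closed: list = field(default_factory=list)  # (market, pnl) tuples

    def open(self, market, price, stake, stop_loss, take_profit):
        """Deduct the stake and record a position at the given price."""
        self.bankroll -= stake
        self.open_positions.append(
            Position(market, price, stake / price, stop_loss, take_profit))

    def mark(self, market, price):
        """Close a position if price crosses its stop loss or take profit."""
        for pos in list(self.open_positions):
            if pos.market == market and (price <= pos.stop_loss or price >= pos.take_profit):
                proceeds = pos.shares * price
                self.bankroll += proceeds
                self.closed.append((market, proceeds - pos.shares * pos.entry_price))
                self.open_positions.remove(pos)

    def stats(self):
        """Win rate and ROI over closed positions (ROI vs initial bankroll,
        a simplification; a by-strategy breakdown would key on a tag per position)."""
        if not self.closed:
            return {"win_rate": 0.0, "roi": 0.0}
        wins = sum(1 for _, pnl in self.closed if pnl > 0)
        total_pnl = sum(pnl for _, pnl in self.closed)
        return {"win_rate": wins / len(self.closed), "roi": total_pnl / 1000.0}
```

Opening a $100 position at 68c and marking the market at 90c would close it via take profit and credit the proceeds back to the bankroll.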
### Key Discovery
- Chrome refuses `--remote-debugging-port` when `--user-data-dir` is the default profile path
- Solution: copy the profile to `~/.config/google-chrome-debug/` and launch from there
- `--remote-allow-origins=*` is needed for WebSocket CDP access
- Python needs the `-u` flag for unbuffered output in pipeline scripts
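The discovery above pins down the launch requirements exactly; sketched as a helper, using the port and paths from these notes (the function names are mine, and listing targets assumes Chrome is already running):

```python
import json
import shutil
import subprocess
import urllib.request
from pathlib import Path

HOME = Path.home()
DEFAULT_PROFILE = HOME / ".config/google-chrome"
DEBUG_PROFILE = HOME / ".config/google-chrome-debug"

def chrome_debug_cmd(port: int = 9222) -> list:
    """Build the launch command. Chrome refuses the debug port on the default
    profile path, so --user-data-dir must point at the copied profile."""
    return [
        "google-chrome",
        f"--remote-debugging-port={port}",
        "--remote-allow-origins=*",          # required for WebSocket CDP access
        f"--user-data-dir={DEBUG_PROFILE}",
    ]

def launch_debug_chrome(port: int = 9222) -> subprocess.Popen:
    """Copy the profile once, then launch Chrome with CDP enabled."""
    if not DEBUG_PROFILE.exists():
        shutil.copytree(DEFAULT_PROFILE, DEBUG_PROFILE)
    return subprocess.Popen(chrome_debug_cmd(port))

def list_cdp_targets(port: int = 9222) -> list:
    """List open pages via the DevTools HTTP endpoint (Chrome must be running)."""
    with urllib.request.urlopen(f"http://localhost:{port}/json") as resp:
        return json.load(resp)
```

Each entry returned by `/json` includes a `webSocketDebuggerUrl` that a CDP client can attach to for DOM extraction.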
### First Live Investigation
- @linie_oo claimed @kch123 has ~$10M Polymarket profit
- Verified on Polymarket: $9,371,829 all-time P&L ✅
- 1,862 predictions, $2.3M active positions
- Sent investigation alert to D J with action buttons
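For context on entries like the 68c Seahawks position: a binary prediction-market share pays $1 on a yes resolution, so the price is the implied probability and fixes the maximum return. A small worked example (the $100 stake is hypothetical):

```python
def market_position(price_cents: float, stake: float) -> dict:
    """Payoff profile for a binary prediction-market position bought at price_cents."""
    price = price_cents / 100.0
    shares = stake / price          # each share pays $1 if the market resolves yes
    return {
        "implied_probability": price,
        "shares": round(shares, 2),
        "payout_if_yes": round(shares, 2),        # $1 per share
        "max_return_pct": round((1 - price) / price * 100, 1),
    }

print(market_position(68, 100.0))
```

At 68c, $100 buys about 147 shares, so the maximum return if the market resolves yes is roughly 47%.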
### D J's Vision
- Scrape → investigate → verify → simulate → backtest → if viable, spawn working project
- Everything paper-traded first to prove it works
- Backtesting wherever historical data exists
- Web portal to present reports and implementation details
- D J headed to bed ~midnight, asked me to refine overnight + build portal
### Night Shift Plan
- Sub-agent building web portal at localhost:8888
- Refine triage patterns
- Add positions monitoring
- Portal shows: dashboard, feed view, investigations, sim tracker, pipeline status
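The Telegram alerts with inline action buttons (Simulate This / Backtest First / Skip) map directly onto the Bot API's `sendMessage` with an `inline_keyboard` reply markup. A sketch: the token and chat id are placeholders, and the `callback_data` values are my invention, not the pipeline's actual ones:

```python
import json
import urllib.request

def build_alert_payload(chat_id: str, text: str) -> dict:
    """sendMessage payload carrying the three pipeline action buttons."""
    buttons = [
        {"text": "Simulate This", "callback_data": "simulate"},
        {"text": "Backtest First", "callback_data": "backtest"},
        {"text": "Skip", "callback_data": "skip"},
    ]
    return {
        "chat_id": chat_id,
        "text": text,
        "reply_markup": {"inline_keyboard": [buttons]},  # one row of three buttons
    }

def send_alert(token: str, chat_id: str, text: str) -> None:
    """POST the alert to the Telegram Bot API (not invoked here)."""
    req = urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=json.dumps(build_alert_payload(chat_id, text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

Button presses arrive back as `callback_query` updates keyed on `callback_data`, which is how the pipeline can route Simulate / Backtest / Skip decisions.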