Feed Hunter: deep scraper skill, pipeline, simulator, first investigation

- Built deep-scraper skill (CDP-based X feed extraction)
- Three-stage pipeline: scrape → triage → investigate
- Paper trading simulator with position tracking
- First live investigation: verified kch123 Polymarket profile ($9.3M P&L)
- Opened first paper position: Seahawks Super Bowl @ 68c
- Telegram alerts with inline action buttons
- Portal build in progress (night shift)
commit 8638500190 (parent b93228ddc2)
2026-02-07 23:58:40 -06:00
31 changed files with 7752 additions and 40 deletions

# 2026-02-07 — Server Recovery + Feed Hunter
## Server Recovery
- Server was down for 7 days (01-31 to 02-07); D J brought it back up and we recovered cleanly
- Time capsule from 01-31 opened on schedule
## Updates Applied
- OpenClaw updated to v2026.2.6-3
- Fixed Proxmox noVNC: disabled Wayland, switched to X11
- Enabled auto-login for wdjones in GDM
## New Infrastructure
- ChromaDB running on LXC at 192.168.86.25:8000; collection `openclaw-memory` uses cosine distance
- chromadb-memory plugin live with auto-recall; 9 documents indexed for semantic memory search
- Ollama at 192.168.86.137:11434 (qwen3:8b, qwen3:30b, glm-4.7-flash, nomic-embed-text)
- Google Chrome installed; headless browser pipeline verified
- Sub-agent spawning tested
## Browser Capability
- Installed Google Chrome for headless screenshots
- OpenClaw browser tool configured and working
- Can open URLs, take screenshots, and analyze them with vision
- D J wants this used to visually verify web projects before delivery
## Feed Hunter Project (NEW)
Built a full X/Twitter feed intelligence pipeline:
### Architecture
1. **Scrape** — CDP-based DOM extraction (not screenshots)
   - Chrome launched with `--remote-debugging-port=9222 --remote-allow-origins=*`
   - Must use a copied profile (chrome-debug) — Chrome refuses the debug port on the default profile path
   - Extracts: author, text, timestamp, metrics, links, media, cards, repost info
2. **Triage** — Pattern matching for verifiable claims
   - Performance claims, copy trading, arbitrage, prediction markets, price targets, airdrops
   - Priority scoring, investigation task generation
3. **Investigate** — Agent follows links, verifies claims
   - Uses browser tool to pull real data from Polymarket, exchanges, etc.
   - Generates verdicts: ACTIONABLE / EXPIRED / EXAGGERATED / SCAM / UNVERIFIABLE
4. **Alert** — Telegram notifications with inline action buttons
   - Simulate This / Backtest First / Skip
5. **Simulate** — Paper trading system
   - Virtual bankroll ($1000 default)
   - Position tracking, P&L, stop losses, take profits
   - Performance stats: win rate, ROI, by-strategy breakdown
## Key Decisions
- D J wants a local LLM (Qwen) as a Claude fallback for cost and insurance
- Ollama setup for Qwen still pending (model routing config)
- Browser visual QA is the standard workflow going forward
## X Feed Analysis Project
- D J wants automated analysis of X/Twitter posts about money-making (crypto, trading, Polymarket, arbitrage)
- Built x-feed-scraper.sh — scrolls the X feed via xdotool, takes screenshots with ImageMagick
- Pipeline: scrape → screenshot → vision analysis → categorize → verdict (valid/expired/spam/sensationalized)
- Sub-agents run analysis in parallel (2 batches of 4 pages)
- Test run found 2 relevant posts out of ~15: one sensationalized crypto hype, one paid stock promo
- Chrome must be launched with `--no-sandbox` on this VM
- X cookies are encrypted at rest — browser automation is the reliable free path
- D J's X account is logged in via desktop Chrome on the VM
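The triage stage is described only at the bullet level; a minimal sketch of pattern-matching triage with priority scoring follows. The patterns, weights, and threshold here are illustrative assumptions, not the pipeline's actual `config.json` values:

```python
import re

# Illustrative claim patterns -> (category, priority weight); the real
# triage rules live in the pipeline config and are not shown in these notes.
CLAIM_PATTERNS = [
    (re.compile(r"\$[\d,.]+[kKmM]?\s*(profit|p&l|pnl)", re.I), "performance_claim", 3),
    (re.compile(r"copy\s*trad", re.I), "copy_trading", 2),
    (re.compile(r"arbitrage|\barb\b", re.I), "arbitrage", 2),
    (re.compile(r"polymarket|prediction market", re.I), "prediction_market", 3),
    (re.compile(r"price target|\bPT\b", re.I), "price_target", 1),
    (re.compile(r"airdrop", re.I), "airdrop", 1),
]

def triage(post_text: str) -> dict:
    """Return matched claim categories and a priority score for one post."""
    categories, score = [], 0
    for pattern, category, weight in CLAIM_PATTERNS:
        if pattern.search(post_text):
            categories.append(category)
            score += weight
    # Threshold for spawning an investigation task is an assumption here.
    return {"categories": categories, "priority": score, "investigate": score >= 3}
```

A post like "kch123 made $9,300,000 profit on Polymarket" would match both the performance-claim and prediction-market patterns and clear the investigation threshold.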
## Infrastructure Notes
- `pkill chrome` kills the OpenClaw headless browser too — be careful, it causes gateway disruption
- Desktop Chrome and OpenClaw headless Chrome are separate instances
- xdotool installed for keyboard/mouse automation
- ImageMagick `import` works for full-screen captures
- Chrome user data dir: /home/wdjones/.config/google-chrome
### Files
- `skills/deep-scraper/` — scraping skill (SKILL.md + scripts)
- `projects/feed-hunter/` — project home
  - `run-pipeline.sh` — full pipeline orchestrator
  - `simulator.py` — paper trading CLI
  - `investigate.py` — investigation task generator
  - `config.json` — pipeline settings
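simulator.py itself isn't reproduced in these notes; below is a minimal sketch of the paper-trading bookkeeping described under step 5 (virtual bankroll, stop loss / take profit, win rate and ROI). Class names, the $1000 default, and the mark-to-close rule are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Position:
    market: str
    entry_price: float   # e.g. 0.68 = 68c per share
    shares: float
    stop_loss: float
    take_profit: float

@dataclass
class PaperBook:
    bankroll: float = 1000.0                    # virtual bankroll, $1000 default
    open_positions: list = field(default_factory=list)
    closed: list = field(default_factory=list)  # (market, pnl) tuples

    def open(self, market, price, stake, stop_loss, take_profit):
        """Deduct the stake and record a position at the given price."""
        self.bankroll -= stake
        self.open_positions.append(
            Position(market, price, stake / price, stop_loss, take_profit))

    def mark(self, market, price):
        """Close a position if price crosses its stop loss or take profit."""
        for pos in list(self.open_positions):
            if pos.market == market and (price <= pos.stop_loss or price >= pos.take_profit):
                proceeds = pos.shares * price
                self.bankroll += proceeds
                self.closed.append((market, proceeds - pos.shares * pos.entry_price))
                self.open_positions.remove(pos)

    def stats(self):
        """Win rate and ROI over closed positions (ROI vs initial bankroll,
        a simplification; a by-strategy breakdown would key on a tag per position)."""
        if not self.closed:
            return {"win_rate": 0.0, "roi": 0.0}
        wins = sum(1 for _, pnl in self.closed if pnl > 0)
        total_pnl = sum(pnl for _, pnl in self.closed)
        return {"win_rate": wins / len(self.closed), "roi": total_pnl / 1000.0}
```

Opening a $100 position at 68c and marking the market at 90c would close it via take profit and credit the proceeds back to the bankroll.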
### Key Discovery
- Chrome refuses `--remote-debugging-port` when `--user-data-dir` is the default profile path
- Solution: copy the profile to `~/.config/google-chrome-debug/` and launch from there
- `--remote-allow-origins=*` is needed for WebSocket CDP access
- Python needs the `-u` flag for unbuffered output in pipeline scripts
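The discovery above pins down the launch requirements exactly; sketched as a helper, using the port and paths from these notes (the function names are mine, and listing targets assumes Chrome is already running):

```python
import json
import shutil
import subprocess
import urllib.request
from pathlib import Path

HOME = Path.home()
DEFAULT_PROFILE = HOME / ".config/google-chrome"
DEBUG_PROFILE = HOME / ".config/google-chrome-debug"

def chrome_debug_cmd(port: int = 9222) -> list:
    """Build the launch command. Chrome refuses the debug port on the default
    profile path, so --user-data-dir must point at the copied profile."""
    return [
        "google-chrome",
        f"--remote-debugging-port={port}",
        "--remote-allow-origins=*",          # required for WebSocket CDP access
        f"--user-data-dir={DEBUG_PROFILE}",
    ]

def launch_debug_chrome(port: int = 9222) -> subprocess.Popen:
    """Copy the profile once, then launch Chrome with CDP enabled."""
    if not DEBUG_PROFILE.exists():
        shutil.copytree(DEFAULT_PROFILE, DEBUG_PROFILE)
    return subprocess.Popen(chrome_debug_cmd(port))

def list_cdp_targets(port: int = 9222) -> list:
    """List open pages via the DevTools HTTP endpoint (Chrome must be running)."""
    with urllib.request.urlopen(f"http://localhost:{port}/json") as resp:
        return json.load(resp)
```

Each entry returned by `/json` includes a `webSocketDebuggerUrl` that a CDP client can attach to for DOM extraction.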
### First Live Investigation
- @linie_oo claimed @kch123 has ~$10M Polymarket profit
- Verified on Polymarket: $9,371,829 all-time P&L ✅
- 1,862 predictions, $2.3M active positions
- Sent investigation alert to D J with action buttons
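For context on entries like the 68c Seahawks position: a binary prediction-market share pays $1 on a yes resolution, so the price is the implied probability and fixes the maximum return. A small worked example (the $100 stake is hypothetical):

```python
def market_position(price_cents: float, stake: float) -> dict:
    """Payoff profile for a binary prediction-market position bought at price_cents."""
    price = price_cents / 100.0
    shares = stake / price          # each share pays $1 if the market resolves yes
    return {
        "implied_probability": price,
        "shares": round(shares, 2),
        "payout_if_yes": round(shares, 2),        # $1 per share
        "max_return_pct": round((1 - price) / price * 100, 1),
    }

print(market_position(68, 100.0))
```

At 68c, $100 buys about 147 shares, so the maximum return if the market resolves yes is roughly 47%.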
### D J's Vision
- Scrape → investigate → verify → simulate → backtest → if viable, spawn working project
- Everything paper-traded first to prove it works
- Backtesting wherever historical data exists
- Web portal to present reports and implementation details
- D J headed to bed ~midnight, asked me to refine overnight + build portal
### Night Shift Plan
- Sub-agent building web portal at localhost:8888
- Refine triage patterns
- Add positions monitoring
- Portal shows: dashboard, feed view, investigations, sim tracker, pipeline status
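The Telegram alerts with inline action buttons (Simulate This / Backtest First / Skip) map directly onto the Bot API's `sendMessage` with an `inline_keyboard` reply markup. A sketch: the token and chat id are placeholders, and the `callback_data` values are my invention, not the pipeline's actual ones:

```python
import json
import urllib.request

def build_alert_payload(chat_id: str, text: str) -> dict:
    """sendMessage payload carrying the three pipeline action buttons."""
    buttons = [
        {"text": "Simulate This", "callback_data": "simulate"},
        {"text": "Backtest First", "callback_data": "backtest"},
        {"text": "Skip", "callback_data": "skip"},
    ]
    return {
        "chat_id": chat_id,
        "text": text,
        "reply_markup": {"inline_keyboard": [buttons]},  # one row of three buttons
    }

def send_alert(token: str, chat_id: str, text: str) -> None:
    """POST the alert to the Telegram Bot API (not invoked here)."""
    req = urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=json.dumps(build_alert_payload(chat_id, text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

Button presses arrive back as `callback_query` updates keyed on `callback_data`, which is how the pipeline can route Simulate / Backtest / Skip decisions.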