Feed Hunter: deep scraper skill, pipeline, simulator, first investigation
- Built deep-scraper skill (CDP-based X feed extraction)
- Three-stage pipeline: scrape → triage → investigate
- Paper trading simulator with position tracking
- First live investigation: verified kch123 Polymarket profile ($9.3M P&L)
- Opened first paper position: Seahawks Super Bowl @ 68c
- Telegram alerts with inline action buttons
- Portal build in progress (night shift)
# 2026-02-07 — Server Recovery + Feed Hunter

## Server Recovery

- Server was down for 7 days (01-31 to 02-07); D J got it back up and we recovered cleanly
- Time capsule from 01-31 opened on schedule
- Updated OpenClaw to v2026.2.6-3
- Fixed Proxmox noVNC: disabled Wayland, switched to X11
- Enabled auto-login for wdjones in GDM
## New Infrastructure

- ChromaDB running on LXC at 192.168.86.25:8000
- ChromaDB collection `openclaw-memory` with cosine distance
- chromadb-memory plugin live with auto-recall
- 9 documents indexed for semantic memory search
- Ollama at 192.168.86.137:11434 (qwen3:8b, qwen3:30b, glm-4.7-flash, nomic-embed-text)
- Sub-agent spawning tested
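For reference, the cosine distance the `openclaw-memory` collection is configured with is just 1 − cosine similarity (0 = identical direction, 2 = opposite); a minimal sketch:

```python
import math

def cosine_distance(a, b):
    """Cosine distance as used by the collection: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0 (identical)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```

Lower distance = closer semantic match, which is what auto-recall ranks on.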
## Browser Capability

- Installed Google Chrome for headless screenshots
- OpenClaw browser tool configured and working
- Can open URLs, screenshot, analyze with vision
- D J wants this used to visually verify web projects before delivery

## Feed Hunter Project (NEW)

Built a full X/Twitter feed intelligence pipeline:
### Architecture

1. **Scrape** — CDP-based DOM extraction (not screenshots)
   - Chrome launched with `--remote-debugging-port=9222 --remote-allow-origins=*`
   - Must use copied profile (chrome-debug) — Chrome refuses debug port on default profile path
   - Extracts: author, text, timestamp, metrics, links, media, cards, repost info

2. **Triage** — Pattern matching for verifiable claims
   - Performance claims, copy trading, arbitrage, prediction markets, price targets, airdrops
   - Priority scoring, investigation task generation

3. **Investigate** — Agent follows links, verifies claims
   - Uses browser tool to pull real data from Polymarket, exchanges, etc.
   - Generates verdicts: ACTIONABLE / EXPIRED / EXAGGERATED / SCAM / UNVERIFIABLE
4. **Alert** — Telegram notifications with inline action buttons
   - Simulate This / Backtest First / Skip

5. **Simulate** — Paper trading system
   - Virtual bankroll ($1000 default)
   - Position tracking, P&L, stop losses, take profits
   - Performance stats: win rate, ROI, by-strategy breakdown

## Key Decisions

- D J wants a local LLM (Qwen) as a Claude fallback for cost/insurance
- Ollama setup for Qwen still pending (model routing config)
- Browser visual QA is the standard workflow going forward

## X Feed Analysis Project

- D J wants automated analysis of X/Twitter posts about money-making (crypto, trading, Polymarket, arbitrage)
- Built x-feed-scraper.sh — scrolls the X feed via xdotool, takes screenshots with ImageMagick
- Pipeline: scrape → screenshot → vision analysis → categorize → verdict (valid/expired/spam/sensationalized)
- Sub-agents run analysis in parallel (2 batches of 4 pages)
- Test run found 2 relevant posts out of ~15: one sensationalized crypto hype, one paid stock promo
- Chrome must be launched with `--no-sandbox` on this VM
- X cookies are encrypted at rest — browser automation is the reliable free path
- D J's X handle: logged in via desktop Chrome on the VM
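A rough sketch of how the triage stage's pattern matching and priority scoring could work; the pattern table here is hypothetical (the real rules live in the pipeline config), but the categories mirror the ones listed above:

```python
import re

# Hypothetical pattern table: (category, regex, priority weight).
# Higher total weight = investigate sooner.
PATTERNS = [
    ("performance_claim", re.compile(r"\$[\d,.]+[kKmMbB]?\s*(?:profit|gains?|P&L)", re.I), 3),
    ("copy_trading",      re.compile(r"\bcopy[- ]?trad", re.I), 2),
    ("arbitrage",         re.compile(r"\barbitrage\b", re.I), 2),
    ("prediction_market", re.compile(r"\bpolymarket\b", re.I), 2),
    ("price_target",      re.compile(r"\bprice target\b", re.I), 1),
    ("airdrop",           re.compile(r"\bairdrop\b", re.I), 1),
]

def triage(text):
    """Return (matched categories, priority score) for one scraped post."""
    hits = [(name, weight) for name, rx, weight in PATTERNS if rx.search(text)]
    return [name for name, _ in hits], sum(w for _, w in hits)

cats, score = triage("@kch123 has ~$10M profit on Polymarket, copy trade him")
print(cats, score)  # three categories hit, score 7
```

Posts that score zero never generate an investigation task; everything else is queued in priority order.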
## Infrastructure Notes

- `pkill chrome` kills the OpenClaw headless browser too — be careful; it causes gateway disruption
- Desktop Chrome and OpenClaw headless Chrome are separate instances
- xdotool installed for keyboard/mouse automation
- ImageMagick `import` works for full-screen captures
- Chrome user data dir: /home/wdjones/.config/google-chrome
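Since `pkill chrome` takes down the OpenClaw headless browser too, matching on the full command line is safer; a Linux-only sketch (the `find_pids` helper is hypothetical):

```python
import os

def find_pids(needle):
    """Return PIDs whose command line contains `needle` (Linux /proc scan).

    Narrower than `pkill chrome`, which also hits the OpenClaw headless
    instance and disrupts the gateway."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or permission denied; skip it
        if needle in cmdline:
            pids.append(int(entry))
    return pids

# e.g. kill only the debug-profile Chrome, never the headless one:
# import signal
# for pid in find_pids("google-chrome-debug"):
#     os.kill(pid, signal.SIGTERM)
```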
### Files

- `skills/deep-scraper/` — scraping skill (SKILL.md + scripts)
- `projects/feed-hunter/` — project home
- `run-pipeline.sh` — full pipeline orchestrator
- `simulator.py` — paper trading CLI
- `investigate.py` — investigation task generator
- `config.json` — pipeline settings
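`simulator.py`'s internals aren't recorded here; a hypothetical sketch of the kind of bankroll and position tracking described (class name, fields, and settlement rule are assumptions):

```python
class PaperBook:
    """Hypothetical sketch of simulator.py-style paper trading state."""

    def __init__(self, bankroll=1000.0):   # $1000 virtual bankroll default
        self.start = bankroll
        self.bankroll = bankroll
        self.positions = []                # open positions
        self.closed = []                   # (position, pnl) history

    def open(self, market, price, stake, stop=None, take=None):
        if stake > self.bankroll:
            raise ValueError("stake exceeds bankroll")
        self.bankroll -= stake
        self.positions.append({"market": market, "price": price,
                               "stake": stake, "stop": stop, "take": take})

    def close(self, market, price):
        pos = next(p for p in self.positions if p["market"] == market)
        self.positions.remove(pos)
        proceeds = pos["stake"] / pos["price"] * price   # shares * exit price
        self.bankroll += proceeds
        pnl = proceeds - pos["stake"]
        self.closed.append((pos, pnl))
        return pnl

    def stats(self):
        wins = sum(1 for _, pnl in self.closed if pnl > 0)
        total = len(self.closed) or 1
        return {"win_rate": wins / total,
                "roi": sum(pnl for _, pnl in self.closed) / self.start}

book = PaperBook()
book.open("Seahawks Super Bowl YES", 0.68, 100)   # the 68c paper position
```

Under this sketch, closing that $100 stake at 85c would return a P&L of about $25.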
## Session Context

- This session is near compaction
- Major accomplishments today: server recovery, OpenClaw update, Proxmox VNC fix, ChromaDB memory, browser capability, X feed analysis pipeline
### Key Discovery

- Chrome refuses `--remote-debugging-port` when `--user-data-dir` is the default path
- Solution: copy the profile to `~/.config/google-chrome-debug/` and launch from there
- Need `--remote-allow-origins=*` for WebSocket CDP access
- Python needs the `-u` flag for unbuffered output in pipeline scripts
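The discovery above boils down to a launch recipe; a sketch that just builds the argv (the `debug_chrome_cmd` helper is mine, paths follow the notes):

```python
def debug_chrome_cmd(port=9222):
    """Build the argv for a CDP-debuggable Chrome.

    Chrome refuses --remote-debugging-port on the default profile path,
    so this assumes the profile was copied first, e.g.
        cp -r ~/.config/google-chrome ~/.config/google-chrome-debug
    """
    profile = "/home/wdjones/.config/google-chrome-debug"
    return [
        "google-chrome",
        f"--user-data-dir={profile}",       # copied profile, not the default
        f"--remote-debugging-port={port}",  # CDP endpoint
        "--remote-allow-origins=*",         # required for WebSocket CDP access
        "--no-sandbox",                     # required on this VM
    ]

cmd = debug_chrome_cmd()
# subprocess.Popen(cmd), then discover the ws:// endpoint via
# http://localhost:9222/json before attaching a CDP client
```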
### First Live Investigation

- @linie_oo claimed @kch123 has ~$10M Polymarket profit
- Verified on Polymarket: $9,371,829 all-time P&L ✅
- 1,862 predictions, $2.3M active positions
- Sent investigation alert to D J with action buttons
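The alert with action buttons maps onto the Telegram Bot API's `sendMessage` with an `inline_keyboard` reply markup; a payload sketch (the chat id and the commented send call are illustrative):

```python
def build_alert(chat_id, text):
    """sendMessage payload carrying the three inline action buttons."""
    return {
        "chat_id": chat_id,
        "text": text,
        "reply_markup": {
            "inline_keyboard": [[   # one row of three buttons
                {"text": "Simulate This",  "callback_data": "simulate"},
                {"text": "Backtest First", "callback_data": "backtest"},
                {"text": "Skip",           "callback_data": "skip"},
            ]]
        },
    }

payload = build_alert(12345, "VERIFIED: kch123 all-time P&L $9,371,829")
# POST the JSON-encoded payload to
# https://api.telegram.org/bot<TOKEN>/sendMessage to deliver it;
# button presses come back as callback_query updates.
```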
### D J's Vision

- Scrape → investigate → verify → simulate → backtest → if viable, spawn a working project
- Everything paper-traded first to prove it works
- Backtesting wherever historical data exists
- Web portal to present reports and implementation details
- D J headed to bed ~midnight, asked me to refine overnight + build the portal
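That scrape → investigate → verify → simulate chain can be sketched as stages that each pass or reject a candidate (toy stages; the real wiring is run-pipeline.sh):

```python
def run_pipeline(posts, stages):
    """Feed each candidate through the stages until one rejects it."""
    survivors = []
    for post in posts:
        item = post
        for stage in stages:
            item = stage(item)
            if item is None:        # stage rejected the candidate
                break
        else:
            survivors.append(item)  # passed every stage: paper-trade it
    return survivors

# Toy stand-ins for the triage and verification stages:
triage_stage = lambda p: p if "polymarket" in p.lower() else None
verify_stage = lambda p: p  # real stage would pull data via the browser tool

print(run_pipeline(["Polymarket whale up $9M", "random meme"],
                   [triage_stage, verify_stage]))
# → ['Polymarket whale up $9M']
```

Each stage can enrich the item as well as filter it, so the simulate stage sees everything triage and verification learned.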
### Night Shift Plan

- Sub-agent building web portal at localhost:8888
- Refine triage patterns
- Add positions monitoring
- Portal shows: dashboard, feed view, investigations, sim tracker, pipeline status