Feed Hunter: deep scraper skill, pipeline, simulator, first investigation

- Built deep-scraper skill (CDP-based X feed extraction)
- Three-stage pipeline: scrape → triage → investigate
- Paper trading simulator with position tracking
- First live investigation: verified kch123 Polymarket profile ($9.3M P&L)
- Opened first paper position: Seahawks Super Bowl @ 68c
- Telegram alerts with inline action buttons
- Portal build in progress (night shift)
2026-02-07 23:58:40 -06:00
parent b93228ddc2
commit 8638500190
31 changed files with 7752 additions and 40 deletions


@ -0,0 +1,53 @@
# Feed Hunter

Automated X/Twitter feed intelligence pipeline. Scrapes → triages → investigates → simulates.

## Architecture

```
Scrape (CDP) → Triage (claims/links) → Investigate (agent) → Alert (Telegram)
                                              ↓
                                     Simulate / Backtest
                                              ↓
                                        Spawn Project
```
## Pipeline Stages

1. **Scrape** — Extract structured posts from the X feed via Chrome CDP
2. **Triage** — Identify verifiable claims with actionable links
3. **Investigate** — Agent follows links, verifies claims with real data
4. **Alert** — Telegram notification with findings + inline action buttons
5. **Simulate** — Paper trade the strategy, track P&L without real money
6. **Backtest** — Where historical data exists, test against past performance
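The triage stage lives in `triage-posts.py` (part of the deep-scraper skill, not shown in this commit). A minimal sketch of the claim/link detection it implies — the pattern names and regexes here are illustrative assumptions, not the real implementation:

```python
import re

# Hypothetical patterns — the real triage-posts.py may use different ones.
CLAIM_PATTERNS = {
    "profit": re.compile(r"\$[\d,.]+[kKmM]?\s*(?:in\s+)?(?:profit|P&L|pnl)", re.I),
    "returns": re.compile(r"\b\d{2,}%\b"),
}
LINK_PATTERNS = {
    "polymarket": re.compile(r"https?://polymarket\.com/\S+"),
    "x_post": re.compile(r"https?://x\.com/\S+"),
}

def triage(post_text):
    """Return detected claims and links; a post with both is an investigation candidate."""
    claims = []
    for ctype, pat in CLAIM_PATTERNS.items():
        m = pat.search(post_text)
        if m:
            claims.append({"type": ctype, "match": m.group(0)})
    links = []
    for ltype, pat in LINK_PATTERNS.items():
        m = pat.search(post_text)
        if m:
            links.append({"type": ltype, "url": m.group(0)})
    return claims, links
```

The kch123 post would hit on both a profit claim and a Polymarket link, which is what promotes a post into the investigation queue.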
## Simulation System

Every viable strategy gets a simulated portfolio entry:

- Virtual bankroll (configurable, default $1000)
- Paper positions tracked in `data/simulations/`
- Daily P&L snapshots
- Performance metrics: win rate, ROI, Sharpe ratio, max drawdown
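Win rate and ROI fall out of the closed-trade log directly (the simulator's `stats` command computes them); Sharpe ratio and max drawdown need the daily P&L snapshots and aren't implemented yet. A sketch of those two computations, assuming the snapshots reduce to a list of daily P&L values in dollars:

```python
import math

def sharpe_and_max_drawdown(daily_pnl, risk_free=0.0):
    """daily_pnl: list of daily P&L values in dollars, oldest first."""
    n = len(daily_pnl)
    mean = sum(daily_pnl) / n
    var = sum((x - mean) ** 2 for x in daily_pnl) / n
    std = math.sqrt(var)
    # Annualized Sharpe: daily mean/std scaled by sqrt(365) for markets that trade daily.
    sharpe = (mean - risk_free) / std * math.sqrt(365) if std else 0.0
    # Max drawdown: largest peak-to-trough fall on the cumulative equity curve.
    equity, peak, max_dd = 0.0, 0.0, 0.0
    for x in daily_pnl:
        equity += x
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
    return sharpe, max_dd
```

Dollar-based drawdown keeps the metric comparable to the fixed $1000 bankroll; a percentage variant would divide by the running peak.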
## Project Spawning

When a strategy passes simulation thresholds:

- Auto-scaffold in `projects/<strategy-name>/`
- Working bot code
- Risk parameters
- Go/no-go recommendation
## Schedule

- Feed scrape: Every 2 hours during market hours (8am-10pm CST)
- Investigation: Triggered by triage hits
- Simulation updates: Hourly for active positions
- Daily digest: 9am CST summary of all active simulations
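The schedule above is driven by the `scrape_hours` and `timezone` fields in `config.json`. A minimal sketch of the hour gate a cron wrapper could check before launching the pipeline (the config path is an assumption; `zoneinfo` requires Python 3.9+):

```python
import json
from datetime import datetime
from zoneinfo import ZoneInfo

def should_scrape_now(config_path="config.json"):
    """True if the current hour, in the configured timezone, is a scrape hour."""
    with open(config_path) as f:
        sched = json.load(f)["schedule"]
    now = datetime.now(ZoneInfo(sched["timezone"]))
    return now.hour in sched["scrape_hours"]
```

An hourly cron entry plus this gate reproduces the 8am-10pm every-2-hours cadence without hardcoding hours in crontab.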
## Files

- `config.json` — pipeline settings, thresholds, bankroll
- `data/simulations/` — active paper positions
- `data/backtests/` — historical backtest results
- `data/investigations/` — investigation logs per post
- `data/alerts/` — alert history


@ -0,0 +1,29 @@
{
  "pipeline": {
    "scrape_pages": 8,
    "scrape_port": 9222,
    "triage_min_priority": 2,
    "investigate_max_per_run": 5
  },
  "simulation": {
    "default_bankroll": 1000,
    "max_position_pct": 0.20,
    "stop_loss_pct": 0.10,
    "currency": "USD"
  },
  "alerts": {
    "channel": "telegram",
    "min_signal_score": 0.3,
    "notify_on_investigation": true,
    "notify_on_sim_entry": true,
    "daily_digest_hour": 9
  },
  "schedule": {
    "scrape_hours": [8, 10, 12, 14, 16, 18, 20, 22],
    "timezone": "America/Chicago"
  },
  "backtest": {
    "lookback_days": 30,
    "min_trades_for_confidence": 10
  }
}
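With these defaults, the cap the simulator's `cmd_open` applies works out to $1000 × 0.20 = $200 per position. A sketch of that clamp, with the config inlined for illustration:

```python
# Defaults from the simulation block of config.json.
config = {
    "default_bankroll": 1000,
    "max_position_pct": 0.20,
    "stop_loss_pct": 0.10,
}

def cap_position(requested_size, config):
    """Clamp a requested position to the bankroll cap, as cmd_open does."""
    max_pos = config["default_bankroll"] * config["max_position_pct"]  # $200 here
    return min(requested_size, max_pos)
```

Note the cap is against the full default bankroll, not against bankroll minus `bankroll_used`, so several concurrent $200 positions are allowed.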


@ -0,0 +1,50 @@
{
  "id": "inv-20260208-kch123",
  "source_post": {
    "author": "@linie_oo",
    "url": "https://x.com/linie_oo/status/2020141674828034243",
    "claim": "polymarket trader who is almost $10,000,000 in profit from sports betting"
  },
  "investigation": {
    "profile_url": "https://polymarket.com/@kch123",
    "verified_data": {
      "all_time_pnl": "$9,371,829.00",
      "positions_value": "$2.3m",
      "biggest_win": "$1.1m",
      "total_predictions": 1862,
      "past_month_pnl": "$3,308,983.50",
      "joined": "Jun 2025",
      "profile_views": "580.4k"
    },
    "claim_vs_reality": {
      "claimed_profit": "$9,300,000",
      "actual_profit": "$9,371,829",
      "accuracy": "VERIFIED — actually slightly understated",
      "claimed_half_this_month": "$3,308,983.50 past month = ~35% of total (not exactly half but close)",
      "active_positions": "$2.3m (post claimed $2.4m — close enough)"
    },
    "risk_assessment": {
      "score": 7,
      "notes": [
        "Account is real and profitable — verified on-chain via Polymarket",
        "Past performance doesn't guarantee future results",
        "Sports betting has high variance — $9M profit could swing hard",
        "Copy-trading lag: by the time you see + copy, odds may have moved",
        "1,862 predictions suggests systematic approach, not luck",
        "Concentration risk: $2.3M in active positions is aggressive"
      ]
    },
    "verdict": "VERIFIED — profile is real, numbers check out",
    "actionable": true,
    "strategy_notes": "Could build a copy-bot that monitors kch123's positions via Polymarket API and mirrors trades with configurable delay/sizing. Need to backtest: what would returns look like if you copied with a 5-min/30-min/1-hr delay? Slippage matters."
  },
  "suggested_simulation": {
    "strategy": "polymarket-copy-kch123",
    "type": "bet",
    "approach": "Mirror kch123 active positions via Polymarket API",
    "bankroll": 1000,
    "max_position": 200,
    "backtest_needed": true,
    "backtest_plan": "Pull kch123 historical trades via API, simulate copying with various delays, measure P&L impact of timing lag"
  }
}
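The `backtest_plan` above can be sketched as follows. The trade-record shape and the `price_at` historical price lookup are assumptions, since the actual Polymarket API response format isn't in this repo:

```python
def copy_with_delay(trades, price_at, delay_s):
    """Simulate mirroring each trade `delay_s` seconds late.

    trades:   [{"market": str, "ts": float, "price": float, "size": float}]
    price_at: callable (market, ts) -> market price at that time,
              backed by whatever historical data the backtest pulls.
    Returns the total extra cost (slippage) paid versus the original entries.
    """
    slippage = 0.0
    for t in trades:
        delayed_price = price_at(t["market"], t["ts"] + delay_s)
        shares = t["size"] / t["price"]
        # Positive when the delayed copier pays more than the original entry.
        slippage += (delayed_price - t["price"]) * shares
    return slippage
```

Running this at 5-min/30-min/1-hr delays over the same trade log would quantify exactly the timing-lag risk flagged in the risk assessment.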


@ -0,0 +1,24 @@
{
  "positions": [
    {
      "id": "6607b9c1",
      "strategy": "polymarket-copy-kch123",
      "opened_at": "2026-02-08T05:50:14.328434+00:00",
      "type": "bet",
      "asset": "Seahawks win Super Bowl 2026",
      "entry_price": 0.68,
      "size": 200,
      "quantity": 1470,
      "stop_loss": 0.4,
      "take_profit": 1.0,
      "current_price": 0.68,
      "unrealized_pnl": 0,
      "unrealized_pnl_pct": 0,
      "source_post": "https://x.com/linie_oo/status/2020141674828034243",
      "thesis": "Mirror kch123 largest active position. Seahawks Super Bowl at 68c. If they win, pays $1. kch123 has $9.3M all-time P&L, 1862 predictions. Sports betting specialist.",
      "notes": "Paper trade to track if copying kch123 positions is profitable. Entry simulated at current 68c price.",
      "updates": []
    }
  ],
  "bankroll_used": 200
}
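For the position above, the simulator's bet formula (`pnl = size * (exit_price - entry_price) / entry_price`, with exit 1 on a win and 0 on a loss) implies this payoff profile at the 68c entry:

```python
entry, size = 0.68, 200.0
shares = size / entry                    # ≈ 294 shares at $0.68; note the recorded
                                         # quantity of 1470 corresponds to a $1000
                                         # stake, not the $200 size
win_pnl = size * (1.0 - entry) / entry   # win: roughly +$94 (+47%)
lose_pnl = size * (0.0 - entry) / entry  # lose: -$200, the full stake
```

At 68c the market implies a 68% win probability, so the paper trade is a bet that kch123's edge makes the true probability higher than that.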


@ -0,0 +1,118 @@
#!/usr/bin/env python3
"""
Investigation report generator for Feed Hunter.

Reads triage.json and produces investigation tasks as structured prompts
for the agent to execute.

Usage:
    python3 investigate.py <triage.json> [--output investigations/]

This doesn't do the investigation itself — it generates the task list
that the agent (Case) follows using browser/web tools.
"""
import argparse
import json
import os
from datetime import datetime, timezone


def generate_investigation_prompt(post):
    """Generate an investigation prompt for the agent."""
    author = post["author"].get("handle", "unknown")
    text = post["text"][:500]
    claims = post.get("claims", [])
    links = post.get("links", [])
    tasks = post.get("tasks", [])

    prompt = f"""## Investigation: {author}
**Post:** {text}
**Claims detected:**
"""
    for c in claims:
        prompt += f"- [{c['type']}] {c['match']}\n"
    prompt += "\n**Links found:**\n"
    for link in links:
        prompt += f"- [{link['type']}] {link['url']}\n"
    prompt += "\n**Investigation tasks:**\n"
    for i, t in enumerate(tasks, 1):
        prompt += f"{i}. **{t['action']}**: {t['description']}\n"
        prompt += f" Method: {t['method']}\n"
        if t.get('url'):
            prompt += f" URL: {t['url']}\n"
    prompt += """
**Deliver:**
1. Is the claim verifiable? What does the actual data show?
2. Is there recent activity? (Last 24-48h)
3. Is this still actionable or has the window closed?
4. Risk assessment (1-10, where 10 is highest risk)
5. Verdict: ACTIONABLE / EXPIRED / EXAGGERATED / SCAM / UNVERIFIABLE
6. If ACTIONABLE: suggested paper trade parameters (asset, entry, size, stop loss, take profit)
"""
    return prompt


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("input", help="Path to triage.json")
    parser.add_argument("--output", help="Output directory for investigation files")
    args = parser.parse_args()

    with open(args.input) as f:
        data = json.load(f)
    queue = data.get("investigation_queue", [])
    if not queue:
        print("No posts in investigation queue.")
        return

    output_dir = args.output or os.path.join(os.path.dirname(args.input), "investigations")
    os.makedirs(output_dir, exist_ok=True)
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")

    investigations = []
    for i, post in enumerate(queue):
        inv = {
            "id": f"inv-{timestamp}-{i}",
            "post_author": post["author"].get("handle", "unknown"),
            "post_url": post.get("url", ""),
            "priority": post["priority"],
            "claims": post.get("claims", []),
            "tasks": post.get("tasks", []),
            "prompt": generate_investigation_prompt(post),
            "status": "pending",
            "result": None,
        }
        investigations.append(inv)

    # Save investigation batch
    batch_file = os.path.join(output_dir, f"batch-{timestamp}.json")
    with open(batch_file, "w") as f:
        json.dump({
            "batch_id": timestamp,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "count": len(investigations),
            "investigations": investigations,
        }, f, indent=2)

    print(f"=== Investigation Batch: {timestamp} ===")
    print(f"Tasks: {len(investigations)}")
    for inv in investigations:
        print(f"\n [{inv['priority']}] {inv['post_author']}")
        print(f" Claims: {[c['type'] for c in inv['claims']]}")
        print(f" Tasks: {len(inv['tasks'])}")
    print(f"\nSaved to {batch_file}")
    print("\nTo execute: agent reads batch file and runs each investigation prompt")


if __name__ == "__main__":
    main()


@ -0,0 +1,59 @@
#!/bin/bash
# Feed Hunter — Full pipeline run
# Usage: ./run-pipeline.sh [scroll_pages]
#
# Runs: scrape → triage → generate investigation tasks
# Agent handles investigation + alerts separately

set -e

PAGES=${1:-8}
BASE="/home/wdjones/.openclaw/workspace"
SKILL="$BASE/skills/deep-scraper/scripts"
PROJECT="$BASE/projects/feed-hunter"
DATA="$BASE/data/x-feed"

echo "=== Feed Hunter Pipeline ==="
echo "$(date '+%Y-%m-%d %H:%M:%S %Z')"

# Ensure Chrome is running with debug port
if ! curl -s http://127.0.0.1:9222/json >/dev/null 2>&1; then
    echo "Starting Chrome..."
    bash "$SKILL/launch-chrome-debug.sh"
fi

# Stage 1: Scrape
echo ""
echo "--- Stage 1: Scrape ($PAGES pages) ---"
python3 -u "$SKILL/scrape-x-feed.py" --port 9222 --scroll-pages "$PAGES"

# Find latest scrape
LATEST=$(ls -dt "$DATA"/20* | head -1)
echo "Latest scrape: $LATEST"

# Stage 2: Triage
echo ""
echo "--- Stage 2: Triage ---"
python3 "$SKILL/triage-posts.py" "$LATEST/posts.json"

# Stage 3: Generate investigation tasks
TRIAGE="$LATEST/triage.json"
if [ -f "$TRIAGE" ]; then
    QUEUE=$(python3 -c "import json; d=json.load(open('$TRIAGE')); print(len(d.get('investigation_queue',[])))")
    if [ "$QUEUE" -gt 0 ]; then
        echo ""
        echo "--- Stage 3: Investigation Tasks ---"
        python3 "$PROJECT/investigate.py" "$TRIAGE" --output "$LATEST/investigations"
        echo ""
        echo ">>> $QUEUE posts queued for investigation"
        echo ">>> Agent should read: $LATEST/investigations/"
    else
        echo ""
        echo ">>> No posts worth investigating this run."
    fi
else
    echo ">>> No triage output found."
fi

echo ""
echo "=== Pipeline complete: $LATEST ==="

projects/feed-hunter/simulator.py

@ -0,0 +1,344 @@
#!/usr/bin/env python3
"""
Paper trading simulator for Feed Hunter strategies.

Tracks virtual positions, P&L, and performance metrics.
No real money — everything is simulated.

Usage:
    python3 simulator.py status                           # Show all active sims
    python3 simulator.py open <strategy> <details_json>   # Open a paper position
    python3 simulator.py close <sim_id> <exit_price>      # Close a position
    python3 simulator.py update <sim_id> <current_price>  # Update mark-to-market
    python3 simulator.py history                          # Show closed positions
    python3 simulator.py stats                            # Performance summary
"""
import argparse
import json
import sys
import uuid
from datetime import datetime, timezone
from pathlib import Path

DATA_DIR = Path(__file__).parent / "data" / "simulations"
ACTIVE_FILE = DATA_DIR / "active.json"
HISTORY_FILE = DATA_DIR / "history.json"
CONFIG_FILE = Path(__file__).parent / "config.json"


def load_config():
    with open(CONFIG_FILE) as f:
        return json.load(f)


def load_active():
    if ACTIVE_FILE.exists():
        with open(ACTIVE_FILE) as f:
            return json.load(f)
    return {"positions": [], "bankroll_used": 0}


def save_active(data):
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    with open(ACTIVE_FILE, "w") as f:
        json.dump(data, f, indent=2)


def load_history():
    if HISTORY_FILE.exists():
        with open(HISTORY_FILE) as f:
            return json.load(f)
    return {"closed": []}


def save_history(data):
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    with open(HISTORY_FILE, "w") as f:
        json.dump(data, f, indent=2)
def cmd_open(args):
    """Open a new paper position."""
    config = load_config()
    sim_config = config["simulation"]
    active = load_active()
    details = json.loads(args.details)

    # Calculate position size, capped at max_position_pct of the bankroll
    bankroll = sim_config["default_bankroll"]
    max_pos = bankroll * sim_config["max_position_pct"]
    position_size = details.get("size", max_pos)
    position_size = min(position_size, max_pos)

    sim_id = str(uuid.uuid4())[:8]
    position = {
        "id": sim_id,
        "strategy": args.strategy,
        "opened_at": datetime.now(timezone.utc).isoformat(),
        "type": details.get("type", "long"),  # long, short, bet
        "asset": details.get("asset", "unknown"),
        "entry_price": details.get("entry_price", 0),
        "size": position_size,
        "quantity": details.get("quantity", 0),
        "stop_loss": details.get("stop_loss"),
        "take_profit": details.get("take_profit"),
        "current_price": details.get("entry_price", 0),
        "unrealized_pnl": 0,
        "unrealized_pnl_pct": 0,
        "source_post": details.get("source_post", ""),
        "thesis": details.get("thesis", ""),
        "notes": details.get("notes", ""),
        "updates": [],
    }
    active["positions"].append(position)
    active["bankroll_used"] = sum(p["size"] for p in active["positions"])
    save_active(active)

    print(f"✅ Paper position opened: {sim_id}")
    print(f" Strategy: {args.strategy}")
    print(f" Asset: {position['asset']}")
    print(f" Type: {position['type']}")
    print(f" Entry: ${position['entry_price']}")
    print(f" Size: ${position_size:.2f}")
    if position["stop_loss"]:
        print(f" Stop Loss: ${position['stop_loss']}")
    if position["take_profit"]:
        print(f" Take Profit: ${position['take_profit']}")
def cmd_close(args):
    """Close a paper position."""
    active = load_active()
    history = load_history()

    pos = None
    for i, p in enumerate(active["positions"]):
        if p["id"] == args.sim_id:
            pos = active["positions"].pop(i)
            break
    if not pos:
        print(f"❌ Position {args.sim_id} not found")
        sys.exit(1)

    exit_price = float(args.exit_price)
    entry_price = pos["entry_price"]
    if pos["type"] == "long":
        pnl_pct = (exit_price - entry_price) / entry_price if entry_price else 0
    elif pos["type"] == "short":
        pnl_pct = (entry_price - exit_price) / entry_price if entry_price else 0
    elif pos["type"] == "bet":
        # For binary bets: exit_price is 1 (win) or 0 (lose)
        pnl_pct = (exit_price - entry_price) / entry_price if entry_price else 0
    else:
        pnl_pct = 0
    realized_pnl = pos["size"] * pnl_pct

    pos["closed_at"] = datetime.now(timezone.utc).isoformat()
    pos["exit_price"] = exit_price
    pos["realized_pnl"] = round(realized_pnl, 2)
    pos["realized_pnl_pct"] = round(pnl_pct * 100, 2)
    history["closed"].append(pos)
    active["bankroll_used"] = sum(p["size"] for p in active["positions"])
    save_active(active)
    save_history(history)

    emoji = "🟢" if realized_pnl >= 0 else "🔴"
    print(f"{emoji} Position closed: {pos['id']}")
    print(f" Asset: {pos['asset']}")
    print(f" Entry: ${entry_price} → Exit: ${exit_price}")
    print(f" P&L: ${realized_pnl:+.2f} ({pnl_pct*100:+.1f}%)")
def cmd_update(args):
    """Update mark-to-market for a position."""
    active = load_active()
    for pos in active["positions"]:
        if pos["id"] == args.sim_id:
            current = float(args.current_price)
            entry = pos["entry_price"]
            if pos["type"] == "long":
                pnl_pct = (current - entry) / entry if entry else 0
            elif pos["type"] == "short":
                pnl_pct = (entry - current) / entry if entry else 0
            else:
                pnl_pct = (current - entry) / entry if entry else 0
            pos["current_price"] = current
            pos["unrealized_pnl"] = round(pos["size"] * pnl_pct, 2)
            pos["unrealized_pnl_pct"] = round(pnl_pct * 100, 2)
            pos["updates"].append({
                "time": datetime.now(timezone.utc).isoformat(),
                "price": current,
                "pnl": pos["unrealized_pnl"],
            })
            # Check stop loss / take profit (long and bet both profit as price rises)
            if pos.get("stop_loss") and pos["type"] in ("long", "bet") and current <= pos["stop_loss"]:
                print(f"⚠️ STOP LOSS triggered for {pos['id']} at ${current}")
            if pos.get("take_profit") and pos["type"] in ("long", "bet") and current >= pos["take_profit"]:
                print(f"🎯 TAKE PROFIT hit for {pos['id']} at ${current}")
            save_active(active)
            emoji = "🟢" if pos["unrealized_pnl"] >= 0 else "🔴"
            print(f"{emoji} Updated {pos['id']}: ${current} ({pos['unrealized_pnl_pct']:+.1f}%)")
            return
    print(f"❌ Position {args.sim_id} not found")
def cmd_status(args):
    """Show all active positions."""
    active = load_active()
    config = load_config()
    bankroll = config["simulation"]["default_bankroll"]
    if not active["positions"]:
        print("No active paper positions.")
        return

    total_unrealized = 0
    print("=== Active Paper Positions ===")
    print(f"Bankroll: ${bankroll} | Used: ${active['bankroll_used']:.2f} | Free: ${bankroll - active['bankroll_used']:.2f}\n")
    for pos in active["positions"]:
        emoji = "🟢" if pos["unrealized_pnl"] >= 0 else "🔴"
        total_unrealized += pos["unrealized_pnl"]
        print(f"{emoji} [{pos['id']}] {pos['strategy']}")
        print(f" {pos['asset']} | {pos['type']} | Size: ${pos['size']:.2f}")
        print(f" Entry: ${pos['entry_price']} → Current: ${pos['current_price']}")
        print(f" P&L: ${pos['unrealized_pnl']:+.2f} ({pos['unrealized_pnl_pct']:+.1f}%)")
        print(f" Opened: {pos['opened_at'][:16]}")
        if pos.get("thesis"):
            print(f" Thesis: {pos['thesis'][:80]}")
        print()
    print(f"Total unrealized P&L: ${total_unrealized:+.2f}")
def cmd_history(args):
    """Show closed positions."""
    history = load_history()
    if not history["closed"]:
        print("No closed positions yet.")
        return

    print("=== Closed Positions ===\n")
    for pos in history["closed"]:
        emoji = "🟢" if pos["realized_pnl"] >= 0 else "🔴"
        print(f"{emoji} [{pos['id']}] {pos['strategy']}")
        print(f" {pos['asset']} | ${pos['entry_price']} → ${pos['exit_price']}")
        print(f" P&L: ${pos['realized_pnl']:+.2f} ({pos['realized_pnl_pct']:+.1f}%)")
        print(f" {pos['opened_at'][:16]} → {pos['closed_at'][:16]}")
        print()
def cmd_stats(args):
    """Performance summary across all closed trades."""
    history = load_history()
    config = load_config()
    closed = history.get("closed", [])
    if not closed:
        print("No completed trades to analyze.")
        return

    wins = [t for t in closed if t["realized_pnl"] > 0]
    losses = [t for t in closed if t["realized_pnl"] <= 0]
    total_pnl = sum(t["realized_pnl"] for t in closed)

    print("=== Performance Summary ===\n")
    print(f"Total trades: {len(closed)}")
    print(f"Wins: {len(wins)} | Losses: {len(losses)}")
    print(f"Win rate: {len(wins)/len(closed)*100:.1f}%")
    print(f"Total P&L: ${total_pnl:+.2f}")

    if wins:
        avg_win = sum(t["realized_pnl"] for t in wins) / len(wins)
        best = max(closed, key=lambda t: t["realized_pnl"])
        print(f"Avg win: ${avg_win:+.2f}")
        print(f"Best trade: {best['id']} ({best['strategy']}) ${best['realized_pnl']:+.2f}")
    if losses:
        avg_loss = sum(t["realized_pnl"] for t in losses) / len(losses)
        worst = min(closed, key=lambda t: t["realized_pnl"])
        print(f"Avg loss: ${avg_loss:+.2f}")
        print(f"Worst trade: {worst['id']} ({worst['strategy']}) ${worst['realized_pnl']:+.2f}")

    # ROI
    bankroll = config["simulation"]["default_bankroll"]
    roi = (total_pnl / bankroll) * 100
    print(f"\nROI on ${bankroll} bankroll: {roi:+.1f}%")

    # By strategy
    strategies = {}
    for t in closed:
        s = t["strategy"]
        if s not in strategies:
            strategies[s] = {"trades": 0, "pnl": 0, "wins": 0}
        strategies[s]["trades"] += 1
        strategies[s]["pnl"] += t["realized_pnl"]
        if t["realized_pnl"] > 0:
            strategies[s]["wins"] += 1
    if len(strategies) > 1:
        print("\n=== By Strategy ===")
        for name, data in sorted(strategies.items(), key=lambda x: x[1]["pnl"], reverse=True):
            wr = data["wins"] / data["trades"] * 100
            print(f" {name}: {data['trades']} trades, {wr:.0f}% WR, ${data['pnl']:+.2f}")
def main():
    parser = argparse.ArgumentParser(description="Feed Hunter Paper Trading Simulator")
    sub = parser.add_subparsers(dest="command")

    sub.add_parser("status", help="Show active positions")
    p_open = sub.add_parser("open", help="Open paper position")
    p_open.add_argument("strategy", help="Strategy name")
    p_open.add_argument("details", help="JSON with position details")
    p_close = sub.add_parser("close", help="Close position")
    p_close.add_argument("sim_id", help="Position ID")
    p_close.add_argument("exit_price", help="Exit price")
    p_update = sub.add_parser("update", help="Update mark-to-market")
    p_update.add_argument("sim_id", help="Position ID")
    p_update.add_argument("current_price", help="Current price")
    sub.add_parser("history", help="Closed positions")
    sub.add_parser("stats", help="Performance summary")

    args = parser.parse_args()
    if args.command == "status":
        cmd_status(args)
    elif args.command == "open":
        cmd_open(args)
    elif args.command == "close":
        cmd_close(args)
    elif args.command == "update":
        cmd_update(args)
    elif args.command == "history":
        cmd_history(args)
    elif args.command == "stats":
        cmd_stats(args)
    else:
        parser.print_help()


if __name__ == "__main__":
    main()