2026-02-11

KIPP Voice Pipeline — Major Build Session

Built & Deployed (feature/wake-word branch)

  • Always-on wake word detection via OpenWakeWord (hey_jarvis model as placeholder)
  • Faster Whisper (base.en) for speech-to-text on KIPP VM
  • Voice WebSocket server on port 8082 (TLS) — kipp-voice.service
  • Python venv at /home/wdjones/kipp-venv with openwakeword, faster-whisper, websockets, aiohttp
  • Male TTS voice — switched from Amy to Ryan (Piper en_US)
  • Hero panel chat — voice interaction happens inside the greeting/hero card, not a separate overlay
  • Widget state system — JSON file + CLI tool + REST API + dashboard polling
    • tools/widgets.py for shopping list, timers, reminders
    • API endpoints on UI server: GET/POST /api/widgets
    • Dashboard loads real data, polls every 10s
    • KIPP agent instructed in SOUL.md to use widget CLI
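
A minimal sketch of the JSON-file side of the widget state system, assuming a schema with three top-level lists. The key names (shopping_list, timers, reminders) and the helper names are hypothetical; the real tools/widgets.py may look quite different. The write goes through a temp file so a dashboard poll should not see a half-written file.

```python
import copy
import json
import os
import tempfile
from pathlib import Path

# Hypothetical schema: the notes only say the file holds shopping list, timers, reminders.
DEFAULT_STATE = {"shopping_list": [], "timers": [], "reminders": []}


def load_widgets(path: Path) -> dict:
    """Read the widget state, falling back to an empty state if the file is missing."""
    if not path.exists():
        return copy.deepcopy(DEFAULT_STATE)
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


def save_widgets(path: Path, state: dict) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp_name, path)  # replaces the target in one step


def add_shopping_item(path: Path, item: str) -> dict:
    """Append one item to the shopping list and persist the result."""
    state = load_widgets(path)
    state["shopping_list"].append(item)
    save_widgets(path, state)
    return state
```

A CLI wrapper and the GET/POST /api/widgets handlers could both call these same helpers, so the file stays the single source of truth.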

Key Bugs Fixed

  1. CSS injected inside JS — patch script found /* CHAT OVERLAY */ in both CSS and JS sections
  2. Gateway challenge-response: the client must answer connect.challenge with a req whose method is connect
  3. Client ID must be openclaw-control-ui — gateway validates this
  4. Origin header required — voice server needs Origin: https://192.168.86.100:8080
  5. Lifecycle event detection: the gateway sends phase="end", not state="end". Checking the wrong field was the cause of the 60-second hang
  6. Audio suppressed during wake state — browser stopped sending mic data when it should have been recording
  7. Race condition — server sent ready before TTS finished, mic picked up speaker audio
  8. Self-triggering wake word — KIPP's own TTS voice triggered "hey jarvis" — fixed with 2s cooldown
  9. voiceState stuck on speaking — client must set listening before server's ready msg arrives
  10. Duplicate JS blocks — sub-agent inserted widget code twice
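
Bugs 5 and 8 are the easiest to regress, so here is a rough sketch of both fixes. It assumes lifecycle events arrive as dicts; only the phase="end" field and the 2s cooldown come from these notes, and the function and class names are hypothetical.

```python
from typing import Any, Dict


def is_run_finished(event: Dict[str, Any]) -> bool:
    """True when a gateway lifecycle event marks the end of a run."""
    # Per the notes, the end marker is carried in "phase", not "state".
    return event.get("phase") == "end"


class WakeWordGate:
    """Ignores wake-word hits for a short period after TTS playback ends."""

    def __init__(self, cooldown_s: float = 2.0) -> None:
        self.cooldown_s = cooldown_s
        self._blocked_until = 0.0

    def tts_finished(self, now: float) -> None:
        """Start the cooldown window at the moment playback ends."""
        self._blocked_until = now + self.cooldown_s

    def accept(self, now: float) -> bool:
        """Return True if a wake-word detection at time `now` should count."""
        return now >= self._blocked_until
```

In a server loop, `now` would typically be `time.monotonic()` so the window is not affected by wall-clock changes.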

Voice State Machine (final)

listening → (wake word) → recording → (silence) → processing → (gateway) → speaking → (done_speaking) → cooldown (2s) → listening
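
The same cycle written as a transition table, as a sketch. The state names match the line above; the event names (gateway_reply, cooldown_elapsed, and so on) are illustrative labels and not necessarily the message types the server uses.

```python
from enum import Enum


class VoiceState(Enum):
    LISTENING = "listening"
    RECORDING = "recording"
    PROCESSING = "processing"
    SPEAKING = "speaking"
    COOLDOWN = "cooldown"


# (current state, event) -> next state. Event names are assumptions.
TRANSITIONS = {
    (VoiceState.LISTENING, "wake_word"): VoiceState.RECORDING,
    (VoiceState.RECORDING, "silence"): VoiceState.PROCESSING,
    (VoiceState.PROCESSING, "gateway_reply"): VoiceState.SPEAKING,
    (VoiceState.SPEAKING, "done_speaking"): VoiceState.COOLDOWN,
    (VoiceState.COOLDOWN, "cooldown_elapsed"): VoiceState.LISTENING,
}


def next_state(state: VoiceState, event: str) -> VoiceState:
    """Return the next state, or stay in place if the event is not valid here."""
    return TRANSITIONS.get((state, event), state)
```

Ignoring unknown (state, event) pairs means a wake word heard while speaking or cooling down is dropped, which is the behavior the self-trigger fix relies on.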

Timing Config

  • 4s grace period after wake word before silence timeout
  • 1.5s silence after speech to end recording
  • 30s max recording time
  • 2s cooldown after TTS to prevent self-trigger
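
One way to read the timing values above, sketched as a stop-recording check. The interpretation is an assumption: the 4s grace period is treated as the window to start speaking, and the 1.5s silence rule only applies once speech has been heard. The names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimingConfig:
    grace_s: float = 4.0           # wait this long for speech after the wake word
    silence_s: float = 1.5         # trailing silence that ends a recording
    max_recording_s: float = 30.0  # hard cap on recording length
    cooldown_s: float = 2.0        # post-TTS cooldown (not used by the check below)


def should_stop_recording(
    elapsed_s: float,
    last_speech_s: Optional[float],
    cfg: TimingConfig = TimingConfig(),
) -> bool:
    """Decide whether to end the recording.

    elapsed_s: seconds since the wake word fired.
    last_speech_s: seconds since the wake word at which speech was last
        detected, or None if no speech has been heard yet.
    """
    if elapsed_s >= cfg.max_recording_s:
        return True
    if last_speech_s is None:
        # Nothing said yet: give up only when the grace period runs out.
        return elapsed_s >= cfg.grace_s
    return (elapsed_s - last_speech_s) >= cfg.silence_s
```

If the intended behavior is instead that silence can never end a recording during the first 4s, the last line would need an extra `elapsed_s >= cfg.grace_s` condition.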

KIPP Model Switch

  • Switched from llamacpp/glm-4.7-flash (83s responses!) to anthropic/claude-sonnet-4-20250514 (~3s responses)
  • GLM-4 Flash as fallback
  • Config at /home/wdjones/.openclaw/openclaw.json on KIPP VM

15 Playwright Tests

  • kipp-ui/tests/test_voice.py — UI elements, state transitions, chat flow, server connectivity

anoin123 Investigation

  • @browomo tweet about anoin123 Polymarket wallet: $1.6M in 57 days
  • The claimed 2-4 AM EST trading window is FALSE: trades peak at 3 PM EST
  • Strategy: "No harvester" — buys No at 90-99¢ on time-bounded events, collects spread
  • $2.2M volume, $7K avg trade, concentrated on Iran strikes + government shutdown
  • Monitor set up: anoin123-monitor.py + systemd timer every 5min
  • Analysis at data/investigations/anoin123-analysis.md
  • Copy-trade verdict: medium value — strategy is mechanical and replicable independently
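
A quick arithmetic sketch of why the "No harvester" approach is mechanical. It assumes a No share pays $1 if the event resolves No and $0 otherwise, and it ignores fees and slippage. The 95¢ price and 97% probability used in the usage note are illustrative inputs, not figures from the wallet.

```python
def return_if_no_wins(price: float) -> float:
    """Fractional return on a No share bought at `price` dollars that pays $1."""
    return (1.0 - price) / price


def expected_profit_per_share(price: float, p_no: float) -> float:
    """Expected profit per share given the true probability of a No outcome."""
    # Payout is $1 with probability p_no and $0 otherwise, minus the price paid.
    return p_no * 1.0 - price


def shares_for_stake(stake_usd: float, price: float) -> float:
    """Number of shares a dollar stake buys at a given price."""
    return stake_usd / price
```

For example, a $7K stake at 95¢ buys about 7,368 shares, so a No resolution pays roughly $368 while a Yes resolution loses the full $7K. The trade only has positive expectation when the true probability of No exceeds the price paid, for instance 97% against a 95¢ price gives about 2¢ per share.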

Infrastructure Notes

  • KIPP VM services: kipp-ui, kipp-voice, kipp-tts, kipp-wss-proxy, openclaw-gateway
  • Widget data: /home/wdjones/.openclaw/workspace/kipp-ui/data/widgets.json
  • All changes on feature/wake-word branch in kipp/workspace repo