Prediction market agent — autonomous workflows.
End-to-end agent template: read state → reason → act, with calibrated views as the bridge.
A reference architecture for a self-hostable prediction market agent on Kalshi and Polymarket. Five stages — gather, reason, decide, execute, reconcile — wired around the platform's public surfaces. Bring your own model, bring your own risk policy; the platform supplies the read surfaces, normalized intents, audit log, and the calibration loop that closes Brier feedback into the next cycle. Hub view at /ai-agents; hosted variant at /portfolio-autopilot.
Five stages · four templates · calibration loop · BYOM · BYOR · self-host
Galileo at the telescope — the first quantitative forecaster, alone with calibrated instruments.
Five-stage reference architecture
Every production prediction market agent on the platform composes from these five stages. Stages 1, 4, and 5 use platform surfaces; stages 2 and 3 are operator-owned (BYOM and BYOR). The shape is intentionally minimal.
01 Gather: /world · /event-probability-api · /realtime-data-api · /query-gov · /query-econ
02 Reason: operator-owned model layer (BYOM)
03 Decide: operator-owned policy layer (BYOR)
04 Execute: /prediction-market-execution
05 Reconcile: reconciliation feed (CSV / JSON / Parquet)
Four concrete workflow templates
Production patterns from the field. Each template runs the five-stage loop with different cadence, sizing, and exit conditions. Copy any of them as a starting point.
Event-trigger agent
Watch a basket of contracts and place a sized intent the moment a price condition fires
01 Subscribe via WebSocket to ticker:* topics for the watch basket
02 Maintain a thin in-memory model of price + spread + recent flow per ticker
03 When a ticker crosses an operator-defined trigger (price below X, spread under Y), invoke the LLM with the ticker context to produce a sized view
04 Submit a dry-run intent first; if the platform's risk-gate report is clean, submit live with an idempotency key
05 Log the LLM rationale + intent record to a local audit file for post-trade review
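The trigger check in step 03 reduces to a pure predicate over the thin in-memory model from step 02. A minimal sketch; every type and field name here is illustrative, not a platform API:

```typescript
// Minimal per-ticker state the WebSocket handler maintains (step 02).
interface TickerState {
  ticker: string;
  price: number;   // last price, 0..1 for binary contracts
  spread: number;  // best ask minus best bid
}

// Operator-defined trigger (step 03): price below X, spread under Y.
interface Trigger {
  ticker: string;
  maxPrice: number;
  maxSpread: number;
}

// True when the condition fires and the LLM should be invoked
// with the ticker context to produce a sized view.
function triggerFires(state: TickerState, trigger: Trigger): boolean {
  return (
    state.ticker === trigger.ticker &&
    state.price < trigger.maxPrice &&
    state.spread < trigger.maxSpread
  );
}

// Example: both the price and spread conditions hold, so this fires.
const state: TickerState = { ticker: "FED-25DEC", price: 0.42, spread: 0.01 };
const trig: Trigger = { ticker: "FED-25DEC", maxPrice: 0.45, maxSpread: 0.02 };
```

Keeping the predicate pure makes it trivial to unit-test the watch basket's trigger logic without a live WebSocket feed.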
Drawdown-guard agent
Continuously evaluate portfolio risk; reduce or halt on drawdown thresholds
01 Pull positions + balance + recent fills via the agentic CLI on a 5-minute cadence
02 Compute peak-to-current drawdown across the full book and per-strategy
03 If drawdown exceeds the operator-set threshold, the LLM authors a reduction plan (which positions, in what order)
04 Execute the reduction as a sequence of sell intents with conservative limit prices
05 Halt new buy intents until drawdown recovers to a separate operator-set threshold
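Steps 02 and 05 are pure arithmetic over an equity series and two thresholds. A minimal sketch with illustrative values; the real series comes from positions + balance pulled via the agentic CLI in step 01:

```typescript
// Peak-to-current drawdown over an equity series (step 02).
function peakToCurrentDrawdown(equity: number[]): number {
  const peak = Math.max(...equity);
  const current = equity[equity.length - 1];
  return peak > 0 ? (peak - current) / peak : 0;
}

// Step 05's hysteresis: halt buys at the halt threshold, and resume
// only once drawdown recovers below a separate resume threshold.
function allowNewBuys(
  drawdown: number,
  haltAt: number,
  resumeAt: number,
  halted: boolean
): boolean {
  if (drawdown >= haltAt) return false;            // breach: halt new buys
  if (halted && drawdown > resumeAt) return false; // still recovering
  return true;
}

// Book peaked at 125 and sits at 100: drawdown of 0.20 against the peak.
const dd = peakToCurrentDrawdown([100, 125, 110, 100]);
```

Using two separate thresholds (halt vs resume) prevents the agent from flapping between halted and active when equity hovers near a single threshold.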
Daily-research agent
Author a daily research note + trade-idea list for the operator to review
01 At a fixed local hour, pull the world snapshot + delta-since-yesterday + top movers
02 Cross-reference with /query-gov + /query-econ for relevant policy and macro context
03 The LLM produces a short markdown note with named theses, suggested trade ideas, and confidence levels
04 Optionally render trade ideas as dry-run intent records the operator can flip live
05 Email or post the note; the agent does not act until the operator approves
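The delta-since-yesterday + top-movers part of step 01 is a diff between two snapshots keyed by ticker. A minimal sketch; the flat snapshot shape is an assumption for illustration, not the /world schema:

```typescript
// Illustrative snapshot shape: ticker -> probability.
type Snapshot = Record<string, number>;

// Top n movers by absolute probability change between two snapshots.
function topMovers(
  yesterday: Snapshot,
  today: Snapshot,
  n: number
): [string, number][] {
  return Object.keys(today)
    .filter((t) => t in yesterday)                       // only tickers in both
    .map((t): [string, number] => [t, today[t] - yesterday[t]])
    .sort((a, b) => Math.abs(b[1]) - Math.abs(a[1]))     // biggest move first
    .slice(0, n);
}

// B moved +0.15 and C moved -0.10, so they lead the movers list.
const movers = topMovers(
  { A: 0.5, B: 0.3, C: 0.8 },
  { A: 0.52, B: 0.45, C: 0.7 },
  2
);
```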
Hedge-finder agent
Map a real-portfolio exposure to a basket of binary contracts with cost / coverage tradeoffs
01 Read the portfolio exposure as input (position file, NAV report, or simple JSON)
02 Search Kalshi + Polymarket for contracts that map to the named risk dimensions
03 The LLM proposes a basket — leg sizes, total cost, and which exposure each leg covers
04 Produce a one-page hedge proposal with the mapping table and dry-run intent records
05 Operator reviews; on approval, the basket is submitted as a sequence of intents with shared idempotency prefix
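The cost / coverage tradeoff in steps 03 and 04 can be summarized per basket. A minimal sketch with an illustrative leg shape; field names and numbers are assumptions, not a platform schema:

```typescript
// One leg of a proposed hedge basket (step 03).
interface HedgeLeg {
  ticker: string;
  contracts: number; // leg size
  price: number;     // cost per contract, 0..1
  covers: string;    // named risk dimension this leg maps to (step 02)
  coverage: number;  // dollar exposure this leg offsets
}

// Totals for the one-page proposal (step 04): total cost, total
// coverage, and cost per dollar of coverage.
function basketSummary(legs: HedgeLeg[]) {
  const cost = legs.reduce((s, l) => s + l.contracts * l.price, 0);
  const coverage = legs.reduce((s, l) => s + l.coverage, 0);
  return { cost, coverage, costPerDollarCovered: cost / coverage };
}

// cost: 200*0.35 + 100*0.20 = 90 (up to float rounding); coverage: 7500.
const summary = basketSummary([
  { ticker: "RATE-HIKE", contracts: 200, price: 0.35, covers: "rates", coverage: 5000 },
  { ticker: "CPI-HIGH", contracts: 100, price: 0.2, covers: "inflation", coverage: 2500 },
]);
```

Cost per dollar of coverage gives the operator a single number to compare competing baskets before approving one.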
Self-host vs hosted Portfolio Autopilot
Same architecture, two operating postures. Self-host when every layer needs to live in the operator's code. Hosted when the agent loop should run as a service. Both share the same intent + risk-gate stack underneath.
Calibration loop — how the agent learns
Agents that act on probability must close the feedback loop on their own calibration. A five-stage chain — view, outcome, score, update, audit — designed so every step is queryable.
01 View
The agent emits a calibrated view (probability + confidence) every time it acts. Views are stored alongside the intent record.
02 Outcome
Every prediction market contract resolves to a binary outcome. The platform records the resolution against each linked view.
03 Brier feedback
A Brier-style score (or its multi-class generalization) is computed for the agent's views over a rolling window — by topic, by horizon, by venue.
04 Update
The next cycle's prompt sees the agent's recent calibration. Persistently overconfident topics get downweighted; persistently well-calibrated topics get more aggressive sizing.
05 Audit
The full chain — view → outcome → Brier — is queryable per agent, per topic, per period. The calibration loop is auditable, not implicit.
Methodology + working notes at /papers.
Read next from the library
Matched from SimpleFunctions blog, opinions, technical guides, concepts, and learn pages.
Automated Prediction Market Trading: Architecture and Cost Breakdown
Detailed cost breakdown for automated prediction market trading. LLM evaluation costs, Tavily API spend, Kalshi fees, and total monthly cost per thesis. Compare DIY agent vs SimpleFunctions.
How to Build a Prediction Market Trading Bot in 2026
Technical guide for TypeScript developers building prediction market trading bots. Thesis-driven architecture, Kelly criterion sizing, SimpleFunctions CLI/MCP/REST integration, and cost breakdown.
Build a Prediction Market Agent with LangChain + SimpleFunctions
Step-by-step guide to building an autonomous prediction market trading agent using LangChain and SimpleFunctions. Python code examples for Kalshi and Polymarket.
Setting Up Your First Prediction Market Agent with SimpleFunctions
Step-by-step guide to setting up a prediction market trading agent with SimpleFunctions CLI. From sf setup to your first scan, thesis, and edge detection on Kalshi and Polymarket.
Piping prediction market signals into your existing trading system
Three integration patterns for piping Kalshi and Polymarket data into existing trading infrastructure: cron polling, agent middleware, and thesis-as-filter.
How to Build an OpenClaw Prediction Market Bot with SimpleFunctions
Technical tutorial for building an OpenClaw prediction market trading bot with SimpleFunctions. Three tool endpoints, structured JSON responses, thesis-driven strategies.
FAQ
What is a prediction market agent?
An autonomous program that reads the prediction market world (Kalshi + Polymarket), reasons about it with an LLM or other model, and optionally acts through normalized execution intents. The agent runs in a loop — gather → reason → decide → execute → reconcile — until an operator-defined exit condition is met or its schedule ends.
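The loop described above can be sketched as a skeleton in which stages 2 and 3 are operator-supplied functions (BYOM and BYOR). Every name here is a stub for illustration, not a platform API:

```typescript
interface View { probability: number; confidence: number; }
interface Intent { ticker: string; side: "buy" | "sell"; size: number; }

// The five-stage loop; each dependency is injected by the operator.
async function agentLoop(deps: {
  gather: () => Promise<unknown>;             // stage 1: platform read surfaces
  reason: (ctx: unknown) => Promise<View>;    // stage 2: BYOM
  decide: (view: View) => Intent | null;      // stage 3: BYOR; null = no action
  execute: (intent: Intent) => Promise<void>; // stage 4: intent submission
  reconcile: () => Promise<void>;             // stage 5: fills vs expectations
  shouldStop: () => boolean;                  // operator-defined exit condition
}): Promise<void> {
  while (!deps.shouldStop()) {
    const ctx = await deps.gather();
    const view = await deps.reason(ctx);
    const intent = deps.decide(view);
    if (intent) await deps.execute(intent);
    await deps.reconcile();
  }
}
```

Treating each stage as an injected function is what makes BYOM and BYOR first-class: swapping the model or the risk policy never touches the loop itself.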
How is this different from /portfolio-autopilot?
This page is the reference architecture for a self-hostable agent — the template you copy and adapt. /portfolio-autopilot is the hosted variant: the same architecture run as a service, with curated model and prompts, default risk gates, and platform-side audit. Same intent + risk-gate stack underneath; different operating posture.
Can I bring my own model?
Yes. BYOM is first-class. The reference architecture treats the model layer as opaque — the agent calls a function that takes context and returns a view. Implementations exist for Claude, GPT, Gemini, open-weight (Llama, Qwen, Mistral), fine-tunes, and ensembles. The operator owns the prompt, the temperature, and the routing.
How does the agent learn?
Through the calibration loop. Every emitted view is paired with the eventual binary outcome; a Brier-style feedback score is computed over rolling windows by topic, horizon, and venue. The next cycle's prompt receives the agent's recent calibration as context, so persistently overconfident areas get downweighted. Learning is auditable — the chain from view to outcome to score is queryable.
Where does the agent run?
Anywhere. Local machine, container, cron job, Lambda, Cloudflare Worker, Trigger.dev task, GitHub Action — the reference architecture is transport-agnostic. The platform is reachable via HTTPS REST + WebSocket; the agentic CLI runs anywhere Node 18+ runs. Self-host wherever the operator's ops posture lives.
Self-host or hosted?
Self-host when the operator wants every layer in their own code — model choice, prompt versioning, risk policy authoring, log retention. Hosted (/portfolio-autopilot) when the operator wants the agent loop as a service and is happy to configure rather than author. Many desks start self-hosted to learn the surface, then migrate hosted strategies they want supervised.
What tools does the agent use?
The same tools published on /ai-agents and exposed via /api/tools — world snapshot, probability, search, screen, indicators, gov + econ overlay, intent submit, intent watch, reconciliation. The exact catalog is published live; this page deliberately does not hardcode a count.
How is the agent evaluated?
Three layers. (1) Calibration: rolling Brier-style score on emitted views, by topic and horizon. (2) P&L: standard portfolio-level accounting — realized + unrealized, measured against operator-set benchmarks. (3) Behavior: did the agent respect risk gates, dry-run before going live, and use idempotency keys. The /calibration surface and the agent's local audit log together cover all three.
How is the agent audited?
Two-sided audit. Locally, the agent writes a structured log of every prompt, response, and resulting intent. Server-side, the platform writes an immutable record of every intent, risk-gate evaluation, and venue submission. Both sides can be exported for compliance review; the platform side is the system of record for capital movement.
How does safety work?
Risk gates run before every intent — size cap, exposure cap, drawdown ceiling, regime gate, daily-loss cap, dry-run toggle. Operators author and version-control these in code; the platform evaluates them at execution time and rejects intents that fail. Idempotency keys make replays safe. Dry-run validates the full pipeline without capital at risk.
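A risk-gate policy of this kind can be sketched as a pure check that returns the failed gates (an empty list means the intent passes). This covers a subset of the gates listed above — size, exposure, drawdown, daily-loss — and every shape and threshold here is illustrative:

```typescript
// Facts about the intent and the book at evaluation time.
interface IntentCheck {
  size: number;
  resultingExposure: number;
  currentDrawdown: number;
  dailyLoss: number;
}

// Operator-authored, version-controlled policy.
interface RiskPolicy {
  maxSize: number;
  maxExposure: number;
  drawdownCeiling: number;
  dailyLossCap: number;
}

// Returns the names of every gate the intent fails.
function evaluateGates(i: IntentCheck, p: RiskPolicy): string[] {
  const failures: string[] = [];
  if (i.size > p.maxSize) failures.push("size-cap");
  if (i.resultingExposure > p.maxExposure) failures.push("exposure-cap");
  if (i.currentDrawdown >= p.drawdownCeiling) failures.push("drawdown-ceiling");
  if (i.dailyLoss >= p.dailyLossCap) failures.push("daily-loss-cap");
  return failures;
}

const policy: RiskPolicy = { maxSize: 100, maxExposure: 5000, drawdownCeiling: 0.2, dailyLossCap: 300 };
```

Returning the full list of failures, rather than a boolean, is what makes the risk-gate report reviewable: the dry-run can show the operator every gate the intent would trip.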
What if the LLM hallucinates a market?
The agent should not act on a market it has not verified against /api/public/scan or /api/public/market/:ticker. The reference architecture treats the model output as proposing a market id, then the agent looks it up and rejects the action if the id does not resolve. The intent submit endpoint also rejects unknown tickers — defense in depth.
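The lookup-before-acting pattern can be sketched with a stubbed lookup standing in for a call to /api/public/market/:ticker. The stub and its known tickers are illustrative:

```typescript
// A lookup returns the market record, or null if the ticker does not resolve.
type Lookup = (ticker: string) => Promise<{ ticker: string } | null>;

// Treat the model output as a *proposed* ticker; act only if it resolves
// and the resolved record matches what was proposed.
async function verifyProposedTicker(proposed: string, lookup: Lookup): Promise<boolean> {
  const market = await lookup(proposed);
  return market !== null && market.ticker === proposed;
}

// Stubbed lookup with two known tickers, standing in for the real endpoint.
const known = new Set(["FED-25DEC", "CPI-HIGH"]);
const stubLookup: Lookup = async (t) => (known.has(t) ? { ticker: t } : null);
```

In production the same function wraps the real market endpoint; the agent-side check and the server-side rejection of unknown tickers together give the defense in depth described above.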
Related surfaces
AI agents on prediction markets
Hub — four product surfaces and the host matrix. Read this to orient.
Agentic usage
Patterns + worked examples for AI agents on the four surfaces.
Portfolio Autopilot
Hosted variant — the same architecture run as a service.
Prediction market execution
Intents, triggers, routing, monitoring — the execution surface the agent submits to.
Agentic CLI
sf binary as an agent control plane — JSON output, idempotency, dry-run safety.
World state
Calibrated 15-minute snapshot — the gather-stage source of truth.
Papers
Methodology behind cross-venue normalization, world-model, and calibration.