SimpleFunctions

Prediction market agent — autonomous workflows.

End-to-end agent template: read state → reason → act, with calibrated views as the bridge.

A reference architecture for a self-hostable prediction market agent on Kalshi and Polymarket. Five stages — gather, reason, decide, execute, reconcile — wired around the platform's public surfaces. Bring your own model, bring your own risk policy; the platform supplies the read surfaces, normalized intents, audit log, and the calibration loop that closes Brier feedback into the next cycle. Hub view at /ai-agents; hosted variant at /portfolio-autopilot.

Five stages · four templates · calibration loop · BYOM · BYOR · self-host

Galileo at the telescope — the first quantitative forecaster, alone with calibrated instruments.

Five-stage reference architecture

Every production prediction market agent on the platform composes from these five stages. Stages 1, 4, and 5 use platform surfaces; stages 2 and 3 are operator-owned (BYOM and BYOR). The shape is intentionally minimal.

| # | Stage | What it does | Surface |
|---|-------|--------------|---------|
| 01 | Gather | Pull current world state, target-market detail, and any new signals (X chatter, news, gov / econ overlay) | /world · /event-probability-api · /realtime-data-api · /query-gov · /query-econ |
| 02 | Reason | The agent (Claude, GPT, custom) consumes the gathered context and applies the operator's thesis or model — produces calibrated views | Operator-owned model layer (BYOM) |
| 03 | Decide | Convert views into proposed actions — buy / sell / hedge / hold — with size and trigger conditions; runs the operator's risk policy | Operator-owned policy layer (BYOR) |
| 04 | Execute | Submit normalized intents through the platform execution layer; idempotency keys + dry-run + audit log | /prediction-market-execution |
| 05 | Reconcile | Pull fills + settlements back, mark to current price, update the agent's view of the world for the next cycle | Reconciliation feed (CSV / JSON / Parquet) |
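The five stages above can be sketched as one typed loop. This is a minimal, hypothetical skeleton: the `Stages` interface, the `View` and `Intent` shapes, and every function name are illustrative assumptions, not the platform's actual schema — real implementations of each stage would call the surfaces listed in the table.

```typescript
// Hypothetical sketch of the five-stage loop. All stage functions are
// placeholders; real ones call /world, the BYOM model layer, the BYOR
// policy layer, /prediction-market-execution, and the reconciliation feed.

type View = { ticker: string; probability: number; confidence: number };
type Intent = { ticker: string; side: "buy" | "sell"; size: number; idempotencyKey: string };

interface Stages {
  gather(): Promise<Record<string, unknown>>;            // 01: world state + signals
  reason(ctx: Record<string, unknown>): Promise<View[]>; // 02: BYOM -> calibrated views
  decide(views: View[]): Promise<Intent[]>;              // 03: BYOR risk policy -> intents
  execute(intents: Intent[]): Promise<string[]>;         // 04: submit intents, return ids
  reconcile(ids: string[]): Promise<void>;               // 05: mark book for next cycle
}

// One pass through the loop; returns how many intents this cycle proposed.
async function runCycle(s: Stages): Promise<number> {
  const ctx = await s.gather();
  const views = await s.reason(ctx);
  const intents = await s.decide(views);
  const ids = await s.execute(intents);
  await s.reconcile(ids);
  return intents.length;
}
```

Keeping stages 2 and 3 behind an interface is what makes BYOM and BYOR swappable without touching the rest of the loop.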

Four concrete workflow templates

Production patterns from the field. Each template runs the five-stage loop with different cadence, sizing, and exit conditions. Copy any of them as a starting point.

Event-trigger agent

Watch a basket of contracts and place a sized intent the moment a price condition fires

  1. Subscribe via WebSocket to ticker:* topics for the watch basket
  2. Maintain a thin in-memory model of price + spread + recent flow per ticker
  3. When a ticker crosses an operator-defined trigger (price below X, spread under Y), invoke the LLM with the ticker context to produce a sized view
  4. Submit a dry-run intent first; if the platform's risk-gate report is clean, submit live with an idempotency key
  5. Log the LLM rationale + intent record to a local audit file for post-trade review
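Steps 3 and 4 hinge on a trigger predicate and a deterministic idempotency key. A minimal sketch, assuming illustrative field names (`price`, `spread`, the threshold config) rather than a real platform schema:

```typescript
// Hypothetical trigger policy for the event-trigger template.
type Tick = { ticker: string; price: number; spread: number };
type Trigger = { maxPrice: number; maxSpread: number };

// Fires when price drops below X while the spread stays under Y (step 3).
function shouldFire(t: Tick, cfg: Trigger): boolean {
  return t.price < cfg.maxPrice && t.spread < cfg.maxSpread;
}

// Deterministic key: the same ticker + cycle always yields the same key,
// so a crashed-and-replayed submission dedupes safely (step 4).
function idempotencyKey(t: Tick, cycleId: string): string {
  return `evt-${t.ticker}-${cycleId}`;
}
```

The key deliberately excludes the observed price, so a retry after a small price move still maps to the same logical submission.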

Drawdown-guard agent

Continuously evaluate portfolio risk; reduce or halt on drawdown thresholds

  1. Pull positions + balance + recent fills via the agentic CLI on a 5-minute cadence
  2. Compute peak-to-current drawdown across the full book and per-strategy
  3. If drawdown exceeds the operator-set threshold, the LLM authors a reduction plan (which positions, in what order)
  4. Execute the reduction as a sequence of sell intents with conservative limit prices
  5. Halt new buy intents until drawdown recovers to a separate operator-set threshold
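The drawdown computation in step 2 is a single pass over the equity curve. A sketch, assuming equity is sampled as a simple array of positive portfolio values:

```typescript
// Peak-to-current drawdown over an equity curve (step 2 of the
// drawdown-guard template). Returns the worst fraction lost from any
// running peak, in [0, 1].
function drawdown(equity: number[]): number {
  let peak = -Infinity;
  let worst = 0;
  for (const v of equity) {
    peak = Math.max(peak, v);
    worst = Math.max(worst, (peak - v) / peak);
  }
  return worst;
}
```

Running the same function per-strategy as well as on the full book, as step 2 suggests, is just a matter of slicing the fills before building each equity series.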

Daily-research agent

Author a daily research note + trade-idea list for the operator to review

  1. At a fixed local hour, pull the world snapshot + delta-since-yesterday + top movers
  2. Cross-reference with /query-gov + /query-econ for relevant policy and macro context
  3. The LLM produces a short markdown note with named theses, suggested trade ideas, and confidence levels
  4. Optionally render trade ideas as dry-run intent records the operator can flip live
  5. Email or post the note; the agent does not act until the operator approves
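The delta-since-yesterday and top-movers inputs in step 1 reduce to a diff over two snapshots. A sketch, assuming snapshots are hypothetical ticker-to-probability maps (the real /world payload is richer):

```typescript
// Top movers between two daily snapshots (step 1 of the daily-research
// template). Ranks shared tickers by absolute probability change.
function topMovers(
  yesterday: Record<string, number>,
  today: Record<string, number>,
  n: number,
): Array<{ ticker: string; delta: number }> {
  return Object.keys(today)
    .filter((t) => t in yesterday)            // skip newly listed markets
    .map((t) => ({ ticker: t, delta: today[t] - yesterday[t] }))
    .sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta))
    .slice(0, n);
}
```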

Hedge-finder agent

Map a real-portfolio exposure to a basket of binary contracts with cost / coverage tradeoffs

  1. Read the portfolio exposure as input (position file, NAV report, or simple JSON)
  2. Search Kalshi + Polymarket for contracts that map to the named risk dimensions
  3. The LLM proposes a basket — leg sizes, total cost, and which exposure each leg covers
  4. Produce a one-page hedge proposal with the mapping table and dry-run intent records
  5. Operator reviews; on approval, the basket is submitted as a sequence of intents with shared idempotency prefix
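The cost / coverage arithmetic in step 3 is simple for binary contracts: each contract costs its price (in dollars between 0 and 1) and pays $1 if it resolves YES. A sketch with an illustrative `Leg` shape:

```typescript
// Basket arithmetic for the hedge-finder template (step 3). `covers`
// names the portfolio exposure each leg is meant to offset.
type Leg = { ticker: string; price: number; contracts: number; covers: string };

// Upfront premium: sum of price x contracts across legs.
function basketCost(legs: Leg[]): number {
  return legs.reduce((sum, l) => sum + l.price * l.contracts, 0);
}

// Maximum payout if every leg resolves in the hedge's favor ($1/contract).
function basketMaxPayout(legs: Leg[]): number {
  return legs.reduce((sum, l) => sum + l.contracts, 0);
}
```

The one-page proposal in step 4 is then a table of legs plus these two numbers: what the hedge costs versus what it pays out in the adverse scenario.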

Self-host vs hosted Portfolio Autopilot

Same architecture, two operating postures. Self-host when every layer needs to live in the operator's code. Hosted when the agent loop should run as a service. Both share the same intent + risk-gate stack underneath.

|  | Self-host (this template) | Hosted (/portfolio-autopilot) |
|---|---|---|
| Where it runs | Operator's machine, container, or cloud — wherever the agent loop fits | SimpleFunctions hosted runtime; LLM calls and orchestration handled by the platform |
| Model freedom | Bring any model — Claude, GPT, Gemini, open-weight, fine-tunes — operator owns the prompt | Curated model + prompt; updated as part of the platform |
| Risk policy authoring | Operator authors and version-controls every gate; full visibility | Default risk gates + per-fund overrides via configuration |
| Ops burden | Operator handles uptime, log rotation, alerting, model billing | Platform handles uptime + observability; operator gets reports |
| Cost shape | Direct LLM costs + ops time; no platform usage fee for the agent itself (just the API tier) | Subscription + usage; explicit budget via /portfolio-autopilot configuration |
| Audit | Local audit file + platform-side intent log; both available for compliance review | Platform audit log is the system of record; local export on demand |
| Best for | Quants and developers who want every layer in their own code | Operators who want the agent loop as a service — see /portfolio-autopilot |

Calibration loop — how the agent learns

Agents that act on probability must close the feedback loop on their own calibration. Five-stage chain — view, outcome, score, update, audit — designed so every step is queryable.

1. View: The agent emits a calibrated view (probability + confidence) every time it acts. Views are stored alongside the intent record.
2. Outcome: Every prediction market contract resolves to a binary outcome. The platform records the resolution against each linked view.
3. Brier feedback: A Brier-style score (or its multi-class generalization) is computed for the agent's views over a rolling window — by topic, by horizon, by venue.
4. Update: The next cycle's prompt sees the agent's recent calibration. Persistently overconfident topics get downweighted; persistently well-calibrated topics get more aggressive sizing.
5. Audit: The full chain — view → outcome → Brier — is queryable per agent, per topic, per period. The calibration loop is auditable, not implicit.
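The Brier score in the feedback stage is the mean squared difference between forecast probability and resolved outcome. A minimal sketch, assuming a view is reduced to a `(probability, outcome)` pair once its contract resolves:

```typescript
// Brier feedback over a window of resolved views. outcome: 1 = YES, 0 = NO.
// Lower is better: 0 is perfect, 0.25 is a constant uninformative 0.5
// forecast, and scores above 0.25 mean the agent is worse than a coin flip.
type ScoredView = { p: number; outcome: 0 | 1 };

function brier(views: ScoredView[]): number {
  if (views.length === 0) return NaN;
  const sum = views.reduce((s, v) => s + (v.p - v.outcome) ** 2, 0);
  return sum / views.length;
}
```

Grouping views by topic, horizon, or venue before calling this is what produces the per-dimension calibration the update stage feeds back into the next prompt.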

Methodology + working notes at /papers.

Read next from the library

Matched from SimpleFunctions blog, opinions, technical guides, concepts, and learn pages.

Technical · architecture

Automated Prediction Market Trading: Architecture and Cost Breakdown

Detailed cost breakdown for automated prediction market trading. LLM evaluation costs, Tavily API spend, Kalshi fees, and total monthly cost per thesis. Compare DIY agent vs SimpleFunctions.

Blog · tech

How to Build a Prediction Market Trading Bot in 2026

Technical guide for TypeScript developers building prediction market trading bots. Thesis-driven architecture, Kelly criterion sizing, SimpleFunctions CLI/MCP/REST integration, and cost breakdown.

Technical guide

Build a Prediction Market Agent with LangChain + SimpleFunctions

Step-by-step guide to building an autonomous prediction market trading agent using LangChain and SimpleFunctions. Python code examples for Kalshi and Polymarket.

Technical guide

Setting Up Your First Prediction Market Agent with SimpleFunctions

Step-by-step guide to setting up a prediction market trading agent with SimpleFunctions CLI. From sf setup to your first scan, thesis, and edge detection on Kalshi and Polymarket.

Technical guide

Piping prediction market signals into your existing trading system

Three integration patterns for piping Kalshi and Polymarket data into existing trading infrastructure: cron polling, agent middleware, and thesis-as-filter.

Blog · tech

How to Build an OpenClaw Prediction Market Bot with SimpleFunctions

Technical tutorial for building an OpenClaw prediction market trading bot with SimpleFunctions. Three tool endpoints, structured JSON responses, thesis-driven strategies.

FAQ

What is a prediction market agent?

An autonomous program that reads the prediction market world (Kalshi + Polymarket), reasons about it with an LLM or other model, and optionally acts through normalized execution intents. The agent runs in a loop — gather → reason → decide → execute → reconcile — usually until an operator-defined exit condition or schedule.

How is this different from /portfolio-autopilot?

This page is the reference architecture for a self-hostable agent — the template you copy and adapt. /portfolio-autopilot is the hosted variant: the same architecture run as a service, with curated model and prompts, default risk gates, and platform-side audit. Same intent + risk-gate stack underneath; different operating posture.

Can I bring my own model?

Yes. BYOM is first-class. The reference architecture treats the model layer as opaque — the agent calls a function that takes context and returns a view. Implementations exist for Claude, GPT, Gemini, open-weight (Llama, Qwen, Mistral), fine-tunes, and ensembles. The operator owns the prompt, the temperature, and the routing.

How does the agent learn?

Through the calibration loop. Every emitted view is paired with the eventual binary outcome; a Brier-style feedback score is computed over rolling windows by topic, horizon, and venue. The next cycle's prompt receives the agent's recent calibration as context, so persistently overconfident areas get downweighted. Learning is auditable — the chain from view to outcome to score is queryable.

Where does the agent run?

Anywhere. Local machine, container, cron job, Lambda, Cloudflare Worker, Trigger.dev task, GitHub Action — the reference architecture is transport-agnostic. The platform is reachable via HTTPS REST + WebSocket; the agentic CLI runs anywhere Node 18+ runs. Self-host wherever the operator's ops posture lives.

Self-host or hosted?

Self-host when the operator wants every layer in their own code — model choice, prompt versioning, risk policy authoring, log retention. Hosted (/portfolio-autopilot) when the operator wants the agent loop as a service and is happy to configure rather than author. Many desks start self-hosted to learn the surface, then migrate hosted strategies they want supervised.

What tools does the agent use?

The same tools published on /ai-agents and exposed via /api/tools — world snapshot, probability, search, screen, indicators, gov + econ overlay, intent submit, intent watch, reconciliation. The exact catalog is published live; this page deliberately does not hardcode a count.

How is the agent evaluated?

Three layers. (1) Calibration: rolling Brier-style score on emitted views, by topic and horizon. (2) P&L: standard portfolio-level — realized + unrealized, against operator-set benchmarks. (3) Behavior: did the agent respect risk gates, did it dry-run before live, did it use idempotency keys. The /calibration surface and the agent's local audit log together cover all three.

How is the agent audited?

Two-sided audit. Locally, the agent writes a structured log of every prompt, response, and resulting intent. Server-side, the platform writes an immutable record of every intent, risk-gate evaluation, and venue submission. Both sides can be exported for compliance review; the platform side is the system of record for capital movement.

How does safety work?

Risk gates run before every intent — size cap, exposure cap, drawdown ceiling, regime gate, daily-loss cap, dry-run toggle. Operators author and version-control these in code; the platform evaluates them at execution time and rejects intents that fail. Idempotency keys make replays safe. Dry-run validates the full pipeline without capital at risk.
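One way to picture the gates named above is as a list of named predicates evaluated against every proposed intent. This is a hypothetical sketch, not the platform's gate API: the intent fields and gate names are illustrative, and the real evaluation happens server-side at execution time.

```typescript
// Risk gates as composable predicates. The intent shape is illustrative.
type ProposedIntent = { size: number; exposureAfter: number; dailyLoss: number };
type Gate = { name: string; ok: (i: ProposedIntent) => boolean };

function gates(cfg: { maxSize: number; maxExposure: number; dailyLossCap: number }): Gate[] {
  return [
    { name: "size-cap",       ok: (i) => i.size <= cfg.maxSize },
    { name: "exposure-cap",   ok: (i) => i.exposureAfter <= cfg.maxExposure },
    { name: "daily-loss-cap", ok: (i) => i.dailyLoss < cfg.dailyLossCap },
  ];
}

// Returns the names of failed gates; an empty list means the intent may proceed.
function evaluate(i: ProposedIntent, gs: Gate[]): string[] {
  return gs.filter((g) => !g.ok(i)).map((g) => g.name);
}
```

Authoring gates as plain code like this is what makes them version-controllable in the self-host posture; the platform then re-evaluates its own copy before any capital moves.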

What if the LLM hallucinates a market?

The agent should not act on a market it has not verified against /api/public/scan or /api/public/market/:ticker. The reference architecture treats the model output as a proposed market id: the agent looks it up and rejects the action if the id does not resolve. The intent submit endpoint also rejects unknown tickers — defense in depth.
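The verify-then-act pattern can be sketched with the lookup injected as a function, so the same logic works against the live /api/public/market/:ticker endpoint or a test stub. The `Lookup` signature and the "FED-DEC" ticker below are illustrative assumptions:

```typescript
// Defense-in-depth for hallucinated markets: the model only *proposes* a
// ticker; the agent resolves it before any intent is built. `lookup` stands
// in for a GET against /api/public/market/:ticker (null = not found).
type Lookup = (ticker: string) => Promise<{ ticker: string } | null>;

async function verifyProposedMarket(ticker: string, lookup: Lookup): Promise<boolean> {
  const market = await lookup(ticker);
  // Reject both unknown ids and lookups that resolve to a different ticker.
  return market !== null && market.ticker === ticker;
}
```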

Related surfaces