Core Concepts

Name: SimpleFunctions
Author: SimpleFunctions

The mental model behind SimpleFunctions.

Causal Tree

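Your thesis is decomposed into a tree of verifiable assumptions. Each node has a probability (0-1) and an importance weight. The overall confidence is the weighted sum of the top-level node probabilities; in the example below, 0.70×0.30 + 0.65×0.25 + 0.75×0.25 + 0.80×0.20 = 0.72.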

Thesis: "Oil stays above $100 for 6 months"
├── n1: OPEC maintains production cuts (0.70, weight 0.30)
│   ├── n1.1: Saudi compliance remains high (0.80)
│   └── n1.2: Russia doesn't break quota (0.60)
├── n2: Demand stays strong (0.65, weight 0.25)
├── n3: Geopolitical risk premium persists (0.75, weight 0.25)
└── n4: No US SPR release (0.80, weight 0.20)
Confidence: 72%

Nodes can be mutated directly with sf whatif --set "n1=0.3" for instant scenario analysis (zero LLM cost). The tree grows over time via weekly augmentation.
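The 72% in the example can be reproduced by weighting each top-level probability by its importance and summing, and a what-if is just the same sum with one probability overridden. The sketch below is illustrative only: the `confidence` helper is hypothetical, and it ignores the child nodes (n1.1, n1.2) because the text does not say how children roll up into their parent.

```python
# Top-level nodes from the example tree: id -> (probability, importance weight).
nodes = {
    "n1": (0.70, 0.30),
    "n2": (0.65, 0.25),
    "n3": (0.75, 0.25),
    "n4": (0.80, 0.20),
}

def confidence(tree, overrides=None):
    """Weighted average of node probabilities, with optional what-if overrides."""
    overrides = overrides or {}
    total_weight = sum(weight for _, weight in tree.values())
    weighted = sum(overrides.get(node_id, prob) * weight
                   for node_id, (prob, weight) in tree.items())
    return weighted / total_weight

base = confidence(nodes)                  # matches the 72% in the example
shocked = confidence(nodes, {"n1": 0.3})  # same idea as: sf whatif --set "n1=0.3"
print(f"{base:.2f} {shocked:.2f}")        # prints "0.72 0.60"
```

Because the override is plain arithmetic, a scenario like this needs no LLM call, which is consistent with the "zero LLM cost" note above.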

Edges

An edge is the difference between what the market prices and what your causal model implies.

Market price: 34c — what traders think

Thesis price: 55c — what your causal model implies

Edge: +21c

Executable edge: +18c — after half the spread
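As a rough sketch of the arithmetic above: the edge is the thesis price minus the market price, and the executable edge subtracts half the bid-ask spread. The 6c spread used here is inferred from the 21c to 18c difference in the example and is not stated in the text; the function names are hypothetical, and the sketch only covers a positive (buy-side) edge.

```python
def edge_cents(market_price, thesis_price):
    """Raw edge: what the causal model implies minus what the market prices."""
    return thesis_price - market_price

def executable_edge_cents(market_price, thesis_price, spread):
    """Raw edge minus half the spread, i.e. the cost of crossing from mid."""
    return edge_cents(market_price, thesis_price) - spread / 2

raw = edge_cents(34, 55)                        # 21
net = executable_edge_cents(34, 55, spread=6)   # 18.0
print(raw, net)
```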

The system classifies each edge by why the mispricing exists:

consensus_gap — market and thesis disagree on fundamental probability

attention_gap — market hasn't reacted to recent information

timing_gap — market prices short-term risk, thesis prices long-term outcome

risk_premium — market embeds fear/greed premium that thesis doesn't

Indicator Framework

The indicator framework is the pricing layer between the raw price scan and the LLM thesis edges. Indicators are cheap math labels: pure functions over the latest price snapshot, with no LLM round-trip required for the screening pass itself. Use them to bulk-discover candidates before paying any LLM cost, then hand the survivors to the causal-tree evaluator.

Three-layer architecture:

scan-prices cron     →   indicator screen       →   thesis evaluator
(50K row snapshot)       (pure compute, ~50ms)      (LLM, $$$)
   raw prices            IY/CRI/EE/LAS/OR/τ         causal tree + edges
   price history         RV/VR/IAR (from 48h hist)  full narrative
   event calendar        Adj IY / Residual VR        null-as-signal selectors

Eleven indicators (Tier A–E):

Indicator	Formula	What it catches
IY implied yield	(1/p − 1) × (365/τ)	Long-tail annualized yield. Try iy_min=200 for the unloved tail.
CRI cliff risk	max(p,1−p) / min(p,1−p)	1 = balanced, ∞ = cliff. High CRI → asymmetric payoff, fragile to small news.
EE expected edge	thesisPrice − marketPrice	Expected mispricing in cents. Requires a thesis or regime row attached.
LAS liquidity-adjusted spread	(ask − bid) / mid	Frictional cost. Try las_max=0.05 to drop wide-spread illiquid traps.
OR overround	Σ ask_i − 1	Sum of YES asks across mutually-exclusive event legs. 0.05 = 105¢ field — book-maker margin or live arb.
τ time to expiry	closeTime − now	Days to settlement. Drives IY denominator and Kelly horizon sizing.
RV realized volatility	σ(Δp/p) × √(obs/yr)	Annualized stddev of returns from 48h price history. How much the market is actually moving.
VR vol ratio	RV / √(p(1−p) × 365/τ)	Fraction of theoretical max vol consumed. >0.8 very active; <0.1 dead market or consensus.
IAR info arrival rate	count(|Δ| ≥ 1c) / hours	Meaningful price changes per hour. Direct proxy for information flow rate.
Adj IY risk-adjusted yield	IY × min(1,VR/0.3) × (1-LAS)	IY penalized for dead markets (low VR) and high friction (high spread). Eliminates false positives.
Residual VR unexplained volatility	VR - exp((14-d)/7)	VR minus expected VR from scheduled catalysts (FOMC, CPI, GDP, NFP, PCE). Positive = market knows something the public calendar doesn't.
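Several of the formulas in the table translate directly into pure functions. The sketch below is one possible reading, not the actual implementation: function names are hypothetical, prices are assumed to be probabilities in (0, 1), τ is assumed to be in days, and the VR denominator is read as √(p(1−p) × 365/τ). RV, IAR and Residual VR are left out because they depend on price history and calendar inputs the table does not fully specify.

```python
import math

def implied_yield(p, tau_days):
    """IY = (1/p - 1) * (365/tau)."""
    return (1 / p - 1) * (365 / tau_days)

def cliff_risk(p):
    """CRI = max(p, 1-p) / min(p, 1-p); 1 is balanced, infinity is a cliff."""
    low = min(p, 1 - p)
    return math.inf if low == 0 else max(p, 1 - p) / low

def liquidity_adjusted_spread(bid, ask):
    """LAS = (ask - bid) / mid."""
    mid = (bid + ask) / 2
    return (ask - bid) / mid

def overround(yes_asks):
    """OR = sum of YES asks across mutually exclusive legs, minus 1."""
    return sum(yes_asks) - 1

def vol_ratio(rv, p, tau_days):
    """VR = RV / sqrt(p(1-p) * 365/tau); rv is assumed to be annualized already."""
    return rv / math.sqrt(p * (1 - p) * 365 / tau_days)

def adjusted_iy(iy, vr, las):
    """Adj IY = IY * min(1, VR/0.3) * (1 - LAS)."""
    return iy * min(1, vr / 0.3) * (1 - las)

iy = implied_yield(0.5, 365)                    # 1.0
las = liquidity_adjusted_spread(0.48, 0.52)     # about 0.08
print(cliff_risk(0.5), overround([0.40, 0.35, 0.30]), adjusted_iy(iy, 0.15, las))
```

Keeping each indicator a pure function of the snapshot row is what makes the screening pass cheap enough to run over the whole universe before any LLM call.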

Null is signal

The screen treats missing data as a positive selector, not as noise to filter. Two flags:

no_thesis=true — markets without any active thesis (the unloved long tail; strategy 2/3 entry condition)
no_orderbook=true — markets without recent orderbook attention (no maker has quoted, edge is one-sided)

The reverse flags has_thesis / has_orderbook are also positive selectors when you want only the covered universe.

Three CLI recipes

# Long-tail yield: short-dated, high IY, no thesis covering it
sf screen --iy-min 200 --tau-max 7 --without-thesis

# Arb detection: event-leg overround above 5%
sf screen --or-min 0.05

# Unloved Polymarket: no thesis, no orderbook attention
sf screen --without-thesis --without-orderbook --venue polymarket

Same filters available via GET /api/public/screen for HTTP clients, and via the screen_markets tool for MCP / OpenAI function-calling / sf agent runtimes.
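For HTTP clients, a screen request might be assembled as below. This is a sketch under stated assumptions: the host is a placeholder, and it assumes the filter names mentioned above (`iy_min`, `no_thesis`) are passed as query parameters, which this page does not confirm. It only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.invalid"  # placeholder host, not the real one

def screen_url(**filters):
    """Build a GET /api/public/screen URL; booleans become 'true' / 'false'."""
    params = {key: str(value).lower() if isinstance(value, bool) else value
              for key, value in filters.items()}
    return f"{BASE_URL}/api/public/screen?{urlencode(params)}"

url = screen_url(iy_min=200, no_thesis=True)
print(url)
```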

Regime — Adverse Selection

The regime score answers one question: if I post a quote in this market, what's the probability the next person to hit me knows something I don't? It is the adverse-selection prior — a structural property of the market, not a prediction about the current price.

Score range: 0 → 1

Label: maker < 0.3 · neutral 0.3–0.6 · taker > 0.6

Use: find markets safe for making (low score) or ripe for taking when you have edge (high score).

Weighted score (dynamic — missing inputs redistribute):

score = 0.30·micro + 0.25·calendar + 0.25·prior + 0.10·crossVenue + 0.10·edge

prior         — static LLM-classified asPrior [0.05, 0.80] per market
micro         — spread pct + depth change + volume z + flow (top-N overlay)
calendar      — exp(−hoursToCatalyst/24) × typeMultiplier
crossVenue    — abs(kalshi−polymarket) gap in cents
edge          — abs(sfEdgeCents) from thesis or screen
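One way to picture the "missing inputs redistribute" behavior is to renormalize the weights over whichever components are present. The sketch below assumes proportional redistribution and assumes every component has already been normalized to [0, 1]; neither detail is specified above, and the function names are hypothetical.

```python
import math

WEIGHTS = {"micro": 0.30, "calendar": 0.25, "prior": 0.25,
           "crossVenue": 0.10, "edge": 0.10}

def calendar_component(hours_to_catalyst, type_multiplier=1.0):
    """calendar = exp(-hoursToCatalyst / 24) * typeMultiplier."""
    return math.exp(-hours_to_catalyst / 24) * type_multiplier

def regime_score(components):
    """Weighted score over available components; None means 'input missing'.

    Assumption: the weight of missing inputs is spread proportionally
    across the inputs that are present.
    """
    present = {name: value for name, value in components.items()
               if name in WEIGHTS and value is not None}
    if not present:
        return None
    total_weight = sum(WEIGHTS[name] for name in present)
    return sum(WEIGHTS[name] * value for name, value in present.items()) / total_weight

def regime_label(score):
    """maker < 0.3, neutral 0.3-0.6, taker > 0.6."""
    if score < 0.3:
        return "maker"
    if score > 0.6:
        return "taker"
    return "neutral"

prior_only = regime_score({"prior": 0.80})                       # 0.8
mixed = regime_score({"prior": 0.15, "calendar": 0.5, "micro": None})
print(regime_label(prior_only), regime_label(mixed))
```

Under this reading, a market with only a prior scores exactly its asPrior.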

The prior is the workhorse

The dominant signal for most markets is the static asPrior — an LLM classifies each market once based on observability of the outcome, then caches the result forever. Calibration:

asPrior	Type	Examples
0.05	Truly unknowable	Banknote animal design, papal nationality
0.15	Noisy long-horizon	Far-future BTC, multi-year GDP
0.30	Expertise helps	Policy outcomes, geopolitics
0.50	Insider risk	Near-term policy, pre-announcement leaks
0.65	Significant asymmetry	Economic data releases, earnings
0.80	High informed flow	Day-of weather, live sports

Micro overlay on the top slice

For the rows returned by a scan (at most limit, capped at 200), the handler enriches the score with live microstructure signals from market_regime_snapshots: spread percentile, depth change, volume z-score, flow imbalance. Rows that got the overlay are tagged source: "classifier+micro"; rows that didn't stay at "classifier".

The universe pass uses the prior only — cheap and covers 100% of the ~50K market universe. The overlay uses the cache only for the sliced result set, so the cost is bounded at ≤200 row lookups per scan.

Endpoints

GET /api/public/regime/scan — scan the universe, filter by label / score range / venue / event_type, sort by score / edge / spread / volume.
GET /api/public/market-microstructure-history?ticker=&days=7 — spread + depth time series for one ticker from orderbook_snapshots. Replaces the old /regime/history endpoint, which now returns 410 Gone because the score itself is a flat line when the prior is static.

Coverage of the prior grows organically: a backfill classifier covers the top-5K active markets by volume, and the scan-regime cron (6h cadence) picks up new markets as they enter the universe. Unclassified markets return source: "neutral-default" — the scorer's no-data branch, score ≈ 0.35.

Signals

Events that feed into evaluations. Five types:

Type	Source	Description
news	Heartbeat / manual	News articles, data releases
price_move	Heartbeat	Market price change ≥ 3 cents
user_note	Manual	Your analysis or observations
external	Manual	Signals from other systems
upcoming_event	Heartbeat	Kalshi milestone matching edges

Kill Conditions

Before every evaluation, the system asks: "Does any event fundamentally break a core assumption of this thesis?" If yes, it flags the threat prominently before any other analysis. News scans include adversarial queries that actively seek contradictory evidence. The system tries to kill your thesis before you trade on it.

Track Record

Feedback loop that computes how well past edges predicted market movement:

Hit rate: % of edges where market moved toward the thesis-implied price
Average movement: mean price change in cents since edge detection
Track record is injected into evaluation prompts so the system learns from its accuracy
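The two metrics above could be computed along these lines. This is a sketch with hypothetical field names; it assumes a "hit" means the price moved in the direction of the thesis-implied price since detection, and that average movement is the plain signed mean of the price change in cents.

```python
def track_record(edges):
    """Return (hit_rate, average_move_cents) for a list of past edges.

    Each edge is a dict with prices in cents:
      entry   - market price when the edge was detected
      thesis  - thesis-implied price at detection
      current - latest market price
    """
    if not edges:
        return None, None
    hits = 0
    moves = []
    for e in edges:
        move = e["current"] - e["entry"]
        direction = e["thesis"] - e["entry"]
        if move * direction > 0:  # moved toward the thesis-implied price
            hits += 1
        moves.append(move)
    return hits / len(edges), sum(moves) / len(moves)

history = [
    {"entry": 34, "thesis": 55, "current": 40},  # moved up toward thesis: hit
    {"entry": 60, "thesis": 45, "current": 62},  # moved away from thesis: miss
    {"entry": 20, "thesis": 30, "current": 25},  # hit
]
hit_rate, avg_move = track_record(history)
print(f"{hit_rate:.2f} {avg_move:.2f}")
```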

Tree Augmentation

The causal tree evolves over time:

Each evaluation can suggest new causal factors
Weekly, the augment agent reviews suggestions
LLM decides which to accept (must be genuinely new, not duplicates)
Accepted nodes are appended (never removed — append-only tree)
Importance weights are rebalanced among siblings
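The append-and-rebalance step can be sketched as follows. How the weights are actually rebalanced is not described here, so this assumes the new node takes a given weight and the existing siblings are scaled down proportionally so the group still sums to 1. The exact-text duplicate check is a trivial stand-in for the LLM's judgment, and all names are hypothetical.

```python
def append_node(siblings, new_id, label, prob, weight):
    """Append a node to a sibling group and rescale existing weights.

    Returns a new list; existing nodes are never removed (append-only).
    Assumes 0 < weight < 1 and that current weights sum to a positive total.
    """
    normalized = label.strip().lower()
    if any(n["label"].strip().lower() == normalized for n in siblings):
        return list(siblings)  # duplicate suggestion: tree unchanged
    total = sum(n["weight"] for n in siblings)
    scale = 1 - weight
    rebalanced = [{**n, "weight": n["weight"] / total * scale} for n in siblings]
    rebalanced.append({"id": new_id, "label": label, "prob": prob, "weight": weight})
    return rebalanced

tree = [
    {"id": "n1", "label": "OPEC maintains production cuts", "prob": 0.70, "weight": 0.30},
    {"id": "n2", "label": "Demand stays strong", "prob": 0.65, "weight": 0.25},
    {"id": "n3", "label": "Geopolitical risk premium persists", "prob": 0.75, "weight": 0.25},
    {"id": "n4", "label": "No US SPR release", "prob": 0.80, "weight": 0.20},
]
# "Hypothetical new factor" is a made-up label used only for illustration.
grown = append_node(tree, "n5", "Hypothetical new factor", 0.60, 0.20)
print([round(n["weight"], 2) for n in grown])
```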

Intent Lifecycle

Intents are declarative execution instructions. Instead of "buy now," you say "buy when price drops below 40c, but only if oil is above $95."

pending → armed → triggered → filled
                      ↓
              [soft condition?]
                ↓           ↓
              PASS         HOLD
                ↓           ↓
             execute       wait

Trigger types: immediate, price_below, price_above, time. Soft conditions are natural language, evaluated by LLM in --smart mode.
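The lifecycle can be read as a small state machine. The sketch below is an interpretation rather than the actual runtime: it assumes a HOLD leaves the intent in the triggered state to be re-checked later, treats execution and fill as a single step, uses strict comparisons for the price triggers, and stands in for the LLM soft-condition check with a plain boolean. All names are hypothetical.

```python
from enum import Enum

class IntentState(Enum):
    PENDING = "pending"
    ARMED = "armed"
    TRIGGERED = "triggered"
    FILLED = "filled"

def trigger_fired(trigger_type, value=None, threshold=None):
    """Evaluate a hard trigger. value is the current price or current time."""
    if trigger_type == "immediate":
        return True
    if trigger_type == "price_below":
        return value < threshold
    if trigger_type == "price_above":
        return value > threshold
    if trigger_type == "time":
        return value >= threshold
    raise ValueError(f"unknown trigger type: {trigger_type}")

def advance(state, fired=False, soft_ok=True):
    """Advance one step. soft_ok=True also covers 'no soft condition set'."""
    if state is IntentState.PENDING:
        return IntentState.ARMED
    if state is IntentState.ARMED:
        return IntentState.TRIGGERED if fired else IntentState.ARMED
    if state is IntentState.TRIGGERED:
        # PASS -> execute; HOLD -> wait in place.
        return IntentState.FILLED if soft_ok else IntentState.TRIGGERED
    return state  # FILLED is terminal

# "Buy when price drops below 40c, but only if oil is above $95."
state = advance(IntentState.PENDING)
state = advance(state, fired=trigger_fired("price_below", value=38, threshold=40))
held = advance(state, soft_ok=False)   # soft condition says HOLD
done = advance(state, soft_ok=True)    # soft condition says PASS
print(state.value, held.value, done.value)
```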