The Three Sources, Defined
Reality data is the authoritative record of what is happening in the actual world: BLS jobs reports, Kalshi resolution APIs, weather sensor readings, court filings, election commission counts. This data is slow and expensive to gather, but it is correct by definition: it is the ground truth that the prediction market eventually has to settle against. A thesis that ignores reality data is a thesis built on price moves and rumor.
Endogenous data is what the prediction market itself shows: prices, volumes, orderbook depth, position flows, and the indicator stack (IY, CRI, EE, LAS, PIV, CVR). This data is fast and cheap, but it is recursive: the market reacts to its own price, so endogenous signals contain a feedback loop with no external referent. Endogenous data is the most accessible source and the most seductive trap, because it can produce a coherent-looking thesis with zero external validation.
Opinion data is what other humans are saying about the event: X posts, news articles, expert commentary, Discord channels, Substack newsletters, polling aggregators. This data is noisy and fast. It sometimes leads the market (when an expert source has private information) and sometimes lags it (when commentary is reacting to a price move that has already happened).
Each source has a failure mode: reality data is slow, endogenous data is recursive, opinion data is noisy. A thesis that triangulates all three uses each source to check the failure modes of the other two.
Why Endogenous-Only Is the Common Trap
A trader who watches only the price chart sees movement and infers cause. "The contract moved from 32 to 38, so the underlying thesis has improved." This is sometimes true and sometimes a hallucination. The price might have moved because:
- A real piece of reality data arrived (correct inference)
- The price of a different contract moved and the contagion propagated (no new information)
- A single large taker hit the book without any underlying view (no new information)
- A market-making bot widened its quotes (no new information)
The endogenous-only trader cannot distinguish these four cases. The trader who also pulls reality data and opinion data can: if no real news has hit and no expert is talking about it, the price move is probably not a reassessment.
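As a rough illustration of that check, the sketch below labels a price move using only two yes/no inputs from the other sources. The function name, the inputs, and the label strings are all hypothetical, chosen for this example; the middle branch (commentary without news) is an assumption about how one might treat the ambiguous case, not a rule stated above.

```python
def classify_move(price_change: float,
                  reality_update: bool,
                  expert_commentary: bool) -> str:
    """Label a price move using the two non-endogenous sources as checks.

    All names and labels here are illustrative assumptions.
    """
    if price_change == 0:
        return "no move"
    if reality_update:
        # New ground-truth data arrived, so the move may be a real reassessment.
        return "plausibly a reassessment"
    if expert_commentary:
        # Commentary exists, but it may be leading or lagging the price.
        return "unconfirmed: check whether commentary preceded the move"
    # No real news and no expert discussion.
    return "probably not a reassessment"


# The 32 -> 38 move from the example, with no news and no commentary.
print(classify_move(38 - 32, reality_update=False, expert_commentary=False))
# prints "probably not a reassessment"
```

The point of the sketch is only that the price change alone never decides the label; the other two inputs do.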
The Triangulation Discipline
The discipline is to require at least two of the three sources to point in the same direction before acting on a thesis. A signal from endogenous data alone is a question, not an answer. The question is "is this price move backed by reality or opinion?" — and the answer comes from the other two sources.
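A minimal sketch of the two-of-three rule follows. It assumes each source has already been reduced to a direction (+1, -1, or 0 for no signal); that encoding, the `Signals` class, and the `triangulate` function are hypothetical names introduced for illustration, and how one derives a direction from each source is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signals:
    # Assumed encoding: +1 supports the thesis, -1 opposes it, 0 = no signal.
    reality: int
    endogenous: int
    opinion: int


def triangulate(s: Signals) -> Optional[int]:
    """Return +1 or -1 if at least two sources agree on a direction, else None."""
    votes = [s.reality, s.endogenous, s.opinion]
    up = sum(1 for v in votes if v > 0)
    down = sum(1 for v in votes if v < 0)
    if up >= 2:
        return 1
    if down >= 2:
        return -1
    # A lone signal, or one-against-one, is a question, not an answer.
    return None


# Endogenous signal alone: not actionable.
print(triangulate(Signals(reality=0, endogenous=1, opinion=0)))  # prints None
# Endogenous move backed by opinion data: actionable.
print(triangulate(Signals(reality=0, endogenous=1, opinion=1)))  # prints 1
```

Returning `None` rather than a weak signal keeps the "question, not an answer" case explicit: the caller has to go and consult the other sources before acting.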
This is also why a prediction-market trader needs to read news flow, even though news is noisy. News is the opinion-data input, and it is the cheaper of the two external sources to monitor at scale. Reality data is hard to gather at high frequency, and endogenous data arrives automatically but cannot validate itself. Opinion data is the swing source, and skipping it means you are working with two sources instead of three.