Will Grok be the first to hit 1550 on Text Arena?
Liquidity-weighted aggregate sits at 9% across 5 Kalshi contracts.
Implied probability
Kalshi: 9% (5 contracts)
Polymarket: not bound
Cross-venue gap: not available (single venue)
24h move: not available (no pin)
24h volume: $43 (5 contracts)
Closes: Jan 1, 2027 (187 days)
30-day trend
Bracket families
5 clusters across 5 contracts.
These contracts were grouped by title similarity. The headline aggregate combines all clusters; verify the cluster you actually need before quoting a number.
Cluster 1: Will Gemini be the first to hit 1550 on Text Arena?
Outcome: Gemini
Ticker: KXMODELHIGH-27-1550-GEMI
Cluster 2: Will ChatGPT be the first to hit 1550 on Text Arena?
Outcome: ChatGPT
Ticker: KXMODELHIGH-27-1550-CHAT
Cluster 3: Will Claude be the first to hit 1550 on Text Arena?
Outcome: Claude
Ticker: KXMODELHIGH-27-1550-CLAU
Cluster 4: Will Grok be the first to hit 1550 on Text Arena?
Outcome: Grok
Ticker: KXMODELHIGH-27-1550-GROK
Cluster 5: Will Other be the first to hit 1550 on Text Arena?
Outcome: Other
Ticker: KXMODELHIGH-27-1550-OTHE
Analysis
The 17% price on the Grok contract reflects market expectations that Grok will be the first AI system to reach a score of 1550 on Text Arena in 2026. This is the single-contract price, not the 9% headline, which combines all five clusters. The low odds suggest skepticism about Grok's near-term performance relative to competitors such as Claude and Google's models. Market participants appear to weight Claude's current capabilities more heavily, as shown by its 36% probability on a similar Kalshi contract. The outcome depends primarily on how quickly each AI company improves its models and raises its benchmark scores this year. The market resolves when any AI system first reaches the 1550 threshold on Text Arena; scores are likely to approach it gradually as models are updated and new versions are released throughout 2026. The timing of major model releases and benchmark updates from xAI, Anthropic, Google, and OpenAI will be a critical determinant.
- Grok's recent benchmark performance relative to Claude, Google Gemini, and OpenAI's latest models on similar evaluation metrics
- The frequency and magnitude of model updates from xAI versus competitor releases planned for 2026
- Current market participants are pricing Claude as 2.1x more likely to hit 1550 first, suggesting confidence in Anthropic's development trajectory
- The Text Arena benchmark's difficulty rating and whether intermediate scores suggest any model is approaching the 1550 threshold
- Trading volume and contract spreads suggest moderate uncertainty, with 51¢ on "None in 2026" indicating meaningful probability that no system reaches 1550 this year
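The "2.1x" figure in the list above can be reproduced from the two prices quoted in the analysis (Claude at 36%, Grok at 17%). The snippet below is a minimal sketch of that arithmetic; the function name `relative_likelihood` is illustrative and not part of any venue or SimpleFunctions API.

```python
def relative_likelihood(p_a: float, p_b: float) -> float:
    """Return how many times more likely outcome A is than outcome B.

    Both inputs are implied probabilities in [0, 1]; p_b must be positive.
    """
    if not (0.0 <= p_a <= 1.0) or not (0.0 < p_b <= 1.0):
        raise ValueError("probabilities must lie in [0, 1] and p_b must be > 0")
    return p_a / p_b


# Prices quoted in the analysis text: Claude 36%, Grok 17%.
claude_p = 0.36
grok_p = 0.17
ratio = relative_likelihood(claude_p, grok_p)
print(f"Claude vs Grok: {ratio:.1f}x")  # Claude vs Grok: 2.1x
```

The ratio is only as current as the two prices fed into it, so it should be recomputed from live contract prices before being quoted.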
What moved the line
- Jun 25 · Claude ↓9pp (26→17¢) · Kalshi
- Jun 26 · ChatGPT ↓3pp (10→7¢) · Kalshi
- Jun 27 · Claude ↑3pp (19→22¢) · Kalshi
Recently closed in general
- What will James Talarico say during 2026 Texas Democratic Convention - Friday General Session · last 87% · 1d
- Will Claire Valdez be victorious in the NY-07 Democratic primary AND Brad Lander be victorious in the NY-10 Democratic primary AND Darializa Avila Chevalier be defeated in the NY-13 Democratic primary for Sep 2026 · last 69% · 3d
- Will Darializa Avila Chevalier be victorious in the NY-13 Democratic primary AND Alex Bores be defeated in the NY-12 Democratic primary for Sep 2026 · last 37% · 4d
- Who will win the 2026 D.C. Democratic Mayoral Primary · last 97% · 9d
- Louisiana Democratic Senate Primary Winner · last 89% · 9d
These markets stopped trading. Last odds and any captured outcome are shown above — full settlement detail lives at the venue.
How we compute these odds
SimpleFunctions aggregates live prediction-market contracts from Kalshi and Polymarket. Each slug groups contracts that resolve on the same underlying event, identified by venue event_id.
For binary slugs, the headline probability is the liquidity-weighted mid-price across all bound contracts. For multi-outcome slugs (e.g. elections with 3+ candidates), the headline is the leader’s price; we never arithmetically average disjoint outcomes — that would produce a number with no real-world meaning.
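The two headline rules described above can be sketched in a few lines. This is a sketch under stated assumptions, not SimpleFunctions' actual implementation: the `Contract` fields, the use of the bid/ask midpoint as the "mid-price", the `liquidity` weight, and the example tickers and numbers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Contract:
    ticker: str       # hypothetical identifier
    bid: float        # best bid in cents (0-100)
    ask: float        # best ask in cents (0-100)
    liquidity: float  # assumed weight, e.g. resting depth in dollars


def mid_price(c: Contract) -> float:
    """Midpoint of best bid and best ask, in cents."""
    return (c.bid + c.ask) / 2.0


def binary_headline(contracts: list[Contract]) -> float | None:
    """Liquidity-weighted mid-price across contracts bound to one binary event."""
    total = sum(c.liquidity for c in contracts)
    if total <= 0:
        return None  # no liquidity, so no meaningful weighted price
    return sum(mid_price(c) * c.liquidity for c in contracts) / total


def multi_outcome_headline(contracts: list[Contract]) -> tuple[str, float] | None:
    """Leader's mid-price; disjoint outcomes are never averaged together."""
    if not contracts:
        return None
    leader = max(contracts, key=mid_price)
    return leader.ticker, mid_price(leader)


# Illustrative numbers only, not real quotes.
a = Contract("EXAMPLE-A", bid=8, ask=10, liquidity=100)   # mid = 9
b = Contract("EXAMPLE-B", bid=18, ask=22, liquidity=300)  # mid = 20
print(binary_headline([a, b]))         # 17.25
print(multi_outcome_headline([a, b]))  # ('EXAMPLE-B', 20.0)
```

Keeping the two rules in separate functions makes the choice explicit at the call site: weighting applies only when every contract resolves on the same yes/no question, while mutually exclusive outcomes report the leader alone.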
Snapshots refresh every 5 minutes during market hours; daily aggregates are computed at 04:00 UTC. The 30-day sparkline is drawn from per-ticker daily means stored in market_indicator_daily; 24h delta and movement events are derived from the same source.
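As a rough illustration of how per-ticker daily means and a 24h delta could be derived from snapshots, consider the sketch below. It assumes snapshots are (ticker, UTC timestamp, price in cents) tuples and that the 24h delta is the difference between consecutive daily means; neither the real schema of market_indicator_daily nor the exact delta definition is given in the text, so both are assumptions, as are the function names and sample data.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import date, datetime, timedelta
from statistics import mean

# Assumed snapshot shape: (ticker, snapshot time in UTC, price in cents).
Snapshot = tuple[str, datetime, float]


def daily_means(snapshots: list[Snapshot]) -> dict[tuple[str, date], float]:
    """Average all snapshot prices per (ticker, calendar day)."""
    buckets: dict[tuple[str, date], list[float]] = defaultdict(list)
    for ticker, ts, price in snapshots:
        buckets[(ticker, ts.date())].append(price)
    return {key: mean(prices) for key, prices in buckets.items()}


def delta_24h(
    means: dict[tuple[str, date], float], ticker: str, day: date
) -> float | None:
    """Change in daily mean versus the previous day, in percentage points.

    Returns None when either day has no data for the ticker.
    """
    today = means.get((ticker, day))
    previous = means.get((ticker, day - timedelta(days=1)))
    if today is None or previous is None:
        return None
    return today - previous


# Illustrative data only.
snaps: list[Snapshot] = [
    ("EXAMPLE-T", datetime(2026, 6, 24, 10, 0), 26.0),
    ("EXAMPLE-T", datetime(2026, 6, 24, 10, 5), 26.0),
    ("EXAMPLE-T", datetime(2026, 6, 25, 10, 0), 18.0),
    ("EXAMPLE-T", datetime(2026, 6, 25, 10, 5), 16.0),
]
means = daily_means(snaps)
print(delta_24h(means, "EXAMPLE-T", date(2026, 6, 25)))  # -9.0
```

Returning None rather than 0 when a day is missing keeps "no data" distinguishable from "no movement".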