SimpleFunctions
OPINIONS/ANALYSIS · 6 min read

Search Attention Is the Last Closed Macro

Roughly 16.4 billion searches a day on Google, none of them visible. Why tech people should care about democratizing public search attention — and what the join actually looks like.

By Patrick Liu · April 26, 2026

Roughly 16.4 billion searches a day on Google. About 1.2 billion on Bing. A few hundred million on DuckDuckGo. None of it is visible to anyone outside the buildings where the queries land.

That is unusual.

Stock prices print every microsecond, on a public tape, for free. Bond yields update every fifteen seconds. Even Federal Reserve rate-path probabilities, once the province of rates desks, get scraped, repriced, and rebroadcast by traders within minutes. Macro flows are a competitive market in their own right; everyone watches.

Search attention is the one macro signal that didn't make the leap. It is, in 2026, the closed exception in an otherwise transparent stack.

What "transparency" actually buys us today

Google Trends RSS exists. It serves about ten queries at a time, with approx_traffic strings like "500000+", a handful of news items per query, and a pubDate that lags the real spike by hours. Across geographies the numbers shrink — ten items per non-US locale on a good day. There is no velocity, no entity disambiguation, no link to anything else the world has ever recorded. It is what a generous PR team might call a "digest." It is not real-time visibility into what sixteen billion daily decisions look like.
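The feed is as easy to read as it is to exhaust. A minimal sketch of parsing it with the standard library, run against a sample payload in the feed's shape — the `ht` namespace URI and field names here are assumptions to be checked against a live fetch, not a documented schema:

```python
import xml.etree.ElementTree as ET

# Sample payload in the shape of the Google Trends RSS feed.
# The namespace URI and field names are illustrative assumptions.
SAMPLE = """<?xml version="1.0"?>
<rss xmlns:ht="https://trends.google.com/trending/rss" version="2.0">
  <channel>
    <item>
      <title>coco gauff</title>
      <ht:approx_traffic>500,000+</ht:approx_traffic>
      <pubDate>Sat, 26 Apr 2026 14:00:00 -0700</pubDate>
    </item>
  </channel>
</rss>"""

NS = {"ht": "https://trends.google.com/trending/rss"}

def parse_trends(xml_text: str) -> list[dict]:
    """Extract (query, coarse traffic bucket, pubDate) from a Trends RSS payload."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        traffic = item.findtext("ht:approx_traffic", default="", namespaces=NS)
        items.append({
            "query": item.findtext("title"),
            # "500,000+" is a bucket, not a count: keep only the integer floor.
            "traffic_floor": int(traffic.rstrip("+").replace(",", "") or 0),
            "pub_date": item.findtext("pubDate"),
        })
    return items
```

Note what is missing even after a clean parse: a bucket floor and a stale timestamp, with no velocity and no entity identity.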

Wikipedia does better. The pageviews API is published with a one-day lag and exposes the top one thousand articles per day per language, every entity resolved to a page title and ultimately a Wikidata QID. It is a closer substitute for true attention monitoring than anything Google ships publicly. That fact alone is telling: the encyclopedia is more legible than the search engine.

Reddit, Hacker News, GDELT — each of them publishes structured firehoses of where their respective tribes are looking. None of these alone tells you what the world is asking. Combined, with entity resolution and a reasonable filter on noise, they get you within striking distance.

So the raw materials exist. They are not joined.

Why tech people specifically should care

Three reasons that are unsentimental.

One — agents need real-time attention to be useful. Half the value of a research agent is that it can tell you "this thing is happening right now, here's the structured background, here's the catalyst window." If your agent is reading a stale RSS digest, it cannot do that. The agent is doing the same thing the user already did — opening the news app — just slower and more expensively.

Two — models trained on web data are increasingly downstream of attention dynamics. When you fine-tune on a Common Crawl snapshot, you are inheriting whatever the median page on the open web looked like at scrape time. The pages that people are actually reading right now, this hour, are not in your weights. The gap between what a model saw and what people are doing widens every day. If you cannot observe attention as a stream, you cannot tell whether your model is drifting from the world or whether the world is drifting from your model. Both happen, and they require different fixes.

Three — product decisions get made on instinct. "Are people searching for our category more or less since launch?" is a question that almost every product team asks. The honest answer in 2026 is: nobody knows, because the only people with the firehose don't share it, and the proxies are too lagged or too narrow to triangulate. So teams default to A/B testing the inputs they can see and ignoring the ambient signal entirely. That is a fine local optimum and a terrible global one.

What "democratizing" should actually mean

Not another scraper. Scraping is the wrong frame because the data sources that already publish are sufficient — they just don't compose. Democratizing search attention means three concrete things:

Multi-source aggregation, where every observation is keyed to a canonical entity. When Google Trends surfaces "coco gauff" and Wikipedia pageviews surfaces "Coco_Gauff" and Reddit r/all surfaces "Coco Gauff Madrid Open" and Polymarket has a contract titled "Madrid Open: Solana Sierra vs Coco Gauff," all four of those need to resolve to one thing — wikidata:Q39263365 — so a downstream consumer can ask one query and get four signals back. Wikidata's wbsearchentities API does this canonicalization for free; the only reason it is not standard practice is that nobody has assembled the join.

Velocity over absolutes. Wikipedia's top-1000 today includes pages that get 100K+ daily views every single day — featured articles, evergreen reference pages. Those are not "trending"; they are baseline. The signal of interest is today / yesterday ratio, or today / 7-day-average, or any two-point comparison that filters out the steady state. Most aggregators don't bother. The ones that do find that genuine surge events look very different from what raw rank tables suggest.
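The ratio itself is one line; the only real design choice is guarding against near-zero baselines on cold pages. A sketch, with an arbitrary floor value:

```python
def velocity(today: int, history: list[int], floor: float = 100.0) -> float:
    """Ratio of today's views to the trailing mean of `history`.

    `floor` guards against divide-by-near-zero on cold pages; the value
    100 is an arbitrary illustrative choice, not a calibrated one.
    """
    baseline = max(sum(history) / len(history), floor)
    return today / baseline

# An evergreen reference page: 120K views against a 120K baseline -> 1.0.
# A genuine surge: 90K views against a 3K baseline -> 30.0.
```

Rank tables put the evergreen page first; the ratio puts the surge first. That inversion is the whole argument for velocity over absolutes.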

Joins to calibrated probabilities, where they exist. Public attention is one signal. Market consensus is another. They occasionally align (the World Cup) and occasionally diverge wildly (a celebrity death, where attention spikes but markets don't move because the event has already resolved). The interesting reading is the alignment status itself. A query that's surging across Trends and Wikipedia and has no associated prediction market is a gap — a market that doesn't exist yet. A query that's surging and has a market trading at a stable price is a consensus — the market knew. A query that's surging and has a market price that's moving violently is a signal — the market is repricing in real time. These are three different things and they require three different responses, and you cannot tell them apart without the join.
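The three cases can be made explicit in a few lines. The field names and thresholds below are illustrative, not calibrated:

```python
from dataclasses import dataclass

@dataclass
class Signal:
    surging: bool          # attention velocity above some threshold
    market_exists: bool    # a prediction market references this entity
    price_move_24h: float  # absolute change in implied probability

def alignment(sig: Signal, reprice_threshold: float = 0.05) -> str:
    """Classify the attention/market relationship into the three cases.

    The 5-point repricing threshold is an arbitrary illustrative cutoff.
    """
    if not sig.surging:
        return "quiet"
    if not sig.market_exists:
        return "gap"        # attention with no market: one doesn't exist yet
    if abs(sig.price_move_24h) >= reprice_threshold:
        return "signal"     # the market is repricing in real time
    return "consensus"      # price stable: the market already knew
```

Without the join, all three cases collapse into the same undifferentiated "this query is trending" row.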

The asymmetry is the product opportunity

Right now, the firehose lives at Google, in private alerting tools at consultancies that charge five figures a month, and in the heads of a few sell-side analysts who pattern-match for a living. Each of those audiences has the same nominal data source — public search behavior — but accesses it through layers of contracts, NDAs, and tribal knowledge.

The information itself is not proprietary. The composition is. Anybody can read Wikipedia's pageviews API. Anybody can poll Google Trends RSS. Anybody can call Wikidata. The work is in joining them, filtering noise, layering market context, and serving the result somewhere a person or an agent can see it. That work has not been done at the scale it should have been by now, and that is what democratizing search attention means in practice.

It is also, incidentally, why this is worth building. The closed macro signals are getting opened slowly, one composition at a time. Tech people who care about understanding the world should pay attention to which of these compositions exist, which don't, and which they could build themselves. The ambient stream of what humanity is asking is too important to leave to one company's RSS digest.

We open-sourced a first cut. It joins Google Trends US, Wikipedia velocity, and Wikidata canonicalization, layered with live Kalshi and Polymarket prices through a relevance gate that filters keyword noise. It refreshes daily and serves both a calibrated layer (entities with live markets) and an exploratory layer (entities without). The whole pipeline costs around two dollars a month to run.

It is intentionally a small thing. The bigger point is the precedent: search attention should be readable, joinable, and free, and any sufficiently motivated tech team can wire that up in an afternoon. The only reason it has not happened more often is that nobody bothered to take the join seriously.

That is what we mean by democratizing it.

search-attention · open-data · tech-criticism · prediction-markets · information-asymmetry
Engine-written disclosure · Last fact-check: Apr 27, 2026

This article was primarily written by the SimpleFunctions engine and does not represent the views of the company.

Updates since publish
  • UPDATED: Google processes ~16.4 billion searches/day (not 8.5B as published)
  • UPDATED: Bing handles ~1.2 billion searches/day (not 2B as published)