Everyone doing SEO knows the recipe. It hasn't changed in twenty years.

In 2005, competitive analysis meant keyword density. Pull up the top 10 Google results. Run them through a tool like GRKda. See that the top pages used your keyword at 2.3% density. Optimize your page to match. Watch it climb.

By 2015, the variables had multiplied. Backlink profiles, domain authority, content length, heading structure. By 2026, a single content brief analyzes 50+ variables from the top-ranking competitors: semantic keywords, word count ranges, heading hierarchies, "People Also Ask" data, content gaps, search intent classification, topic saturation scores.

The tools got more sophisticated. But the recipe never changed. Look at what's winning. Figure out what those pages have in common. Build something that matches or exceeds those patterns.

The channel was Google search then. The channel is AI systems now. Same recipe.

Why does this matter now?

Google's AI Overviews now appear on 25-50% of searches, depending on the query type and who's measuring. Google's AI Mode passed 1 billion monthly users within a year of launch. ChatGPT has 900 million weekly active users. Perplexity has 45 million monthly active users and growing. Claude and Gemini are becoming default research tools for millions more.

These systems don't show ten blue links. They synthesize an answer from multiple sources and cite where they got it. The question isn't "Do I rank?" anymore. It's "Am I cited?"

This practice is called Generative Engine Optimization, or GEO. The competitive analysis methodology is familiar: analyze the sources getting cited, reverse-engineer what they have in common, replicate the pattern.

Most of what makes content rank well also makes it citable. But understanding how AI systems find and select sources changes which optimizations matter most.

How do AI systems actually find what to cite?

Most GEO articles skip this question entirely. They describe what to optimize without explaining the retrieval mechanism that determines what gets cited and what gets ignored. The mechanism has a name: Query Fan Out.

AI assistants are not search engines in the traditional sense. They run crawlers of their own (GPTBot, ClaudeBot, PerplexityBot), but for live answers most of them lean on existing search indexes rather than ranking the web themselves. When ChatGPT, Claude, Perplexity, or Google AI Mode needs to answer a question, the system decomposes the prompt into multiple focused sub-queries. For real-time or research queries, those sub-queries run against search engines (Bing in ChatGPT's case, Google for AI Mode) and the retrieved results are synthesized into the response. Even systems with large training corpora reach for live search when recency or specificity matters.

What does Query Fan Out look like in practice?

Google's own AI Mode documentation describes the process directly: the system "divides your question into subtopics and searches for each one simultaneously across multiple data sources." Google patents detail the mechanism further: generating "synthetic search queries" executed "in parallel or in series" as part of a generative search workflow.

Consider what happens when someone asks an AI system "best marketing platform for small teams." It doesn't search that phrase and summarize the results. It generates sub-queries:

"best marketing platforms small business 2026"
"marketing tools for small teams comparison"
"all-in-one marketing platform reviews"
"marketing platform pricing comparison"
"alternatives to HubSpot for small teams"

Each sub-query hits Google or Bing independently. The AI retrieves results from all of them, then synthesizes the response.
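The fan-out step can be sketched in a few lines of Python. This is an illustration, not how any particular AI system is implemented: the `fan_out` function, the `fake_search` stand-in, and the example URLs are all hypothetical, and real systems generate the sub-queries with a model rather than taking them as input.

```python
from typing import Callable


def fan_out(queries: list[str], search: Callable[[str], list[str]],
            top_k: int = 3) -> dict[str, list[str]]:
    """Run each query independently; map every retrieved URL to the queries that surfaced it."""
    surfaced: dict[str, list[str]] = {}
    for query in queries:
        for url in search(query)[:top_k]:
            surfaced.setdefault(url, []).append(query)
    return surfaced


# Hypothetical canned results standing in for Google or Bing.
FAKE_INDEX = {
    "best marketing platform for small teams": ["a.example/roundup"],
    "marketing platform pricing comparison": ["b.example/pricing", "a.example/roundup"],
    "alternatives to HubSpot for small teams": ["c.example/alternatives"],
}


def fake_search(query: str) -> list[str]:
    return FAKE_INDEX.get(query, [])


prompt = "best marketing platform for small teams"
sub_queries = [
    "marketing platform pricing comparison",
    "alternatives to HubSpot for small teams",
]

pool = fan_out([prompt, *sub_queries], fake_search)

# Pages that only a sub-query surfaced, never the original prompt.
prompt_only = set(fake_search(prompt))
fan_out_only = sorted(url for url in pool if url not in prompt_only)
print(fan_out_only)  # ['b.example/pricing', 'c.example/alternatives']
```

In this toy index, two of the three candidate pages never appear for the user's prompt. They only enter the pool because a sub-query found them.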

Why does this change everything?

Three data points reshape GEO strategy entirely.

First: 95% of ChatGPT fan-out sub-queries have zero traditional search volume. No human types them. They're machine-generated decompositions that don't appear in any keyword research tool.

Second: 32.9% of pages cited by ChatGPT appeared only in fan-out sub-query results, not in the original prompt's search results. Nearly a third of all citations come from content that wouldn't be found by optimizing for the user's prompt alone.

Third: 87% of SearchGPT citations match Bing's top organic results, while only 56% match Google's. For ChatGPT specifically, Bing rankings matter more than Google rankings. Most GEO strategies are optimizing for the wrong search engine.

Query Fan Out is the mechanism behind everything that follows.

AI systems don't search the way a person does. They split one question into many, and cite whatever wins each one.

What is the CITED Framework?

The CITED Framework is a structured approach to Generative Engine Optimization, organized around the Query Fan Out retrieval mechanism. It covers five pillars:

Pillar	Question it answers	How it connects to QFO
Crawl	Can AI systems find and retrieve your content?	Pages must be indexable by Google and Bing, where sub-queries run
Inform	Is your content structured for easy extraction?	Structure determines what survives the synthesis step after retrieval
Trust	Do AI systems consider you authoritative enough to cite?	Authority determines ranking position in sub-query results
Evaluate	Are you tracking your AI citations?	Monitoring should cover sub-query visibility, not just head terms
Distribute	Are you present where AI systems actually look?	Multiple surfaces capture different sub-queries from the same fan-out

Most GEO articles stop at the first three pillars (Crawl, Inform, Trust). They're necessary. They're not sufficient. The first three pillars determine whether your content deserves to be cited once found. Evaluate tells you whether it's working. Distribute determines whether AI systems encounter you across enough sub-queries in the first place.

That last pillar is the one most articles on the topic skip entirely. And the data says it's the most important one.

Most frameworks stop at the first three pillars. The data says the one they skip matters most.

How do GEO and SEO overlap?

Compare the factors that help content rank on Google with the factors that help content get cited by AI, and the overlap is significant. The core fundamentals carry over. But each discipline has its own priorities that the other doesn't share.

Shared foundations (these matter equally for both)

Factor	SEO	GEO
Title optimization	High	High
Content depth	High	High
Heading structure (H2/H3)	High	High
Internal linking	High	High
Domain authority	Critical	High
Canonical tags	Baseline	Baseline

SEO-specific (these don't affect AI citation)

Factor	Importance
Mobile optimization	Critical
Backlink acquisition	Critical
URL structure	High
Meta descriptions (CTR)	High
Image alt text	High
Open Graph / Twitter Cards	High

GEO-specific (these don't affect Google ranking)

Factor	Importance
Third-party distribution	Critical
AI crawler access (robots.txt)	Critical
Schema.org structured data	Critical
Server-side rendering	Critical
Answer-first paragraphs	Critical
Section length (75-150 words)	Critical
Named entity density	High
Question-format headings	High
FAQ schema	High
Comparison tables	High

The shared foundations are real. But SEO has its own priorities (mobile, backlinks, click-through optimization) and GEO has its own (structure, entities, distribution). Neither is a subset of the other.

If your SEO is solid, you're not starting over. You're adding a second checklist.

GEO isn't a separate discipline. It's the same fundamentals, with a new set of priorities layered on top.

What signals separate ranking from citation?

Five signals consistently appear in cited content that don't show up in traditional SEO analysis. These aren't arbitrary preferences. They determine what survives the synthesis step after QFO retrieval has already found the content. Crawl and Distribute get content into the retrieval pool. These signals determine whether it's extractable once there.

1. Section word count

AI retrieval systems chunk content at heading boundaries, and research shows sections in the 75-150 word range (100-200 tokens) are optimal for extraction as citations. Too long and the system can't isolate the relevant passage. Too short and there isn't enough context for a standalone citation. SEO doesn't care about section length. AI citation does.
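A quick way to audit this is to count words per heading. The sketch below assumes your content is available as Markdown with `##` and `###` headings. That is an assumption about your source format, and the `section_word_counts` and `out_of_range` helpers are hypothetical names, not part of any tool mentioned here.

```python
import re


def section_word_counts(markdown: str) -> dict[str, int]:
    """Count the words that follow each H2/H3 heading, up to the next heading."""
    counts: dict[str, int] = {}
    heading = None
    for line in markdown.splitlines():
        match = re.match(r"^#{2,3}\s+(.*)", line)
        if match:
            heading = match.group(1).strip()
            counts[heading] = 0
        elif heading is not None:
            counts[heading] += len(line.split())
    return counts


def out_of_range(counts: dict[str, int], low: int = 75, high: int = 150) -> dict[str, int]:
    """Return the sections whose word count falls outside the target range."""
    return {h: n for h, n in counts.items() if not low <= n <= high}


sample = (
    "## What is GEO?\n"
    + " ".join(["word"] * 100)
    + "\n## Too short\nJust five words right here.\n"
)
counts = section_word_counts(sample)
flagged = out_of_range(counts)
print(flagged)  # {'Too short': 5}
```

Whitespace word counts are only a rough proxy for tokens, so treat the 75 and 150 thresholds as guides rather than hard limits.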

2. Question-format headings

Cited pages use H2 headings phrased as questions more often than top-ranking pages do. "What is keyword research?" instead of "Keyword Research." This maps directly to how QFO generates sub-queries: when an AI system decomposes a prompt, those sub-queries often take question form. Headings that match the sub-query get retrieved.
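To see how many of your headings are already phrased as questions, a simple heuristic is enough. The word list and the `is_question_heading` helper below are illustrative assumptions, not a standard. Adjust them to your own content.

```python
QUESTION_WORDS = (
    "what", "why", "how", "when", "where", "which", "who",
    "is", "are", "can", "does", "do", "should",
)


def is_question_heading(heading: str) -> bool:
    """True if the heading ends with '?' and opens with a common question word."""
    text = heading.strip().lower()
    return text.endswith("?") and text.split()[0] in QUESTION_WORDS


headings = [
    "What is keyword research?",
    "Keyword Research",
    "How do I find keywords?",
]
to_rewrite = [h for h in headings if not is_question_heading(h)]
print(to_rewrite)  # ['Keyword Research']
```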

3. FAQ schema

Pages with FAQPage structured data get cited at higher rates than equivalent content without it. The FAQ format gives AI systems pre-structured question-answer pairs. Each FAQ entry functions as an independently retrievable unit that can match against a specific sub-query.
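For reference, FAQPage markup is a small JSON-LD object built from Schema.org's `FAQPage`, `Question`, and `Answer` types. The sketch below generates it from question-answer pairs. The `faq_jsonld` helper and the sample text are illustrative. The output would typically be embedded in a `<script type="application/ld+json">` tag.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a Schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_jsonld([
    (
        "What is Generative Engine Optimization?",
        "The practice of structuring and distributing content so AI systems cite it.",
    ),
])
print(json.dumps(faq, indent=2))
```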

4. Comparison tables

Pages with HTML tables comparing options, features, or approaches are preferred by AI systems that need to synthesize structured comparisons. Tables are extractable in ways that prose paragraphs are not. When a sub-query asks for a comparison, tabular content wins.

5. Named entity density

Cited pages contain more specific, named references: statistics with sources, brand names, named frameworks, concrete claims. Content with 15 or more connected named entities shows 4.8x higher selection probability by AI engines. Entities help the system match content to specific sub-queries. Vague claims don't get retrieved. Specific ones do.

Why do comprehensive pages outperform narrow ones?

Query Fan Out explains something that otherwise seems counterintuitive. A single comprehensive page covering multiple related intents consistently outperforms multiple narrow pages targeting one intent each. The reason: a comprehensive page gets retrieved for more sub-queries from the same fan-out. Each well-structured section acts as an independently citable unit matched against a different sub-query. One page, multiple retrieval paths.

Semrush's QFO experiment demonstrated this directly: optimizing content to address fan-out sub-queries produced a 150% increase in AI citations across test articles.

These signals aren't replacing the SEO checklist. They're additions. New variables in the same competitive analysis that's been running for years.

Each well-structured section is an independently citable unit. One page, many retrieval paths.

Why is distribution the highest-impact pillar?

82% of AI citations come from earned media, not the brand's own website. That number, from Muck Rack's analysis of over one million AI citations, changes everything about how to think about GEO.

You can optimize your own site perfectly. Clean structure, answer-first paragraphs, every Schema.org tag in place. And still barely get cited, because AI systems preferentially cite sources that aren't you.

Query Fan Out explains why distribution works. Each fan-out sub-query is an independent search. Your canonical website might rank for two of those sub-queries. A G2 review page might rank for another. A Reddit thread mentioning the product might rank for a fourth. A guest post on Search Engine Land might rank for a fifth. More surfaces mean more sub-queries covered. Distribution isn't just "be in more places." It's "appear in more sub-query results."
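The coverage argument is easy to put into numbers. The sketch below uses the sub-queries from the earlier example and a made-up mapping of which surface ranks for which sub-query. The mapping and the `covered_share` helper are hypothetical, not measured data.

```python
def covered_share(sub_queries: list[str], surfaces: dict[str, set[str]]) -> float:
    """Share of sub-queries for which at least one surface appears in the results."""
    if not sub_queries:
        return 0.0
    covered = set().union(*surfaces.values())
    return sum(q in covered for q in sub_queries) / len(sub_queries)


sub_queries = [
    "best marketing platforms small business 2026",
    "marketing tools for small teams comparison",
    "all-in-one marketing platform reviews",
    "marketing platform pricing comparison",
    "alternatives to HubSpot for small teams",
]

# Hypothetical mapping: which surface ranks for which sub-query.
own_site = {"own site": {sub_queries[0], sub_queries[3]}}
all_surfaces = {
    **own_site,
    "G2 review page": {sub_queries[2]},
    "Reddit thread": {sub_queries[4]},
    "guest post": {sub_queries[1]},
}

own_share = covered_share(sub_queries, own_site)
full_share = covered_share(sub_queries, all_surfaces)
print(own_share, full_share)  # 0.4 1.0
```

In this made-up case the website alone is eligible for two of the five sub-queries. The third-party surfaces cover the remaining three.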

What does the research show?

A Stacker study of 87 stories across 30 clients, analyzing 2,600+ prompts across 8 AI platforms, found that earned media distribution produces a median 239% lift in AI citations compared to brand-owned content alone, with 64% of those citations coming from third-party publisher sources. According to AirOps' 2026 State of AI Search report, approximately 85% of brand mentions in AI-generated answers come from external, third-party domains rather than brand-owned pages.

Where does each AI platform look?

Each AI platform cites radically different sources. According to Profound, only 11% of domains get cited by both ChatGPT and Perplexity. A single-platform strategy misses most of the opportunity.

Platform	Top Sources	What This Means
ChatGPT	Wikipedia (7.8%), Reddit (1.8%), Forbes (1.1%), G2 (1.1%)	Get listed on G2 and industry directories
Perplexity	Reddit (6.6%), YouTube (2%), Gartner (1%), Yelp (0.8%)	Engage authentically on Reddit, create YouTube content
Google AI Mode	Reddit (2.2%), YouTube (1.9%), Quora (1.5%), LinkedIn (1.3%)	Publish LinkedIn articles, create video with transcripts

What does the distribution playbook look like?

The playbook isn't complicated. It's just different from SEO link-building:

Get listed on directories. G2, Capterra, Product Hunt. Each listing is an additional surface for appearing in fan-out sub-queries.
Be present where AI looks. Reddit, YouTube, LinkedIn articles. Not as marketing. As genuine expert participation.
Earn third-party mentions. Guest posts, expert commentary, industry roundups. These carry more citation weight than anything on your own domain.
Stay consistent. AI systems scan for agreement across independent sources. If your positioning on G2 contradicts your website, AI systems flag the inconsistency and may exclude you entirely.

SEO had link-building. GEO has earned distribution. The mechanism is different, but the principle is the same: external validation matters more than self-promotion.

What about infrastructure-level distribution?

Beyond earned media, there's a smaller class of distribution tactic that works at the infrastructure level rather than the content level: depositing content on scholarly repositories like Zenodo, attaching DOIs and BibTeX citation files, and using ScholarlyArticle schema. These attach academic-publishing infrastructure to commercial content. That puts it inside retrieval pipelines (DataCite, OpenAIRE, Google Scholar) that AI training corpora and live retrievers already query.

It's a smaller-volume play than the earned-media playbook above, and the evidence base is still observational. We unify these tactics into a formal framework called Academic Citation Infrastructure (ACI) in a companion methods paper that documents the pattern, analyzes the retrieval mechanism, and proposes a controlled experiment for validation.

The highest-impact lever in GEO isn't on your website at all.

How far behind is AI citation measurement?

Crawl, Inform, Trust, and Distribute all map cleanly to traditional SEO concepts. Evaluate is the pillar where most businesses are furthest behind.

What tools exist today?

SEO has had Google Search Console for nearly two decades. For AI citations, the tooling exists but adoption is still in its infancy. Most businesses don't know if ChatGPT recommends their product, if Perplexity cites their content, or if Google's AI Overview is hallucinating outdated information about their brand.

Google Search Console is starting to show AI Overview impressions. Platforms like Profound, Otterly, and ILLIXIS track AI citations across multiple platforms. But the ecosystem is still where SEO analytics were in the mid-2000s: fragmented and unfamiliar to most teams. Even a manual spot-check (asking ChatGPT and Perplexity about your brand, recording what they say) puts you ahead of most competitors.

How volatile are AI citations?

Research analyzing roughly 80,000 prompts per AI platform found that 40-60% of cited domains change within a single month for identical queries. Losing a citation doesn't mean permanent failure. It means the system is rotating sources, and staying in the rotation requires fresh content and consistent presence.
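If you log cited domains for the same queries each month, you can measure this rotation yourself. The sketch below uses one possible definition of churn, the share of last month's cited domains that are no longer cited this month. The research above may define it differently, and the domains are placeholders.

```python
def monthly_churn(previous: set[str], current: set[str]) -> float:
    """Share of last month's cited domains that are no longer cited this month."""
    if not previous:
        return 0.0
    return len(previous - current) / len(previous)


last_month = {"a.example", "b.example", "c.example", "d.example", "e.example"}
this_month = {"a.example", "b.example", "c.example", "f.example", "g.example"}

churn = monthly_churn(last_month, this_month)
print(f"{churn:.0%}")  # 40%
```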

How much does freshness matter?

AI systems show strong preference for recent content, citing URLs that are roughly 400 days newer than organic Google results. Content updated every 90-120 days maintains significantly higher visibility in AI-powered search compared to static content. The Evaluate pillar isn't a one-time check. It's an ongoing practice of tracking, spotting gaps, and iterating.
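A refresh schedule is simple to automate once you know when each page was last updated. The sketch below flags pages older than 120 days. The paths, the dates, and the `stale_pages` helper are hypothetical, and the threshold is taken from the upper end of the 90-120 day range above.

```python
from datetime import date


def stale_pages(last_updated: dict[str, date], today: date, max_age_days: int = 120) -> list[str]:
    """Return the pages whose last update is more than max_age_days before today."""
    return [
        path
        for path, updated in last_updated.items()
        if (today - updated).days > max_age_days
    ]


pages = {
    "/pricing": date(2025, 6, 1),
    "/guide": date(2025, 11, 15),
}
stale = stale_pages(pages, today=date(2026, 1, 1))
print(stale)  # ['/pricing']
```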

AI citation measurement is where SEO analytics were in the mid-2000s. Starting now compounds into a lead.

How do you get started?

If you're already doing SEO, here's how to layer GEO on top.

Week 1: Secure the foundation (things you probably already have)

Verify canonical tags, title optimization, content depth, internal linking. If your SEO audit is clean, skip this.
Check your robots.txt for AI crawlers. Unblock GPTBot, ClaudeBot, PerplexityBot if blocked.
Submit your sitemap to Bing Webmaster Tools. ChatGPT's browsing uses Bing's index, and 87% of SearchGPT citations match Bing's top results. Bing optimization is now a GEO priority, not an afterthought.
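The robots.txt check can be scripted with Python's standard `urllib.robotparser`. The sketch below parses a robots.txt body and lists which of the three crawlers named above are blocked. The sample file, the `blocked_ai_crawlers` helper, and the example.com URL are illustrative. In practice you would fetch your own file first.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI crawlers that this robots.txt forbids from fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


# Sample robots.txt that blocks GPTBot and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_ai_crawlers(sample)
print(blocked)  # ['GPTBot']
```

An empty list means none of the three crawlers is blocked for that URL.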

Weeks 2-3: Add the five new signals

Add answer-first paragraphs to your top 5 pages. Put the conclusion under the heading, then explain.
Restructure sections to 75-150 words per heading. Add comparison tables where relevant.
Add Schema.org markup (Organization, Product/Service, FAQ) if missing. Add FAQ sections with question-format H2s.
Audit named entity density. Add specific statistics, named frameworks, and concrete data points.

Weeks 3-4: Distribute where AI looks

Create a G2 profile with a rich description. G2 is among ChatGPT's most-cited product review sources.
Publish a LinkedIn article linking back to your canonical content.
Start engaging on Reddit in your industry's subreddits. Expert answers, not marketing.
Audit your brand messaging across all platforms for consistency.

Week 4+: Start evaluating

Check what ChatGPT and Perplexity say about your brand. Record it.
Identify 3-5 queries where competitors get cited and you don't.
Create or retrofit content targeting those gaps.
Set a 90-day reminder to refresh all GEO-targeted content.
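Once the manual checks are recorded, finding the gaps is a small filtering job. The sketch below assumes you keep a log of which domains each AI answer cited per query. The log format, the domains, and the `citation_gaps` helper are hypothetical.

```python
def citation_gaps(citations: dict[str, set[str]], own: str, competitors: set[str]) -> list[str]:
    """Queries where at least one competitor is cited and your own domain is not."""
    return [
        query
        for query, domains in citations.items()
        if own not in domains and domains & competitors
    ]


# Hypothetical log: query -> domains cited in the AI answer.
log = {
    "best marketing platform for small teams": {"rival.example", "g2.com"},
    "marketing platform pricing comparison": {"you.example", "rival.example"},
    "alternatives to HubSpot for small teams": {"reddit.com"},
}

gaps = citation_gaps(log, own="you.example", competitors={"rival.example"})
print(gaps)  # ['best marketing platform for small teams']
```

Each query in the result is a candidate for new or retrofitted content.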

Same game. New surface.

SEO competitive analysis has been the same discipline for twenty years. Analyze what's winning. Reverse-engineer the pattern. Build something that matches it.

GEO adds a retrieval mechanism most people haven't studied. When someone asks an AI system a question, it doesn't search the web the way a human would. It decomposes the prompt into sub-queries and searches for each one independently. Understanding that mechanism is the difference between optimizing blindly and optimizing for the system that actually selects what gets cited.

Five new content signals separate citation from ranking. Distribution turns out to be the highest-impact lever, because more surfaces mean more sub-queries covered. And the measurement approach is still in its infancy, which means the people who start now have a compounding advantage.

The CITED Framework is the playbook for this new surface. Not because AI citation is some mysterious new discipline. Because it's the same competitive analysis game it's always been, with a retrieval mechanism most people haven't studied, new variables most teams haven't measured, and a distribution layer most businesses haven't built.

Check your robots.txt. Submit the sitemap to Bing. Write an answer-first paragraph. See if ChatGPT knows you exist.

The recipe hasn't changed. The surface has. The teams who figure that out first will own the citations while everyone else is still optimizing for the wrong channel.

Ready to operationalize? The CITED Framework reference defines all five pillars operationally, with the concrete signals and actions for each. From there, the ACI Playbook is the step-by-step template for the highest-leverage slice of CITED: the infrastructure-level distribution plays (methods-lite papers, ScholarlyArticle schema, Zenodo DOIs, citation files).

Written by Nuno Andrade, founder of ILLIXIS.

Don't Just Rank.
Get CITED.

You approve.
ILLIXIS executes.
