Generative Engine Optimization: how to get cited by ChatGPT, Perplexity and AI Overviews

Ranking #1 is worth less when no one sees the links. The new game is being the source an AI quotes — and there's now peer-reviewed research on how to win it.

For twenty years, the goal of being found online had one shape: rank a blue link as high as possible and earn the click. That model is quietly breaking. Increasingly, people ask a question and read a synthesized answer — assembled by a language model from a handful of sources — without ever clicking through to any of them.

If your content isn’t one of those sources, you’re invisible in the place where the decision now happens. Generative Engine Optimization (GEO) is the discipline of fixing that: making your pages the ones an AI engine quotes and credits. This is a practical, data-backed guide to how it works and what to actually do.

The audience already moved

This isn’t a forecast about something coming later. The behaviour has already shifted, and the numbers are not subtle.

ChatGPT alone reached 800 million weekly active users by October 2025 and crossed 900 million in early 2026, having roughly tripled its user base in about a year. That’s a search-engine-scale audience asking questions in a box that returns answers, not lists of links.

Figure: Weekly active users, from OpenAI's own disclosures (Sam Altman / DevDay). The audience for answer-style search is already at search-engine scale and still climbing.

Three more data points show the same current pulling in one direction:

  • Gartner projects traditional search engine volume will drop 25% by 2026 as AI chatbots and virtual agents absorb queries that used to start in a search box.
  • Pew Research instrumented ~69,000 real Google searches in 2025: users clicked a traditional result on just 8% of searches that displayed an AI summary, versus 15% without one — and clicked a link inside the summary only 1% of the time. (Google disputes the methodology; the direction is consistent with everything else in the market.)
  • On the demand side, Adobe Analytics measured AI-referred traffic to U.S. retail sites up ~693% year-over-year over the 2025 holiday season, and those visitors converted 31% better than other sources — the citations that do carry a link are sending unusually high-intent traffic.

The takeaway isn’t “SEO is dead.” It’s that a growing share of attention now lands on a synthesized answer first. The link is no longer the destination — the answer is.

GEO vs SEO: what actually changed

It’s tempting to treat GEO as a rebrand of SEO. It isn’t, but they’re deeply related. The clearest way to hold both in your head:

| | Classic SEO | Generative Engine Optimization |
| --- | --- | --- |
| You’re competing for | A ranked link on a results page | A sentence inside the AI’s answer, with a citation |
| The “win” | The user clicks your link | The model quotes you and credits the source |
| Unit of value | Position (rank #1–10) | Inclusion + citation share in the answer |
| Who decides | A ranking algorithm | A retrieval step and a language model |
| Failure mode | You rank #8, few clicks | You’re accurate but unquotable, so you’re skipped |

Here’s the part that should reassure anyone who’s invested in SEO: the two share most of their foundation. AI engines don’t browse the open web from scratch for every question — they lean on the same search indexes you already optimize for. Semrush compared AI citations against Google’s top-10 organic results and found heavy overlap:

| Engine | Domain overlap with Google top 10 | URL overlap |
| --- | --- | --- |
| Perplexity | ~91% | ~82% |
| Google AI Overviews | ~86% | ~67% |

ChatGPT (web): much lower; it frequently cites pages ranking 21+.
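
To make the two overlap measures concrete: domain overlap counts a citation as a match when its host appears anywhere in the top 10, while URL overlap requires the exact page. Here is a minimal sketch, assuming overlap means the share of cited items that also appear in the organic top-10 set. Semrush's exact methodology may differ, and the URLs and the `overlap` helper are made up for illustration.

```python
from urllib.parse import urlsplit


def _domain(url: str) -> str:
    """Return the URL's host, lowercased and without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def overlap(cited: list[str], top10: list[str]) -> tuple[float, float]:
    """Return (domain_overlap, url_overlap) as fractions of the cited set."""
    cited_urls = {u.rstrip("/") for u in cited}
    top_urls = {u.rstrip("/") for u in top10}
    if not cited_urls:
        return 0.0, 0.0
    cited_domains = {_domain(u) for u in cited_urls}
    top_domains = {_domain(u) for u in top_urls}
    url_share = len(cited_urls & top_urls) / len(cited_urls)
    domain_share = len(cited_domains & top_domains) / len(cited_domains)
    return domain_share, url_share


# Hypothetical data: three cited pages, two organic results.
cited = [
    "https://www.example.com/guide",
    "https://example.org/faq",
    "https://example.net/post",
]
top10 = [
    "https://example.com/pricing",  # same domain as a citation, different page
    "https://example.org/faq/",     # same page as a citation
]
domain_share, url_share = overlap(cited, top10)
print(f"domain overlap {domain_share:.0%}, URL overlap {url_share:.0%}")
```

Domain overlap can never be lower than URL overlap under this definition, which is why the first column in the table is the larger one.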

So strong technical SEO — crawlable, fast, authoritative, relevant — is the price of admission for most answer engines. GEO is the layer you add on top: once your page is in the candidate set, GEO is what makes the model choose it, quote it, and name it.

What the research actually says works

GEO could have stayed a folklore discipline of confident guesses. It didn’t, because there’s now peer-reviewed work on it. A Princeton-led team (Aggarwal et al.) published “GEO: Generative Engine Optimization” and presented it at KDD 2024 — testing content edits against a 10,000-query benchmark and measuring how visible each source became in the generated answer.

Their headline result: the right edits can boost a source’s visibility in AI answers by up to 40%. And crucially, the winning tactics were not keyword tricks. The highest-impact edits were:

| Tactic | What it means in practice |
| --- | --- |
| Add statistics | Replace vague claims with specific, cited numbers (“converted 31% better,” not “converted better”) |
| Add quotations | Include quotable lines from credible, named sources |
| Cite sources | Link out to authoritative references; models favour content that is itself well-grounded |
| Clear, fluent language | Plain, well-structured prose the model can lift cleanly |

Notice what’s missing: keyword stuffing and density tricks did not help. The research points the same way good editorial instincts always did: be specific, be credible, be quotable. AI engines are, in effect, rewarding the thing that was always supposed to win.

How an answer engine turns your page into a citation

To optimize for it, it helps to picture the pipeline. Every AI answer is built in roughly the same two-step shape: retrieve, then synthesize. First the engine pulls a set of candidate sources (usually from a search index); then a language model reads the top ones and writes a single answer, citing a few.

Figure: retrieve → synthesize → cite. (1) Someone asks a question, in plain language. (2) The engine retrieves and ranks candidate sources; your page is one of them if it is structured, quotable, and cited. (3) The LLM synthesizes one answer. (4) The cited answer links a few sources.
An AI answer is built in two moves: retrieve candidate sources, then synthesize one answer from the best few. GEO works on both: be retrievable (crawlable, relevant, authoritative) and be the easiest source to quote once retrieved. The loop is the payoff: a citation sends referral traffic and builds the authority that wins the next answer.

GEO has to win at both stages. Lose at retrieval (slow, blocked, or irrelevant page) and nothing else matters, because you’re never in the room. Win retrieval but lose synthesis (accurate but buried in fluff, no clear answer to lift) and the model quietly picks a competitor who said the same thing more cleanly. The feedback loop is why this compounds: a citation drives high-intent referral traffic and the brand authority that makes you more likely to be retrieved and cited next time.
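
The two stages can be sketched as a toy pipeline. This only illustrates the retrieve-then-synthesize shape; it is not how any real engine scores pages. The keyword-overlap retriever, the rule that favours a sentence containing a number, and the sample pages are all assumptions made for the example.

```python
import re
from typing import Optional


def tokens(text: str) -> set[str]:
    """Lowercase word set used for a crude relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, pages: dict[str, str], k: int = 2) -> list[str]:
    """Stage 1: rank pages by word overlap with the query, keep the top k."""
    q = tokens(query)
    ranked = sorted(pages, key=lambda url: len(q & tokens(pages[url])), reverse=True)
    # Drop pages that share no words with the query at all.
    return [url for url in ranked[:k] if q & tokens(pages[url])]


def synthesize(query: str, pages: dict[str, str], candidates: list[str]) -> Optional[tuple[str, str]]:
    """Stage 2: pick the most 'liftable' sentence and credit its source."""
    q = tokens(query)
    best = None
    for url in candidates:
        for sentence in re.split(r"(?<=[.!?])\s+", pages[url]):
            # Assumed heuristic: query overlap plus a bonus for a concrete number.
            score = len(q & tokens(sentence)) + (2 if re.search(r"\d", sentence) else 0)
            if best is None or score > best[0]:
                best = (score, sentence, url)
    return (best[1], best[2]) if best else None


# Hypothetical pages: one vague, one specific, one off-topic.
pages = {
    "https://a.example/geo": "GEO is a big topic with many angles. Results vary a lot.",
    "https://b.example/geo": "GEO edits raised visibility by up to 40% in one benchmark.",
    "https://c.example/cats": "Cats sleep most of the day.",
}
query = "how much can GEO raise visibility"
candidates = retrieve(query, pages)
answer = synthesize(query, pages, candidates)
print(candidates)
print(answer)
```

In this toy run the off-topic page never reaches the second stage, and the vague page is retrieved but loses the citation to the page that states a specific figure.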

The GEO playbook

Here’s the concrete work, in rough priority order.

1. Answer the question directly, early

Models lift self-contained answers. Lead a section with a one- or two-sentence direct answer, then expand. Front-load a TL;DR (like the one at the top of this page). Use clear, literal headings that match how people actually ask — “How do I measure GEO?” beats “Measurement philosophy.”

2. Ship structured data

Mark up your content with schema.org JSON-LD so engines don’t have to guess what your page is. Article/BlogPosting establishes authorship and dates; FAQPage exposes question-answer pairs as machine-readable structure that’s ideal for an answer engine to ingest. We practice this here — this very page emits BlogPosting, BreadcrumbList, and FAQPage JSON-LD, generated from the same FAQ you can read at the bottom, so the visible content and the structured data never drift apart.

faq-jsonld.json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Generative Engine Optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "GEO is the practice of structuring content so AI answer engines quote and cite it when they synthesize answers."
      }
    }
  ]
}
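
One way to keep the visible FAQ and the markup from drifting apart is to generate both from a single list. Below is a minimal sketch, assuming the FAQ lives in a Python list of question/answer pairs. The `FAQ` data and the `faq_jsonld` and `script_tag` helpers are illustrative names, not this site's actual build code.

```python
import json

# Single source of truth: the same pairs would also render the visible FAQ.
FAQ = [
    (
        "What is Generative Engine Optimization?",
        "GEO is the practice of structuring content so AI answer engines "
        "quote and cite it when they synthesize answers.",
    ),
]


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def script_tag(data) -> str:
    """Serialize the object into a JSON-LD script element for the page head."""
    body = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so answer text can never close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


tag = script_tag(faq_jsonld(FAQ))
print(tag)
```

Because the markup is derived from the same pairs the template displays, editing an answer in one place updates both.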

3. Be quotable — and verifiable

This is where the Princeton research cashes out. Use specific numbers, attribute them to named sources, add dates, and write in clean declarative sentences. A model deciding between two pages will quote the one that hands it a defensible, self-contained fact. Vague authority loses to specific evidence.

4. Stay fast and crawlable

You can’t be cited if you can’t be fetched. Server-render content so it exists in the HTML (not only after JavaScript runs), keep the page fast, and make sure robots.txt doesn’t block the AI crawlers you want citations from. This is the same performance-first foundation we build every site on. See our write-up on Core Web Vitals, which doubles as GEO groundwork.
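
A quick way to audit the robots.txt part is Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt content and the URL are hypothetical, and the crawler tokens listed are ones the vendors have published, but check each vendor's documentation for the current names before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one AI crawler site-wide, allows the rest.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Crawler tokens to audit (assumed list; extend it for your own needs).
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot"]


def audit(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return whether each user agent is allowed to fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = audit(ROBOTS_TXT, "https://example.com/blog/geo-guide", AI_CRAWLERS)
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

Here the sample file blocks GPTBot everywhere while the other two crawlers fall through to the wildcard rule, which only disallows /admin/.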

5. Earn third-party mentions

AI engines disproportionately cite community and reference sites: Semrush found Reddit and Quora among the most-cited domains across AI Overviews, ChatGPT, and Perplexity. Your own pages matter, but so does being discussed, reviewed, and referenced elsewhere. Digital PR, genuine community presence, and getting listed in credible roundups feed the same models.

6. Add an llms.txt — with realistic expectations

llms.txt is a proposed root-level file (think robots.txt for language models, proposed by Answer.AI in September 2024) that points LLMs at your most important pages in clean Markdown. It’s cheap to add and harmless. Be honest about it, though: as of mid-2026 the major AI crawlers have not confirmed they use it for ranking or citation. Add it as forward-looking hygiene, not as a lever that moves results today.
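
For reference, the proposal describes a Markdown file with an H1 title, a short blockquote summary, and H2 sections that list links. Below is a minimal generator sketch, assuming that layout. The `render_llms_txt` helper, the site name, the summary, and the URLs are placeholders.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists.

    `sections` maps a heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    "Example Studio",
    "Placeholder summary of what the site covers.",
    {
        "Guides": [
            ("GEO guide", "https://example.com/blog/geo.md", "How to get cited by AI engines"),
        ],
    },
)
print(llms_txt)
```

The output would be served as /llms.txt from the site root.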

How to measure it

GEO’s measurement tooling is still immature, so expect to triangulate rather than read one clean dashboard:

  • AI referral traffic: segment ChatGPT, Perplexity, Gemini, and Copilot referrers in your analytics. This is your most concrete signal that citations are converting to visits.
  • Citation share: for your priority prompts, check how often you’re cited versus competitors, using an AI-visibility tracker or a disciplined manual prompt-test routine.
  • Mentions without links: track brand mentions even when there’s no citation link. Semrush found roughly 62% of AI citations don’t surface a visible brand mention, so traffic alone badly undercounts your real influence.
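
The first two signals can be computed from data you likely already have. This is a minimal sketch, assuming you can export referrer URLs from your analytics and that you keep a record of which domains each test prompt's answer cited. The referrer hosts in `AI_REFERRERS` are assumptions to verify against what your analytics actually records, and the sample data is invented.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlsplit

# Assumed referrer hosts for each assistant; adjust to match your own logs.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
}


def ai_source(referrer: str) -> Optional[str]:
    """Map a referrer URL to an AI source name, or None if it is not one."""
    host = (urlsplit(referrer).hostname or "").removeprefix("www.")
    return AI_REFERRERS.get(host)


def referral_counts(referrers) -> Counter:
    """Count visits per AI source, ignoring all other referrers."""
    return Counter(src for src in map(ai_source, referrers) if src)


def citation_share(prompt_results: dict, domain: str) -> float:
    """Fraction of tested prompts whose answer cited the given domain."""
    if not prompt_results:
        return 0.0
    hits = sum(domain in cited for cited in prompt_results.values())
    return hits / len(prompt_results)


# Hypothetical analytics export and manual prompt-test log.
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=geo",
    "https://www.google.com/",
    "https://chatgpt.com/c/abc",
]
prompt_results = {
    "what is geo": {"example.com", "reddit.com"},
    "geo vs seo": {"competitor.example"},
    "how to measure geo": {"example.com"},
    "what is llms.txt": {"competitor.example"},
}
counts = referral_counts(referrers)
share = citation_share(prompt_results, "example.com")
print(dict(counts), f"citation share {share:.0%}")
```

Re-running the same prompt set on a fixed schedule makes the share comparable over time, since engine behaviour shifts between runs.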

The honest caveat

GEO is young. The engines change their behaviour without notice, the public datasets are early, and there is a thriving industry of people selling certainty they don’t have. Be skeptical of anyone promising guaranteed citations or a secret trick.

But the throughline is reassuring: nearly everything that wins in GEO — clarity, specificity, credible sourcing, structure, speed, and genuine authority — is the same thing that has always made content good. The interface changed. The fundamentals didn’t. Build pages worth quoting, make them trivial to retrieve and parse, and you’ll be in the answer whether the user types it into Google or asks it out loud.

Frequently asked questions

What is Generative Engine Optimization (GEO)?

GEO is the practice of structuring your content so that AI answer engines — ChatGPT, Perplexity, Google's AI Overviews and AI Mode, Gemini, Copilot — quote and cite it when they synthesize answers. Traditional SEO competes for a ranked link a human clicks; GEO competes to be the source an AI repeats inside its answer.

Is GEO different from SEO, or does it replace it?

It extends SEO; it doesn't replace it. AI engines still lean heavily on the classic search index: Semrush found Google AI Overviews share roughly 86% of their cited domains with Google's top-10 organic results, and Perplexity roughly 91%. So crawlability, authority and relevance still matter. GEO adds a layer on top: being quotable, well-structured, and backed by verifiable facts so the model picks your page out of the candidates it already retrieved.

What is llms.txt and should I add one?

llms.txt is a proposed plain-text/Markdown file at your site root (proposed by Jeremy Howard of Answer.AI in September 2024) that gives LLMs a curated map of your most important pages. It's cheap to add and forward-looking, but as of mid-2026 the major AI crawlers have not confirmed they use it for ranking or citation. Treat it as low-cost insurance, not a lever that moves results today.

How do I measure whether GEO is working?

Track three things: referral traffic from AI sources (ChatGPT, Perplexity, Gemini, Copilot) in your analytics; how often your brand and URLs are cited in answers to your target prompts (via AI-visibility tracking tools or manual prompt checks); and brand mentions even when there's no link — Semrush found about 62% of AI citations don't produce a visible brand mention, so traffic alone undercounts your impact.

Which AI engines should I optimize for first?

Start with the ones your audience already uses and the ones that actually send traffic: ChatGPT (800M+ weekly users), Google's AI Overviews and AI Mode (built on the index you likely already rank in), and Perplexity (the most citation-and-link-friendly of the answer engines). Because they all reuse the classic search index heavily, solid technical SEO covers most of the groundwork for all of them at once.

