How to Get Cited by ChatGPT, Perplexity, and Google AI Mode
Getting cited by AI search engines is not luck — it follows a repeatable pattern. Here are the seven signals that drive AI citations and how to optimise for each one.

Every week, thousands of potential customers ask ChatGPT or Perplexity which tool to use, which agency to hire, or which approach to take. The brand that gets cited in those answers is often shortlisted before the buyer has visited a single website.
STAT: 46% of consumers now use AI chatbots for product research before making a purchase. Source: Adobe Analytics, 2025
The good news: getting cited is not random. It follows a pattern.
How citation selection works
Answer engines draw on content in two ways. The first is training data: content included in the model's training corpus. The second is live retrieval: a real-time web search that fetches and synthesises current sources. Perplexity and Google AI Mode lean heavily on live retrieval, as does ChatGPT's web-browsing mode.
QUOTE: "We prioritise sources that directly answer the question with verifiable, specific information. Vague, keyword-stuffed pages never make it into citations." — Anonymous, Perplexity engineering blog, 2024
Signal 1: Structured, scannable content
Language models extract answers from text. If your answer is buried in a 3,000-word essay with no headings or explicit Q&A structure, the model may miss it entirely.
Write FAQ sections with explicit questions as headings and direct answers in the first sentence. Use H2/H3 headings that match the phrases your buyers actually type.
TAKEAWAY: The single fastest win is adding a structured FAQ section to your 5 most-visited pages. It takes about 2 hours and can produce a measurable improvement in answer engine optimisation (AEO) within 6 weeks.
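One way to audit a page against this signal is to list its H2/H3 headings and see which are phrased as explicit questions. The sketch below uses only Python's standard library; the names `HeadingCollector` and `question_headings` are illustrative, and treating "ends with a question mark" as the test for a question heading is a simplifying assumption.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of every <h2> and <h3> element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []    # list of (tag, text) tuples
        self._current = None  # heading tag currently open, if any
        self._buffer = []     # text fragments inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headings.append((tag, "".join(self._buffer).strip()))
            self._current = None


def question_headings(html):
    """Return the H2/H3 headings that are phrased as questions."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    # Assumption: a heading counts as a question if it ends with "?"
    return [text for _, text in collector.headings if text.endswith("?")]
```

A page that returns an empty list here probably has no explicit Q&A structure for an answer engine to extract.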
Signal 2: Schema markup
JSON-LD structured data, written with the schema.org vocabulary, tells crawlers what type of thing your content describes. The highest-value schema types for AEO are Organization, FAQPage, HowTo, Article, and SoftwareApplication. A site without any structured data is leaving attribution on the table.
STAT: Pages with FAQPage schema appear in AI Overviews 2.8× more frequently than equivalent pages without it. Source: Semrush AI Search Study, 2025
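As an illustration of what FAQPage markup looks like, the sketch below builds a schema.org FAQPage object from question/answer pairs and wraps it in the script tag that would go in the page. The helper names `build_faq_jsonld` and `as_script_tag` are hypothetical; how the output is injected into a template depends on your CMS.

```python
import json


def build_faq_jsonld(qa_pairs):
    """Return a schema.org FAQPage JSON-LD string for (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script element used to embed it in HTML."""
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    safe = jsonld.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + safe + "\n</script>"
```

The questions and answers in the markup should match the FAQ text that is visible on the page.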
Signal 3: Off-site brand mentions
LLMs are trained on large volumes of public text: Reddit threads, Hacker News (HN) discussions, news articles, Wikipedia. If your brand is regularly mentioned in those places in a positive context, the model develops a stronger prior that you are a legitimate source.
| Platform | Training weight | Live retrieval weight |
|---|---|---|
| Wikipedia | Very high | High |
| Reddit / HN | High | Medium |
| TechCrunch / Forbes | High | High |
| G2 / Capterra | Medium | Medium |
| Your own blog | Low | High (if structured) |
Signal 4: A crawlable, fast site
Check your robots.txt against known AI crawler user agents and tokens: GPTBot, PerplexityBot, ClaudeBot, and Google-Extended (a robots.txt control token rather than a separate crawler). Blocking any of these is the single most common fixable reason for zero citations.
RESEARCH: In a crawl audit of 500 B2B SaaS websites, 23% were inadvertently blocking at least one major AI crawler via robots.txt. Source: CiteAgentic Audit Data, 2025
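A quick way to run this check is with Python's standard `urllib.robotparser`. The sketch below parses robots.txt text that you have already fetched and reports which AI agents are disallowed for a given path. The function name `blocked_agents` is illustrative, and the Google token is written as Google-Extended, the name Google documents. Note that `robotparser` applies its own matching rules, which may differ slightly from how each crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens to test; extend this tuple as new crawlers appear.
AI_AGENTS = ("GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended")


def blocked_agents(robots_txt, path="/", agents=AI_AGENTS):
    """Return the agents that robots_txt disallows from fetching path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    print(blocked_agents(sample))
```

Running the check against a few representative paths (home page, blog, docs) rather than only `/` catches partial blocks.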
Signal 5: llms.txt
The proposed llms.txt standard lets you publish a machine-readable index of your site's content at /llms.txt. It points LLMs and their crawlers to your highest-priority pages. Adoption is growing and the cost of adding it is trivial.
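For a sense of what such a file contains, the sketch below generates one following the layout described in the llms.txt proposal as commonly published: a title line, a one-line summary as a blockquote, then sections of Markdown links. The function name `build_llms_txt`, the section names, and the example.com URLs are all placeholders.

```python
def build_llms_txt(site_name, summary, sections):
    """Build llms.txt content.

    sections maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(build_llms_txt(
        "Example Co",
        "Placeholder summary of what the site covers.",
        {"Guides": [("AEO basics", "https://example.com/aeo", "Introductory guide")]},
    ))
```

Serve the result as plain text at the site root so it resolves at /llms.txt.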
Signal 6: Topical depth
A site with 5 deep, well-structured articles on a focused topic will earn more citations than a site with 50 shallow articles across a dozen topics. Answer engines infer authority from topical consistency and depth.
Signal 7: Recency
STAT: AI Overviews cite content published or updated within the last 12 months at a rate 2.4× higher than content older than 2 years. Source: BrightEdge Research, 2025
Refresh your high-value pages at least annually. Update statistics, add new sections, and re-publish with an updated dateModified in your schema.
TAKEAWAY: Treat your top 10 content pages as living documents. A 45-minute quarterly refresh keeps them citation-eligible as AI search models update their retrieval signals.
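As a small illustration of the dateModified step, the sketch below takes an existing Article JSON-LD string and sets its `dateModified` property to a given date, leaving `datePublished` untouched. The function name `touch_date_modified` is hypothetical. It assumes the JSON-LD is a single top-level object and that the page content was actually revised; the date should reflect a real change.

```python
import json
from datetime import date


def touch_date_modified(jsonld, today=None):
    """Return the JSON-LD string with dateModified set to today's ISO date."""
    data = json.loads(jsonld)
    # Assumption: the markup is one top-level object, not an array or @graph.
    data["dateModified"] = (today or date.today()).isoformat()
    return json.dumps(data, indent=2, ensure_ascii=False)
```

Passing `today` explicitly keeps the function deterministic, which is useful when the refresh runs as part of a build or publishing script.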
FAQ
Which AI engine is easiest to get cited in?
Perplexity is generally the most accessible starting point because it relies almost entirely on live web retrieval — meaning fresh, well-structured content can get cited within days of publication.
Does having more backlinks help AEO?
Indirectly. Backlinks drive SEO rankings, which increases the probability that your pages get retrieved by live-retrieval engines. But for AEO specifically, off-site brand mentions (Reddit, press, review platforms) are more directly correlated with citation rates than backlink count.
How many pages do I need to start seeing citations?
Quality beats quantity. A single well-structured page with FAQPage schema, a clear answer in the first paragraph, and relevant H2 headings can get cited on day one. The question is which prompts it gets cited for.
References
1. Adobe Analytics, "AI in the Consumer Journey", 2025. https://business.adobe.com/
2. Semrush, "AI Search Visibility Study", 2025. https://www.semrush.com/
3. BrightEdge, "AI-Powered Search Signals Report", 2025. https://www.brightedge.com/
4. CiteAgentic, "AEO Technical Crawl Audit: 500 B2B SaaS Sites", 2025. https://www.citeagentic.com/