This post is inspired by (and loosely summarises) QueryBurst’s behind-the-scenes teardown of how ChatGPT assembles an answer.
[QueryBurst - how ChatGPT works](https://queryburst.com/blog/how-chatgpt-works/)
Most people picture ChatGPT as one giant brain that “knows everything”. In reality, the modern experience is closer to a pipeline: smaller components decide whether to search, what to search, which pages to trust, which snippets matter, and only then does the big model show up to write the final response.
That distinction matters for brands, because it reframes “LLM visibility” as something you can actually engineer: get retrieved, get selected, get used.
**A high-level view of how ChatGPT works**
**1) A router decides if the web is needed**
For simple, timeless questions, a model can answer from training alone. For anything that’s current, specific, or comparison-heavy, systems increasingly lean on retrieval (searching and pulling in external context). OpenAI describes this pattern as retrieval-augmented generation (RAG): inject relevant external information at runtime to improve accuracy and freshness.
In the consumer product world, most commercial questions are implicitly retrieval-shaped:
“best running head torch for winter trails”
“is mineral sunscreen better for sensitive skin”
“S3 vs S1P safety boots difference”
These are exactly the kinds of prompts where external sources help.
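To make the routing idea concrete, here’s a toy sketch in Python. The cue list is purely an illustrative assumption; the real router is a learned decision inside the product, not a keyword check.

```python
# Toy illustration only: a keyword heuristic standing in for the routing step
# described above. The real router is a model decision, not a cue list.
FRESHNESS_CUES = {"best", "better", "vs", "difference", "latest", "price", "review"}

def needs_web_search(prompt: str) -> bool:
    """Treat current, specific, or comparison-heavy prompts as retrieval-shaped."""
    return bool(set(prompt.lower().split()) & FRESHNESS_CUES)

for q in [
    "best running head torch for winter trails",
    "is mineral sunscreen better for sensitive skin",
    "what is a haiku",
]:
    print(q, "->", "search the web" if needs_web_search(q) else "answer from training")
```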
**2) A planner generates searches (often more than one)**
When the system decides to search, it doesn’t just fire one neat query and call it a day. The planner can create multiple searches and variations designed to capture intent, then bring back a shortlist of candidates to read. This matches how ChatGPT Search is positioned: it can pull timely sources from the web and return answers with citations.
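As a rough sketch of what that planning step might look like: one prompt fans out into several search variations. The templates below are stand-ins for illustration; in practice the planner is itself a model that rewrites the question.

```python
# Hypothetical sketch of query planning: one prompt, several search variations.
def plan_queries(prompt: str) -> list[str]:
    return [
        prompt,                    # the literal question
        f"{prompt} review",        # third-party opinions
        f"{prompt} comparison",    # comparison-style coverage
        f"{prompt} buying guide",  # how-to-choose content
    ]

print(plan_queries("S3 vs S1P safety boots difference"))
```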
**3) Candidate pages are filtered before “deep reading”**
Before anything gets “read”, the pipeline typically filters using fast signals: titles, snippets, perceived relevance, and trust/authority cues. Only a subset gets fetched.
This is the first brand lesson: you don’t need to be “the best page on the internet”; you need to be one of the pages that survives early filters.
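Here’s an illustrative version of that early filter: score candidates on cheap signals (title match, snippet match, a trust proxy) and keep a shortlist. The field names and weights are invented for the example, not a documented formula.

```python
# Illustrative early filter: rank candidates on cheap signals before any full
# page fetch. Field names and weights here are assumptions for the example.
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    title: str
    snippet: str
    domain_trust: float  # 0.0-1.0, a stand-in for whatever authority signal is used

def quick_score(c: Candidate, query: str) -> float:
    terms = set(query.lower().split())
    title_hits = len(terms & set(c.title.lower().split()))
    snippet_hits = len(terms & set(c.snippet.lower().split()))
    return 2.0 * title_hits + 1.0 * snippet_hits + 3.0 * c.domain_trust

def shortlist(candidates: list[Candidate], query: str, k: int = 5) -> list[Candidate]:
    """Only the top-k candidates get fetched and read in full."""
    return sorted(candidates, key=lambda c: quick_score(c, query), reverse=True)[:k]
```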
**4) Pages are fetched fast, chunked, and scored**
Once pages are fetched, they’re split into chunks and compared against the query using embeddings/vector similarity (cosine similarity is a common approach).
This is a subtle but huge shift from traditional SEO thinking:
It’s not just “rank #1”.
It’s “does my page contain high-scoring chunks that match the user’s intent, in language the system can easily lift?”
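A minimal sketch of that chunk-and-score step follows. A production pipeline embeds chunks with a neural model; the bag-of-words vectors below are just a runnable stand-in so the cosine-similarity comparison is visible end to end.

```python
# Sketch of chunking and scoring. The embed() function is a stand-in for a
# real embedding model; only the overall shape of the step is the point.
import math
from collections import Counter

def chunk(text: str, max_words: int = 60) -> list[str]:
    """Split a fetched page into small, roughly fixed-size chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())  # stand-in for a real embedding model

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

query_vec = embed("is mineral sunscreen better for sensitive skin")
page_text = "Mineral sunscreen sits on top of the skin and is often gentler for sensitive skin."
best_chunk = max(chunk(page_text), key=lambda c: cosine(embed(c), query_vec))
print(best_chunk)
```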
**5) A smaller set of sources is selected, then the big model writes**
After scoring, only a handful of sources typically make the final cut for deeper use. Then the “writer” model synthesises an answer from the retrieved context plus its general knowledge.
Two practical implications:
a) Formatting and clarity win because chunking is brutally literal.
b) Consistency wins because generation is probabilistic (the same question can produce slightly different answers depending on what gets retrieved and how it’s synthesised).
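For illustration, the hand-off to the writer might look something like this: keep only the top-scoring chunks and assemble them into the context the writer model sees. The prompt wording and tuple layout are assumptions, not the product’s actual prompt.

```python
# Hypothetical final-selection step: top chunks in, writer context out.
def build_writer_context(scored_chunks: list[tuple[float, str, str]], k: int = 4) -> str:
    """scored_chunks holds (similarity, source_url, chunk_text) tuples."""
    top = sorted(scored_chunks, key=lambda t: t[0], reverse=True)[:k]
    sources = "\n\n".join(f"[{url}]\n{text}" for _, url, text in top)
    return (
        "Answer the user's question using the sources below, "
        "and cite the URLs you rely on.\n\n" + sources
    )
```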
**3 key takeaways for brands**
**1) Your job is to win retrieval, not just rankings**
If the pipeline is selecting a shortlist to read, you want to be eligible and obvious at the moment of selection.
What to do:
Own the “definition” and “comparison” queries in your category (the prompts people actually ask): “best for…”, “difference between…”, “is X good for Y…”, “how to choose…”.
Earn secondary coverage: retailers, review sites, credible community threads. The pipeline often blends multiple types of sources (not just brand sites), and third-party phrasing can be what gets selected.
The mental model: if you’re not in the candidate set, you can’t be cited, paraphrased, or recommended.
**2) Make your content “chunk-friendly” (speed + structure + “answer blocks”)**
If pages are fetched under tight latency constraints and then chunked, two things matter disproportionately: performance and information architecture.
What to do:
Improve Core Web Vitals and server responsiveness. Faster pages are easier to fetch completely, and better for users anyway.
Put the money section near the top: the clearest “answer block” for the intent (e.g., who it’s for, what problem it solves, key specs, proof points).
Use headings that mirror real prompts:
“Is it safe for sensitive skin?”
“Is it waterproof? How long does it last?”
“What certifications does it have?”
Write in extractable units: short paragraphs, bullets, mini-FAQs. Chunking and retrieval reward content that can stand alone.
If you do one thing this week: pick your top SKU page and add a tight FAQ section that answers the five most common “Rufus/ChatGPT questions” in plain language.
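To show why answer blocks and mini-FAQs travel well, here’s a small sketch that splits a page at its headings so each question-and-answer pair becomes a standalone chunk. The markup and product copy are made up for the example.

```python
# Why answer blocks survive chunking: splitting at headings turns each Q&A
# pair into a self-contained chunk. Page content here is invented.
import re

page_markdown = """
## Is it waterproof? How long does it last?
Yes, the boot is waterproof to the lace line and the membrane is rated for daily use across a full season.

## What certifications does it have?
It is certified to the S3 class of the EN ISO 20345 safety footwear standard.
"""

def chunk_by_heading(markdown: str) -> list[str]:
    """Each '## question' plus its answer becomes one standalone chunk."""
    sections = re.split(r"\n(?=## )", markdown.strip())
    return [s.strip() for s in sections if s.strip()]

for block in chunk_by_heading(page_markdown):
    print(block)
    print("---")
```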
**3) Treat structured data as your product’s machine-readable pitch**
When systems are filtering candidates, anything that reduces ambiguity helps. Structured data does that: it labels entities (Product, Brand, Offer, Review, Availability) in a way machines can reliably parse.
What to do:
Implement Product structured data (price, availability, ratings, variants where appropriate); a minimal sketch follows this list.
Validate and monitor it (errors, missing fields, broken markup).
Keep entity naming consistent everywhere (brand name, SKU naming, variant naming). Your product should not look like five different entities across your site, Amazon, and PR coverage.
Use Schema.org vocabulary where relevant.
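As a reference point, here’s roughly what minimal Product markup can look like, generated from Python and serialised as Schema.org JSON-LD. Every value below is a placeholder; map the fields to your real catalogue data and validate the output.

```python
# Minimal Product JSON-LD sketch. All values are hypothetical placeholders.
import json

product_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "TrailBeam 900 Head Torch",  # hypothetical product
    "brand": {"@type": "Brand", "name": "ExampleBrand"},
    "sku": "TB-900-BLK",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "312",
    },
    "offers": {
        "@type": "Offer",
        "price": "59.99",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
    },
}

snippet = f'<script type="application/ld+json">\n{json.dumps(product_jsonld, indent=2)}\n</script>'
print(snippet)
```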
Structured data won’t magically make you “rank in ChatGPT” on its own, but it increases the odds you’re:
- correctly understood
- correctly compared
- safely surfaced in commercial answers
**Closing thought**
A useful way to think about this is: ChatGPT isn’t replacing search; it’s becoming a new front-end for it. And the brands that win won’t be the ones who “game prompts”; they’ll be the ones who build the cleanest, fastest, most clearly structured set of pages (and third-party corroboration) for machines to retrieve and trust.