Amazon Optimisation
What Is COSMO?
COSMO (Common Sense Knowledge Generation and Serving System) is an industry‑scale system created at Amazon to automatically build common sense knowledge graphs that capture what shoppers really intend, not just what they type or click.
3 August 2025
Where conventional e‑commerce knowledge graphs store product attributes (e.g. “isA”, “colour”, “size”), COSMO goes deeper by mining user-centric intentions from two kinds of behaviour:
Search‑buy behaviour: a query a user searches for, paired with the product they then purchase.
Co‑buy behaviour: products that users frequently buy together.
By combining user behaviour with large language models and human verification, COSMO captures patterns such as “customers search for a winter coat because they want to stay warm” or “shoes bought by pregnant women are likely to be slip‑resistant”.
How COSMO Works
1. Mining Intentions from Behaviour
COSMO begins by sampling millions of real user behaviour pairs (both search‑buy and co‑buy) across 18 product categories (e.g. clothing, electronics, baby supplies) (assets.amazon.science).
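As a rough illustration of the two behaviour types, the sketch below counts search‑buy and co‑buy pairs from toy logs. The article does not describe COSMO's data format or sampling procedure, so the log layout, the sample rows and the function names here are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy logs; real inputs would be far larger and anonymised.
search_log = [
    ("winter coat", "down parka"),
    ("winter coat", "down parka"),
    ("shoes for pregnant women", "slip-resistant flats"),
]
orders = [
    ["tent", "sleeping bag", "camping stove"],
    ["tent", "sleeping bag"],
    ["phone case", "screen protector"],
]


def search_buy_pairs(log):
    """Count (query, purchased product) pairs."""
    return Counter(log)


def co_buy_pairs(baskets):
    """Count unordered product pairs that appear in the same order."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so (a, b) and (b, a) share one key.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


sb = search_buy_pairs(search_log)
cb = co_buy_pairs(orders)
print(sb.most_common(1))  # [(('winter coat', 'down parka'), 2)]
print(cb.most_common(1))  # [(('sleeping bag', 'tent'), 2)]
```

Frequent pairs like these are the kind of raw material an LLM could then be asked to explain in the next step.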
2. LLM‑Generated Common Sense
General-purpose LLMs are prompted to explain why users performed a behaviour, for example:
Query: “shoes for pregnant women”
LLM generates: “Pregnant women need slip‑resistant and supportive footwear.”
Each such explanation forms a candidate knowledge triple like (“pregnant women”, “usedFor_Audience”, “slip‑resistant shoes”) (assets.amazon.science).
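The step from behaviour pair to candidate triple can be pictured with a small sketch. The prompt wording, the `Triple` class and the `build_prompt` helper are hypothetical, since the article does not publish the real prompt or schema. The LLM call is replaced by the article's own example output, and the triple is written out by hand rather than parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A candidate knowledge triple: (head, relation, tail)."""
    head: str
    relation: str
    tail: str


def build_prompt(query: str, product: str) -> str:
    """Hypothetical prompt wording; not the prompt COSMO actually uses."""
    return (
        f'A customer searched for "{query}" and bought "{product}". '
        "In one sentence, explain why."
    )


prompt = build_prompt("shoes for pregnant women", "slip-resistant flats")
print(prompt)

# A real pipeline would send `prompt` to an LLM here. This sketch
# hard-codes the example explanation given in the article instead.
explanation = "Pregnant women need slip-resistant and supportive footwear."

# The mapping from explanation to triple is done by hand in this sketch.
candidate = Triple("pregnant women", "usedFor_Audience", "slip-resistant shoes")
print(candidate)
```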
3. Filtering & Human‑in‑the‑Loop
Since LLM outputs can be noisy or generic (“they like them”), COSMO applies:
Heuristic and similarity‑based filters to remove low‑value or generic candidates,
Human annotation (≈30,000 instructions) labelling the plausibility and typicality of examples,
Trained classifiers to retain only high‑quality common sense annotations.
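A minimal sketch of the first filtering stage follows, assuming a length heuristic plus token‑level Jaccard similarity as the similarity measure. The article does not say which heuristics or similarity function COSMO uses, so the thresholds, the generic phrase list and the function names are illustrative assumptions.

```python
import re

# Hypothetical list of generic explanations to reject.
GENERIC_PHRASES = ["they like them", "customers want this product"]


def tokens(text):
    """Lower-case word tokens, keeping hyphenated words together."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def jaccard(a, b):
    """Token-set overlap between two strings, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def keep(query, explanation, min_tokens=4, max_sim=0.6):
    """Return True if the explanation survives the toy filters."""
    # Heuristic filter: very short explanations carry little information.
    if len(tokens(explanation)) < min_tokens:
        return False
    # Similarity filter: too close to a known generic phrase.
    if any(jaccard(explanation, g) >= max_sim for g in GENERIC_PHRASES):
        return False
    # Similarity filter: merely restates the query.
    if jaccard(explanation, query) >= max_sim:
        return False
    return True


query = "shoes for pregnant women"
print(keep(query, "They like them."))  # False
print(keep(query, "Pregnant women need slip-resistant and supportive footwear."))  # True
```

Candidates that pass filters like these would still go on to human annotation and the trained classifiers described above.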
4. Instruction‑Tuned COSMO‑LM
Using this high‑quality instruction data, Amazon fine‑tunes a specialised LLM, COSMO‑LM, that generates accurate common sense knowledge at scale and runs efficiently in production.
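To make "instruction data" concrete, the sketch below turns annotated examples into instruction records and serialises them as JSON Lines. The record fields, the instruction wording and the rule of keeping only examples labelled both plausible and typical are assumptions. The article states only that annotators label plausibility and typicality.

```python
import json

# Hypothetical annotated examples with the two labels named in the article.
annotated = [
    {
        "query": "shoes for pregnant women",
        "product": "slip-resistant flats",
        "explanation": "Pregnant women need slip-resistant and supportive footwear.",
        "plausible": True,
        "typical": True,
    },
    {
        "query": "winter coat",
        "product": "down parka",
        "explanation": "They like them.",
        "plausible": True,
        "typical": False,
    },
]


def to_instruction(example):
    """Format one annotated example as an instruction-tuning record."""
    return {
        "instruction": "Explain why a customer who searched for the query bought the product.",
        "input": f"query: {example['query']}\nproduct: {example['product']}",
        "output": example["explanation"],
    }


# Keep only examples that annotators judged both plausible and typical.
train = [to_instruction(e) for e in annotated if e["plausible"] and e["typical"]]

# One JSON object per line, a common format for fine-tuning datasets.
jsonl = "\n".join(json.dumps(record) for record in train)
print(jsonl)
```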
5. Constructing the Knowledge Graph
The instruction‑tuned model is then used to generate millions of common sense triples across 15 relation types (e.g. usedFor, capableOf, isA, usedWith), producing a graph with over 6.3 million nodes and 29 million edges spanning Amazon’s 18 major product domains.
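The node, edge and relation counts quoted above are simple to compute once triples exist. Here is a toy version, assuming an in‑memory adjacency list. The relation names come from the article, while the heads and tails are made‑up examples and the storage layout is not COSMO's.

```python
from collections import defaultdict

# Toy triples: (head, relation, tail).
triples = [
    ("pregnant women", "usedFor_Audience", "slip-resistant shoes"),
    ("winter coat", "usedFor", "staying warm"),
    ("tent", "usedWith", "sleeping bag"),
    ("winter coat", "isA", "outerwear"),
]

graph = defaultdict(list)  # head -> list of (relation, tail) edges
nodes = set()
relations = set()

for head, relation, tail in triples:
    graph[head].append((relation, tail))
    nodes.update((head, tail))
    relations.add(relation)

edge_count = sum(len(edges) for edges in graph.values())

print(len(nodes), edge_count, len(relations))  # 7 4 4
print(graph["winter coat"])  # [('usedFor', 'staying warm'), ('isA', 'outerwear')]
```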
6. Deployment in Live Amazon Systems
COSMO’s graph is served in Amazon search and recommendation systems, powering:
Search relevance tuning,
Session‑based recommendation (predicting next products within a browsing session),
Search navigation enhancements.
Online A/B tests covering ~10 % of U.S. traffic show a 0.7 % lift in sales, translating into hundreds of millions of dollars of annual revenue.
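For readers unfamiliar with how such a lift is expressed, the sketch below computes a relative lift from control and treatment sales. The figures are placeholders chosen only to make the arithmetic visible and are not Amazon data. The article does not state the baseline sales, so the annual revenue figure cannot be reproduced from it.

```python
def relative_lift(control_sales: float, treatment_sales: float) -> float:
    """Relative change of treatment over control, e.g. 0.007 for 0.7 %."""
    return (treatment_sales - control_sales) / control_sales


# Hypothetical per-arm sales totals for equally sized traffic groups.
control = 1_000_000.0
treatment = 1_007_000.0

lift = relative_lift(control, treatment)
print(f"{lift:.1%}")  # 0.7%
```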
Why COSMO Matters
🎯 1. Better Understanding of Intent
COSMO bridges the gap between what users type (“pregnancy shoes”) and what they mean (safe, slip‑resistant footwear). Traditional algorithms often miss this nuance.
⚙️ 2. Scalability with Human Alignment
By combining LLM generation, human‑annotated correction, and instruction fine‑tuning, COSMO scales to millions of knowledge facts with only 30k annotations, making it efficient and aligned with human preferences (Medium, assets.amazon.science).
💡 3. Enhanced Recommendation & Search
Common sense knowledge improves downstream systems: better product relevance, smarter recommendations, and more intuitive navigation.
📈 4. Business Impact
Even a small percentage lift in conversion or click-through can translate to enormous revenue gains. At Amazon’s volume, a 0.7 % sales increase amounts to hundreds of millions of dollars in annual revenue.
🌍 5. Generalisable to Other Platforms
Though built for Amazon, the COSMO methodology - behaviour mining + LLM explanation + human feedback + instruction tuning - can inspire any e‑commerce platform aiming to understand shopper intent at scale.
Why You Should Know COSMO
For e‑commerce professionals: Understanding COSMO helps inform how to optimise product listings, not only through titles and keywords but also by aligning them with customer intent.
Final Thoughts
COSMO represents a pivotal shift in e‑commerce intelligence: from matching keywords to understanding why users behave the way they do. By mining intent from behaviour and codifying it into a vast, structured knowledge graph, Amazon has significantly upgraded its ability to interpret customer intent and serve customers better.
Source: https://www.amazon.science/