When you search for something on Google in 2026, there's a good chance an AI-generated answer appears before the first links. These AI Overviews already cover 16% of Google searches. Add ChatGPT Search, Perplexity and Gemini, and a growing share of web traffic now plays out in AI answers rather than in lists of blue links.
The problem: your site might be invisible to these AI engines. Not because your content is bad, but because you don't speak their language. That's where llms.txt comes in, a simple file that could change the game. It's the approach we followed on optimycloud.com ourselves: our llms.txt file has been live since January 2026, and we support our clients on the subject.
In short
The llms.txt file is a Markdown file placed at the root of your site that guides AI engines toward your strategic content. Combined with GEO (Generative Engine Optimization), it lets you get cited in answers from ChatGPT, Perplexity and Google AI Overviews. Fewer than 1,000 sites worldwide have deployed it. Now is the time to be one of them.
The problem: AI engines don't read your site the way Google does
Google indexes your pages one by one. It follows links, reads the HTML, understands the structure. LLMs work differently. Their context window is limited. An entire site with its navigation, scripts and CSS is too much noise for too little signal.
The result: when ChatGPT or Perplexity looks for information in your field of expertise, it lands on your homepage packed with visual components and misses your high-value content, buried three clicks deeper.
What an LLM sees
- Complex HTML with navigation, scripts, CSS
- No priority hierarchy between pages
- Content drowned in technical markup
What llms.txt provides
- Clean Markdown, readable by AI engines
- Curated and prioritized content
- Direct links to strategic pages
llms.txt: the robots.txt of artificial intelligence
The llms.txt file was proposed in September 2024 by Jeremy Howard, co-founder of fast.ai and a major figure in deep learning. The idea is simple: just as robots.txt tells search engines what they can crawl, llms.txt tells AI engines where to find the content that matters.
It's a Markdown file placed at the root of the site (yoursite.com/llms.txt) with a machine-parsable structure.
The rules of the specification
- A single, mandatory H1: the name of your site or company
- Blockquote: a one-sentence summary (optional but recommended)
- H2 sections: content categories with lists of links in the format [title](url): description
- "Optional" section: secondary resources that AI engines can skip if context is limited
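To make these rules concrete, here is a minimal Python sketch that parses an llms.txt file and checks the single-H1 rule. The function name `parse_llms_txt`, the regular expression and the sample content (Example Company, example.com) are our own illustrations, not part of the specification, and the parser is deliberately simpler than a full Markdown parser.

```python
import re

# Matches a link item such as "- [Title](https://example.com/page): description".
# The ": description" part is optional.
LINK_ITEM = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?:: (?P<desc>.+))?$"
)


def parse_llms_txt(text):
    """Return title, summary and sections; raise ValueError if the H1 is missing or repeated."""
    title = None
    summary = None
    sections = {}
    current = None  # name of the H2 section currently being filled

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# "):
            if title is not None:
                raise ValueError("more than one H1 found")
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("> ") and summary is None and current is None:
            # Only the first blockquote before any H2 is treated as the summary.
            summary = line[2:].strip()
        elif current is not None:
            match = LINK_ITEM.match(line)
            if match:
                sections[current].append(match.groupdict())

    if title is None:
        raise ValueError("missing H1 title")
    return {"title": title, "summary": summary, "sections": sections}


# Hypothetical file content used only for illustration.
SAMPLE = """\
# Example Company

> Example Company builds hypothetical widgets for small businesses.

## Services

- [Widget audit](https://example.com/audit.md): what the audit covers
- [Pricing](https://example.com/pricing.md)

## Optional

- [Company history](https://example.com/history.md): background reading
"""

if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE)
    print(parsed["title"])
    print(list(parsed["sections"]))
```

Running the parser on your own file is a quick way to catch a missing H1 or a link line that does not follow the [title](url): description format.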
llms.txt vs llms-full.txt: what's the difference?
The specification defines two complementary files. Think of the first as a table of contents and the second as the complete book.
| Aspect | llms.txt | llms-full.txt |
|---|---|---|
| Content | Index with annotated links | Full documentation embedded |
| Typical size | 5,000 - 8,000 words | 35,000+ words |
| Use | Discovery and quick navigation | Exhaustive context with no navigation |
| Analogy | Annotated table of contents | The entire book |
Projects and companies such as Next.js, Stripe and Vercel already offer both files. Next.js goes even further with per-release versions (/docs/14/llms.txt, /docs/15/llms.txt).
AI crawlers: who visits your site and why
Before talking about optimization, you need to understand who these bots are. Unlike Googlebot, which does everything, AI companies run several distinct bots with different roles.
| Bot | Operator | Role |
|---|---|---|
| GPTBot | OpenAI | Collects data for model training |
| ChatGPT-User | OpenAI | Real-time retrieval for answers |
| OAI-SearchBot | OpenAI | Indexing for ChatGPT Search |
| ClaudeBot | Anthropic | Training and indexing |
| PerplexityBot | Perplexity | Indexing for the Perplexity engine |
| Google-Extended | Google | Training Gemini (robots.txt token) |
Important point
OpenAI alone runs 4 different bots: GPTBot (training), ChatGPT-User (real-time answers), OAI-SearchBot (indexing) and ChatGPT Agent (autonomous browsing). Blocking GPTBot in your robots.txt doesn't necessarily block the others.
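If you want to see which of these bots actually show up in your server logs, a small classifier is enough to get started. The sketch below is an assumption-laden example: the mapping copies the tokens from the table above, the function name `classify_user_agent` is ours, and the sample user-agent strings are simplified illustrations rather than the exact strings the bots send. Google-Extended is left out because it is a robots.txt token, not a user agent you will see in logs.

```python
# Tokens and roles copied from the table above. "ChatGPT Agent" is omitted
# because the article does not give a user-agent token for it.
AI_BOTS = {
    "GPTBot": ("OpenAI", "training"),
    "ChatGPT-User": ("OpenAI", "real-time retrieval"),
    "OAI-SearchBot": ("OpenAI", "search indexing"),
    "ClaudeBot": ("Anthropic", "training and indexing"),
    "PerplexityBot": ("Perplexity", "search indexing"),
}


def classify_user_agent(user_agent):
    """Return (token, operator, role) if the string names a known AI bot, else None."""
    lowered = user_agent.lower()
    for token, (operator, role) in AI_BOTS.items():
        if token.lower() in lowered:
            return token, operator, role
    return None


if __name__ == "__main__":
    # Simplified, illustrative user-agent strings (not copied from real traffic).
    for ua in [
        "Mozilla/5.0 (compatible; GPTBot/1.0)",
        "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0",
    ]:
        print(ua, "->", classify_user_agent(ua))
```

Matching on a substring is a deliberate simplification: user-agent strings can be spoofed, so treat the result as an indication, not as proof of who is crawling.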
Adoption in 2026: where do we stand?
Let's be transparent: llms.txt is still in its early days. The numbers speak for themselves.
- 950+ domains with an llms.txt worldwide
- 30,000+ installs of the WordPress llms.txt plugin
- 0 AI systems that officially read it
Yes, you read that right: no major AI system officially reads llms.txt to date. Google's John Mueller confirmed it. Tests run by Semrush over 6 months detected no visits from GPTBot, ClaudeBot or PerplexityBot to the file.
So why bother? Because adoption by sites always precedes adoption by engines. It was the same for robots.txt in 1994, for Schema.org markup in 2011, for HTTPS in 2014. The companies that position themselves now will have an edge when AI engines start to use this file.
GEO: the real revolution behind llms.txt
The llms.txt file is only one piece. The overall strategy is called GEO (Generative Engine Optimization): optimizing your content to be cited in AI answers. It's the SEO of 2026. For SMEs looking to make the most of AI more broadly, we have published a practical guide to integrating generative AI in business.
Researchers from Princeton and Georgia Tech published the foundational GEO study, testing 9 optimization strategies across 10,000 queries. The results are clear: three techniques stand out sharply.
Cite reliable sources (+30 to 40% visibility)
Instead of writing "companies are increasingly using AI," write "according to McKinsey (2024), 72% of companies have adopted AI in at least one function." AI engines love verifiable sources.
Add precise statistics (+30 to 40% visibility)
Replace "many" with numbers. "The conversion rate rose by 23% in 3 months" is infinitely more citable than "results progressed significantly."
Include expert quotes (+30 to 40% visibility)
AI engines favor content with authoritative voices. A direct quote from an expert in your field adds weight to your content in generated answers.
What no longer works
Keyword stuffing, a pillar of 2010s SEO, is almost useless on generative engines. LLMs understand meaning, not repetition. Natural content rich in data beats over-optimized content.
SEO vs GEO: two different games
| Criterion | Classic SEO | GEO |
|---|---|---|
| Goal | Rank in a list of links | Get cited in an AI answer |
| Main lever | Keywords, backlinks, structure | Clarity, data, citations, accuracy |
| Visible result | Position in the SERP | Mention in the generated answer |
| Metrics | Position, CTR, impressions | Mentions, citations, sentiment |
| Conversion | Standard rate | 4.4x higher than organic traffic |
The conversion point is especially striking: visitors who arrive via AI search convert 4.4 times better than classic organic traffic. It makes sense: when ChatGPT recommends your service, the user arrives with a far higher level of trust than someone clicking a Google link. Pair this with a channel like WhatsApp to automate your customer relationship with AI, and the impact on your acquisition becomes significant.
Practical guide: setting up llms.txt and a GEO strategy
Check your robots.txt
First step: don't block AI crawlers. Make sure your robots.txt doesn't disallow GPTBot, ClaudeBot or PerplexityBot.
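One way to run this check without reading the file by eye is Python's standard `urllib.robotparser` module. The sketch below is a minimal example: the helper name `blocked_ai_crawlers` and the sample robots.txt are hypothetical, and the list of crawlers covers only the three named above.

```python
from urllib.robotparser import RobotFileParser

# The three crawlers named in this step; extend the list as needed.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_ai_crawlers(robots_txt, url="https://yoursite.com/"):
    """Return the AI crawlers that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that blocks GPTBot but lets everyone else in.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(blocked_ai_crawlers(SAMPLE_ROBOTS))  # ['GPTBot']
```

To test your live file, paste its content into the function or download it first. The sample also shows why per-bot rules matter: the GPTBot rule has no effect on ClaudeBot or PerplexityBot.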
Create your llms.txt file
Place it at the root: yoursite.com/llms.txt. Select your 10 to 20 most strategic pages. No need to list everything: the goal is to guide, not to be exhaustive.
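If your strategic pages already live in a spreadsheet or a CMS export, the file can be generated rather than written by hand. The following is a sketch under our own assumptions: the `Page` dataclass, the `render_llms_txt` function and the example.com URLs are hypothetical, and the output follows the structure described earlier (H1, blockquote summary, H2 sections with link lists, "Optional" last).

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str = ""
    section: str = "Pages"


def render_llms_txt(site_name, summary, pages):
    """Build llms.txt text: one H1, a blockquote summary, then one H2 per section."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]

    # Group pages by section, keeping the order in which sections first appear.
    sections = {}
    for page in pages:
        sections.setdefault(page.section, []).append(page)

    # Put the "Optional" section last, since it holds the skippable resources.
    ordered = [name for name in sections if name != "Optional"]
    if "Optional" in sections:
        ordered.append("Optional")

    for name in ordered:
        lines.append(f"## {name}")
        lines.append("")
        for page in sections[name]:
            suffix = f": {page.description}" if page.description else ""
            lines.append(f"- [{page.title}]({page.url}){suffix}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical pages used only for illustration.
    print(render_llms_txt(
        "Example Company",
        "Example Company builds hypothetical widgets for small businesses.",
        [
            Page("Widget audit", "https://example.com/audit.md", "what the audit covers", "Services"),
            Page("Company history", "https://example.com/history.md", section="Optional"),
        ],
    ))
```

Save the returned text as llms.txt at the root of your site. Keeping the page list in code or data also makes it easier to stay within the 10 to 20 pages suggested above.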
Enrich your content for GEO
On your strategic pages, add sourced statistics and expert quotes, and structure your content as questions and answers. Google's AI Overviews love paragraphs that directly answer a question.
Create Markdown versions of your key pages
The specification recommends providing clean .md versions of your HTML pages. For example, yoursite.com/services.html.md for a cleaned-up Markdown version of your services page.
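The naming convention is mechanical enough to script. In this sketch, the function name `markdown_url` is ours, and the handling of URLs without a file name (appending index.html.md) reflects our reading of the llms.txt proposal, so check it against the current specification before relying on it.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url):
    """Return the URL where the Markdown version of page_url is expected."""
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: URLs without a file name get index.html.md appended.
        path += "index.html.md"
    else:
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(markdown_url("https://yoursite.com/services.html"))
    # https://yoursite.com/services.html.md
```

A helper like this can feed a link checker that confirms every page listed in your llms.txt has a Markdown counterpart.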
Test your AI visibility
Ask ChatGPT, Perplexity and Gemini questions about your field of expertise. Are you cited? Are your competitors? It's the best way to measure the impact of your GEO efforts.
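If you repeat this test regularly, it helps to turn it into a number you can track. The sketch below assumes you have copied the answers into plain strings yourself; it calls no AI service. The function name `mention_rate`, the brand terms and the sample answers are all hypothetical.

```python
import re


def mention_rate(answers, brand_terms):
    """Share of answers that mention at least one brand term (case-insensitive, whole words)."""
    if not answers:
        return 0.0
    patterns = [
        re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE) for term in brand_terms
    ]
    hits = sum(1 for answer in answers if any(p.search(answer) for p in patterns))
    return hits / len(answers)


if __name__ == "__main__":
    # Made-up answers standing in for text copied from AI engines.
    saved_answers = [
        "For cloud audits, Example Company is often recommended.",
        "Popular options include Other Corp.",
        "See example.com for details.",
    ]
    rate = mention_rate(saved_answers, ["Example Company", "example.com"])
    print(f"{rate:.0%} of answers mention the brand")
```

A mention is not the same as a citation with a link, so use this figure as a rough trend indicator and keep reading the answers themselves.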
WordPress: implementation in 2 minutes
If your site runs on WordPress, the "Website LLMs.txt" plugin (30,000+ installs) automatically generates the file from your existing content. It integrates with Yoast, Rank Math and SEOPress.
robots.txt, sitemap.xml, llms.txt: who does what
| File | Role | Audience | Status |
|---|---|---|---|
| robots.txt | Crawl permission / disallow | All crawlers | Standard |
| sitemap.xml | Exhaustive inventory of pages | Search engines | Standard |
| llms.txt | Curated guide to key content | LLMs and AI agents | Emerging |
Frequently asked questions
What is the llms.txt file?
A Markdown file placed at the root of your website that gives AI engines a structured summary of your content. Proposed by Jeremy Howard (fast.ai) in September 2024, it does for LLMs what robots.txt does for search engines.
Do AI engines really read llms.txt?
Not officially yet as of March 2026. But adoption is accelerating (950+ domains, 30,000+ WordPress installs) and major tech companies are positioning themselves. Preparing now means getting ahead before it becomes a standard.
What is the difference between llms.txt and llms-full.txt?
llms.txt is a compact index (an annotated table of contents); llms-full.txt contains the full documentation (the entire book). The first runs 5,000-8,000 words, the second 35,000+.
What is GEO?
Generative Engine Optimization is the practice of optimizing content to be cited by AI engines. Unlike SEO (ranking in a list), GEO aims to be the source mentioned in an answer generated by ChatGPT, Perplexity or Google's AI Overviews.
How do I know if AI engines are citing my site?
Ask ChatGPT, Perplexity and Gemini questions related to your field. Watch whether your brand, your articles or your data are cited. Tools like Semrush are starting to offer AI visibility metrics.
Conclusion: should you get started now?
llms.txt isn't a standard yet. No AI officially reads it. But that's exactly what people said about Schema.org in 2012, HTTPS in 2015, and voice search in 2018. The sites that positioned themselves early on those shifts gained months of lead over their competitors.
The setup cost is negligible: a Markdown file at the root of your site, a few tweaks to your robots.txt, and some groundwork on the quality of your content. GEO goes further and requires rethinking how you write: less empty marketing, more verifiable data and sourced citations.
AI Overviews cover 16% of Google searches. Visitors from AI search convert 4.4 times better. The train is leaving the station. The question isn't whether AI engines will use llms.txt, but when.
Make your site visible to AI engines
AI visibility audit, llms.txt implementation, a complete GEO strategy or technical SEO optimization: I help you make your site cited, not just indexed.