Quick Answer
llms.txt is a Markdown file placed at the root of a website that provides large language models like ChatGPT, Claude, Gemini, and Perplexity with a structured summary of the site's most important content, so AI systems can understand what the site offers without crawling every page. Jeremy Howard of Answer.AI proposed the standard in 2024.
What is llms.txt?
llms.txt is a Markdown file placed at the root of a website that provides large language models (LLMs) like ChatGPT, Claude, Gemini, and Perplexity with a structured summary of the site's most important content, so AI systems can understand what the site offers without crawling every page.
The problem it solves is simple. Your website probably has thousands of pages: docs, blog posts, legal disclaimers, cookie policies, old press releases. An LLM crawling your site has no way of knowing which pages actually matter. llms.txt tells it.
Jeremy Howard, co-founder of fast.ai and Answer.AI and creator of the ULMFiT transfer learning method, proposed the llms.txt standard in September 2024. His reasoning was practical: LLMs have finite context windows. They cannot ingest an entire large site. A short, well-structured file that says "here are the 20 pages worth reading" saves the model from guessing.
Adoption is still early but growing fast. As of March 2026:
- 7.4% of Fortune 500 companies have an llms.txt file (ProGEO.ai, March 2026)
- 10.13% of nearly 300,000 surveyed domains have one (SE Ranking)
- 92.8% of the Fortune 500 have robots.txt, for comparison
We are nowhere near mainstream yet, but the trajectory is clear.
llms.txt vs robots.txt
People often confuse these because of the naming, but they do completely different things.
| File | Job | Typical consumer |
|------|-----|------------------|
| robots.txt | Controls which pages crawlers may access | All well-behaved bots |
| sitemap.xml | Lists every URL for discovery | Search engine crawlers |
| llms.txt | Curates the pages that matter most | LLMs and AI assistants |
robots.txt says "stay out of these rooms." llms.txt says "these are the rooms worth visiting." As Search Engine Land put it: "llms.txt is not robots.txt. It is a treasure map for AI."
In short: robots.txt controls access, sitemap.xml handles discovery, and llms.txt handles curation. A website that wants to be found by both traditional search engines and AI models should have all three.
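For contrast with the llms.txt format shown later in this article, here is what the access-control side might look like. The paths are illustrative, not a recommendation:

```text
# robots.txt — lives at the site root, controls crawler access
User-agent: *
Disallow: /admin/
Disallow: /cart/

# You can also address a specific AI crawler by name
User-agent: GPTBot
Disallow: /drafts/

Sitemap: https://yoursite.com/sitemap.xml
```

Note that robots.txt can only forbid; it has no way to say "this page is important," which is exactly the gap llms.txt fills.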
Why llms.txt matters for AI visibility
Gartner projects a 25% decline in organic search traffic to commercial websites by the end of 2026, as users move discovery to AI platforms. When someone asks ChatGPT for a project management recommendation or asks Perplexity how CDN caching works, the answer comes from web content the AI has already ingested. If it never ingested your content properly, you are not in the running. For more on optimizing for AI search, see our guide on generative engine optimization.
llms.txt helps with this in a few ways.
First, it filters out the noise. Without guidance, an LLM might treat your cookie policy and your product documentation as equally important. llms.txt tells it which pages actually represent your business.
Second, it works within real constraints. Even GPT-4o's 128K-token context window cannot hold every page of a large documentation site. A short summary file with links to full content lets the model load what it needs without drowning.
And honestly, there is something more basic going on: a well-structured llms.txt file signals that someone is paying attention to the site. That is not a ranking factor anyone has measured, but it makes content easier for AI to parse, and easier to parse usually means easier to cite.
I should be upfront about something, though. SE Ranking's research across nearly 300,000 domains found no statistical correlation between having an llms.txt and being cited more by AI. Google's Gary Illyes has compared llms.txt to the keywords meta tag, suggesting Google does not use it as a ranking signal. The standard is too young and adoption too thin for the data to mean much yet. This is a bet on where things are heading, not a proven tactic.
For a broader look at how AI engines evaluate your site, see our guide on what AI visibility is and why it matters.
The llms.txt format and spec
The official specification is refreshingly short. An llms.txt file is just a Markdown document with a specific structure:
# Your Site Name
> A one-paragraph summary of what your site does. Keep this concise
> and information-dense. This is what the LLM reads first.
Additional context about your project, company, or product.
This can be a few sentences expanding on the summary above.
## Section Name
- [Page Title](https://yoursite.com/page-url): A brief description of what this page covers
- [Another Page](https://yoursite.com/another-page): What this resource is about
## Another Section
- [Resource Name](https://yoursite.com/resource): Description
- [Guide Name](https://yoursite.com/guide): Description
## Optional
- [Less Critical Resource](https://yoursite.com/extra): Description of supplementary content
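The structure above is simple enough to lint with a short script. Here is a rough Python sketch, not an official validator; the checks mirror the rules described in this article:

```python
import re

def check_llms_txt(text: str) -> list[str]:
    """Return a list of structural problems found in an llms.txt body."""
    problems = []
    lines = text.splitlines()

    # The spec expects a single H1 title at the top of the file.
    h1s = [line for line in lines if line.startswith("# ")]
    if len(h1s) != 1:
        problems.append(f"expected exactly one '# ' title, found {len(h1s)}")

    # A blockquote summary is recommended, though not strictly required.
    if not any(line.startswith("> ") for line in lines):
        problems.append("no '> ' summary blockquote (recommended)")

    # Every H2 section should carry '- [Title](url): description' links.
    link = re.compile(r"^- \[.+\]\(https?://\S+\)")
    has_sections = any(line.startswith("## ") for line in lines)
    has_links = any(link.match(line) for line in lines)
    if has_sections and not has_links:
        problems.append("sections present but no '- [Title](url)' links found")

    return problems

sample = """# Example
> A short summary of the site.

## Docs
- [Guide](https://example.com/guide): What the guide covers
"""
print(check_llms_txt(sample))  # []
```

Running this against your draft before uploading catches the most common mistakes (missing title, missing summary, sections without links).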
Required and optional elements
Per the spec, only the H1 title line is required. Everything else is optional: the blockquote summary, the free-form detail paragraphs, the H2 link sections, and the ## Optional section. In practice, the summary and at least one link section are what make the file useful.
Format rules
A few things to get right:
- Name the file exactly `llms.txt`. Not `llm.txt`, not `LLMS.txt`.
- Put it at the root of your domain: `yoursite.com/llms.txt`.
- Stick to Markdown. No HTML, no XML.
- One sentence per link description. Enough to explain what the page is, nothing more.
- The `## Optional` heading is a literal keyword. It tells AI that those resources are supplementary.
llms-full.txt: the extended version
Some sites also publish an llms-full.txt alongside the standard file. Where llms.txt links to pages with short descriptions, llms-full.txt contains the actual full text of every page, flattened into one document. Cloudflare does this: a short llms.txt directory plus a massive llms-full.txt with every documentation page.
This is mostly useful for AI coding assistants that want to load an entire knowledge base into context at once. If you are not running a developer documentation site, the standard llms.txt is enough.
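If you do maintain a documentation site, generating llms-full.txt can be automated. A minimal sketch, assuming your docs live as Markdown files in a local directory; the per-page separator comment is my own convention, not part of any spec:

```python
from pathlib import Path

def build_llms_full(doc_dir: str) -> str:
    """Flatten every Markdown page under doc_dir into one llms-full.txt body."""
    parts = []
    for page in sorted(Path(doc_dir).rglob("*.md")):
        # Keep per-page provenance so readers (and models) know the source.
        parts.append(f"<!-- source: {page.name} -->")
        parts.append(page.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts) + "\n"

# Usage: Path("llms-full.txt").write_text(build_llms_full("docs/"))
```

Rerun it in your build pipeline so the flattened file never drifts out of sync with the docs.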
How to create an llms.txt file
This takes about 30 minutes, even for large sites.
Step 1: Pick your top 10-30 pages
Not everything belongs in llms.txt. Think about which pages you would send someone who asked "what does your company do?" Those are your candidates:
- Homepage or main product page
- Core documentation or guides
- Pricing page (if you have one)
- Top evergreen blog posts (not time-sensitive news)
- API reference (for developer-facing sites)
- About page (this helps AI establish who you are as an entity)
Step 2: Write the header
Start with your site name and a short blockquote summary:
# TopicPen
> TopicPen helps businesses improve their visibility in AI-powered search
> engines like ChatGPT, Gemini, and Perplexity. Tools include an AI Visibility
> Checker, word analyzer, and automated content pipeline.
Keep this under 50 words. Put the most important information first. This is what the LLM reads to decide whether your site is relevant to whatever someone just asked it.
Step 3: Organize into sections
Group your pages in whatever way makes sense for your site. SaaS companies tend to group by product. Content sites group by topic. Documentation sites group by content type (guides, API reference, tutorials). There is no single right way.
Step 4: Write link descriptions
One sentence per link. Describe what the page is, not why it is great:
## Tools
- [AI Visibility Checker](https://go.topicpen.com/en/tools/ai-visibility-checker): Free tool that analyzes how visible your website is to ChatGPT, Gemini, Perplexity, and Google AI Overviews across 9 dimensions
- [Word Analyzer](https://go.topicpen.com/en/tools/word-analyzer): Analyzes Hebrew text for readability, word frequency, and SEO metrics
Skip the marketing language. AI models do not care that your tool is "industry-leading." They care what it does.
Step 5: Upload and verify
- Save the file as `llms.txt` (plain text, UTF-8 encoding).
- Upload to your site's root directory.
- Check that it loads at `https://yoursite.com/llms.txt`.
- Validate the format with TopicPen's AI Visibility Checker, which includes llms.txt validation as part of its 9-dimension audit.
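The "check that it loads" step can be scripted. A small sketch using only the Python standard library; run it against your own domain, since `yoursite.com` here is a placeholder:

```python
from urllib.parse import urljoin
from urllib.request import urlopen

def llms_txt_url(base_url: str) -> str:
    """Resolve the canonical llms.txt location at the domain root."""
    return urljoin(base_url, "/llms.txt")

def fetch_llms_txt(base_url: str) -> str:
    """Fetch the file, raising if the server does not answer HTTP 200."""
    url = llms_txt_url(base_url)
    with urlopen(url, timeout=10) as resp:  # real network call
        if resp.status != 200:
            raise RuntimeError(f"{url} returned HTTP {resp.status}")
        return resp.read().decode("utf-8")

# text = fetch_llms_txt("https://yoursite.com")
# print(text.splitlines()[0])  # should print your '# Site Name' title line
```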
Step 6: Keep it updated
This is the part people forget. When you launch a new product, add it. When you retire a feature, remove it. A stale llms.txt that points AI to deprecated pages is arguably worse than having no file at all.
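One way to catch staleness is to audit the links in the file itself. The sketch below extracts every (title, URL) pair; from there you could issue a HEAD request per URL (for example with `urllib.request`) and flag anything that no longer returns 200:

```python
import re

# Matches Markdown links of the form [Title](https://...)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

def extract_links(llms_text: str) -> list[tuple[str, str]]:
    """Pull (title, url) pairs out of an llms.txt body for auditing."""
    return LINK_RE.findall(llms_text)

sample = "- [Pricing](https://example.com/pricing): Plans and limits"
print(extract_links(sample))  # [('Pricing', 'https://example.com/pricing')]
```

Running this monthly alongside your content calendar keeps the file honest.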
Real-world llms.txt examples
Three companies worth looking at, each with a different approach.
Cloudflare
Cloudflare went with a two-level hierarchy. Their root llms.txt at developers.cloudflare.com/llms.txt is basically a table of contents listing every product (Workers, Pages, R2, etc.). Each product then has its own llms.txt file with the actual page-level links. They also ship llms-full.txt files with the complete text of every doc page.
This works well if you have a lot of documentation. A model can scan the root file, figure out which product is relevant, and then load just that product's file.
Stripe
Stripe organizes by product category: Payments, Billing, Connect, Terminal, Identity. Each entry links to that product's documentation root with a one-line description. They put niche tools like Stripe Climate under an ## Optional section.
Straightforward and easy to maintain. If someone asks an AI about Stripe billing, the model can find the right section quickly.
Zapier
Zapier organized theirs around API endpoints and integration categories, which makes sense for their platform. Developers using Zapier are usually looking for a specific app integration, and the llms.txt is structured exactly that way.
Tools for generating and validating llms.txt
You do not have to write it by hand. Several tools can generate or validate an llms.txt file:
- SiteSpeakAI Generator (Free) — Crawls your site and generates llms.txt
- WordLift Generator (Free tier) — Generates llms.txt from site structure
- Yoast SEO (Yoast Premium) — Built-in llms.txt for WordPress
- Mintlify (Built-in) — Auto-generates for documentation sites
- TopicPen AI Visibility Checker (Free) — Validates llms.txt as part of a full AI visibility audit
The generators are useful as a starting point, but I would review the output rather than publishing it blindly. Auto-generated files tend to include too many pages and too little description.
For validation, TopicPen's AI Visibility Checker checks whether your llms.txt exists, follows the spec, and is properly structured. It also audits the rest of your AI visibility profile: crawler access, schema markup, E-E-A-T signals, citability, and platform readiness across ChatGPT, Gemini, Perplexity, and Google AI Overviews.
What to do next
The whole process is: pick your 15-20 best pages, organize them into sections, write one-sentence descriptions, upload the file. That is it.
Once it is live, run a free AI visibility check to validate the file and see how the rest of your site scores on crawler access, structured data, citability, E-E-A-T, and platform readiness.
Whether llms.txt becomes as standard as robots.txt or fades into obscurity, the 30 minutes it takes to create one is not going to be time you regret spending.
Frequently Asked Questions
What is llms.txt?
llms.txt is a Markdown file placed at the root of a website that gives AI models like ChatGPT, Claude, Gemini, and Perplexity a structured map of the site's most important content. Jeremy Howard of Answer.AI proposed the standard in 2024. Instead of crawling an entire site, the model reads this file to understand what the site offers.
Does llms.txt replace robots.txt?
No. llms.txt does not replace robots.txt. robots.txt controls crawler access (which pages bots can and cannot visit). llms.txt curates content for AI models (which pages are most important). They serve different purposes. A site that wants full AI discoverability should have both files, plus a sitemap.xml.
Will llms.txt help me rank higher in AI search?
As of early 2026, there is no proven correlation between having an llms.txt file and higher AI citation rates. SE Ranking's study of nearly 300,000 domains found no statistical link. However, the standard takes about 30 minutes to implement and makes content easier for AI to parse. The downside risk is close to zero.
How often should I update my llms.txt?
Whenever you add or remove anything significant. A monthly review works for most sites. New product launch? Add it. Deprecated feature? Remove it.
Where should I put my llms.txt file?
Root of your domain: yoursite.com/llms.txt. If you have a separate docs subdomain, you can put one there too (docs.yoursite.com/llms.txt).
Do I need llms-full.txt too?
Probably not, unless you run a developer documentation site. llms-full.txt contains the full text of every page and is mainly used by AI coding assistants. For business websites, the standard llms.txt is enough.
How do I validate my llms.txt?
TopicPen's free AI Visibility Checker validates your llms.txt format and checks your overall AI visibility score across ChatGPT, Gemini, Perplexity, and Google AI Overviews.
Ola Tzur
Digital marketing, web, and SEO expert since 2010, working with AI since 2022. Founder of TopicPen — a platform helping businesses generate more leads and sales with AI chatbots.
This article was created with AI assistance.


