TL;DR
`llms.txt` is not currently widely adopted or officially recognized by major LLM providers or AI search engines. While `llms.txt` is a proposed standard and some sites have implemented it, Google and other major players have stated that they are not using it. If it becomes a widely adopted standard, we’ll look to add it to our plugins.
What is `llms.txt`?
- It’s a proposed standard (similar to `robots.txt` for traditional search engine crawlers) designed to help LLMs and AI agents understand and process website content more effectively.
- The idea is to provide a clean, structured, and LLM-friendly version of a website’s important content, often in Markdown format, devoid of ads, navigation, and other extraneous HTML elements.
- It’s intended to give LLMs a “curated map” of high-value content, such as API documentation, return policies, or key articles, to improve the accuracy and relevance of AI-generated responses.
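For reference, here is a minimal sketch of what such a file looks like under the proposal (an H1 title, a blockquote summary, then sections of Markdown links); the site name and URLs below are placeholders. Per the proposal, the file lives at the site root, e.g. https://example.com/llms.txt:

```markdown
# Example Co

> Example Co sells widgets and maintains public API documentation.

## Docs

- [API reference](https://example.com/docs/api.md): endpoints, auth, and rate limits
- [Return policy](https://example.com/returns.md): how refunds and exchanges work

## Optional

- [Blog](https://example.com/blog.md): product announcements and changelogs
```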
Why it’s NOT a “thing” (yet):
- Lack of Official Adoption: Major LLM providers like Google (for Gemini/Bard), OpenAI (for GPTBot), and Meta (for LLaMA) have explicitly stated that they do not currently use or check for `llms.txt`. They primarily rely on existing web standards like `robots.txt` and sitemaps, along with their advanced crawling and understanding capabilities.
- Redundancy: As Google’s John Mueller has pointed out, if AI bots already download full web pages and structured data, why would they need a separate file? They can already extract the necessary information.
- Potential for Abuse: There’s a concern that `llms.txt` could be abused to show AI bots one version of content while users see another, leading to cloaking issues.
- User Experience Concerns: If LLMs were to cite `llms.txt` files directly, users clicking on those citations might land on bare text files without proper formatting or navigation, leading to a poor user experience.
- Limited Observed Benefit: Community feedback from early adopters of `llms.txt` has shown little to no activity from major AI crawlers accessing the file. While a few sites have reported minor increases in “LLM traffic,” this is often attributed to the overall growth of AI usage rather than the specific influence of `llms.txt`.
What LLMs and AI Engines do care about (and what you should focus on):
Instead of `llms.txt`, focus your efforts on these established and proven strategies for optimizing content for AI (TL;DR: be awesome at SEO):
- Structured Data: This is paramount. Use Schema.org markup (typically as JSON-LD) to explicitly define the entities, relationships, and meaning of your content. This gives LLMs clear, machine-readable information about your pages. (This is why SEO for AI has focused here; see the sketch after this list.)
- High-Quality, Well-Structured Content:
  - Clarity and Conciseness: Write clearly and avoid jargon.
  - Logical Headings and Subheadings: Use `<h1>`, `<h2>`, etc., to create a clear hierarchy.
  - Semantic HTML: Use appropriate HTML tags (`<article>`, `<section>`, `<ul>`, `<ol>`, `<p>`, etc.) to convey meaning; see the sketch after this list.
- E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness): While this isn’t a technical signal you can mark up, creating content that demonstrates these qualities is crucial for LLMs to consider your information reliable and trustworthy.
- `robots.txt` and Sitemaps: Continue to use these for traditional crawl management and for indicating what content is available on your site; see the sketch after this list.
- Technical SEO Best Practices: Fast loading times, mobile-friendliness, secure (HTTPS) sites, and a clean code base all contribute to better crawlability and understanding by any bot, including AI ones.
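To make the structured data point concrete, here is a minimal JSON-LD sketch for an article page. The headline, names, and dates are placeholders, and the properties you’d actually use depend on the content type (the full vocabulary is documented at schema.org):

```html
<!-- Placed in the page's <head> or <body>; describes the page as an Article entity -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How Widgets Work",
  "author": { "@type": "Person", "name": "Jane Doe" },
  "datePublished": "2025-01-15",
  "publisher": { "@type": "Organization", "name": "Example Co" }
}
</script>
```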
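Likewise, a minimal sketch of the heading hierarchy and semantic tags described above, with placeholder content:

```html
<!-- One <h1> per page, nested <h2> sections, and semantic containers instead of generic <div>s -->
<article>
  <h1>How Widgets Work</h1>
  <section>
    <h2>Overview</h2>
    <p>Widgets convert raw input into finished output.</p>
  </section>
  <section>
    <h2>Setup Steps</h2>
    <ol>
      <li>Install the widget.</li>
      <li>Configure the settings.</li>
    </ol>
  </section>
</article>
```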
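And crawl management for AI bots happens in the same `robots.txt` you already maintain. A minimal sketch, assuming you want to block OpenAI’s documented GPTBot crawler from one directory while allowing everything else (the path and domain are placeholders):

```text
# Block OpenAI's GPTBot from a private section
User-agent: GPTBot
Disallow: /private/

# Allow all other crawlers everywhere (empty Disallow = no restriction)
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
```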
In summary: While `llms.txt` is an interesting proposal, it has not gained traction with the major AI players. Investing time and resources into it at this point would be a misallocation. Focus on creating high-quality, semantically rich content using established structured data formats and good web development practices.