Your website has a small file called robots.txt that tells automated visitors what they are allowed to access. For years it barely mattered to a dealer. Now it can quietly decide whether AI assistants are allowed to read, and therefore recommend, your inventory. A single wrong line can remove you from AI answers without anyone noticing.
The crawlers worth knowing
Different bots do different jobs. The important distinction for a dealer is between bots that help shoppers find you right now and bots that train models on your content. You will almost always want the first group in.
Retrieval and search bots (you want these)
- Googlebot: Google Search’s crawler, and the foundation for Google’s AI Overviews.
- Bingbot: Microsoft’s search crawler, and the backbone for Copilot.
- OAI-SearchBot: OpenAI’s search crawler for ChatGPT.
- Claude-SearchBot: Anthropic’s search crawler.
- PerplexityBot and Perplexity-User: for Perplexity answers.
- Applebot: Apple’s crawler, which feeds Siri and Spotlight.
Training bots (your call)
- GPTBot (OpenAI), ClaudeBot (Anthropic), and CCBot (Common Crawl) gather content used to train models.
- Google-Extended is a robots.txt token rather than a separate crawler; it controls whether pages Google has already crawled may be used for its Gemini models.
For most dealers the right posture is to allow everything. Your inventory is public and the goal is to be cited; there is little to gain from blocking, and real risk in blocking the wrong bot.
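To see which of these bots your current file actually lets in, you can audit it with Python's standard `urllib.robotparser`. The sketch below is illustrative only: `audit_bots`, the `SAMPLE` file, and the inventory URL are made-up names, and Python's parser matches rules slightly differently from each vendor's own crawler, so treat the result as a first check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Bot tokens drawn from the lists above (not an exhaustive registry).
BOTS = [
    "Googlebot", "Bingbot", "OAI-SearchBot", "Claude-SearchBot", "ClaudeBot",
    "PerplexityBot", "Perplexity-User", "Applebot",
    "GPTBot", "Google-Extended", "CCBot",
]

def audit_bots(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each bot token to True if robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in BOTS}

# Hypothetical file: everything allowed except one training bot.
SAMPLE = """\
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /
"""

report = audit_bots(SAMPLE, "https://yourdealer.example/inventory/")
blocked = [bot for bot, allowed in report.items() if not allowed]
print(blocked)
```

In practice you would paste your live robots.txt into `SAMPLE` and check a real listing URL; an empty `blocked` list is what an allow-all posture should produce.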
The trap: blocking Googlebot blocks more than Google
Here is the mistake that bites dealers. Some crawlers do not advertise their own identity and instead inherit Googlebot’s permissions. Brave’s search crawler is one. So is the retrieval path behind some assistants. If your robots.txt disallows Googlebot, even partially, you can silently delist yourself from search engines and AI backends you did not intend to touch. The safe rule: never disallow Googlebot on a page you want shoppers to reach.
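The effect is easier to see in a small simulation. The sketch below is a simplified model, not any vendor's actual logic: `can_reach` and its `follows_googlebot` flag are hypothetical, and they stand in for a crawler that has no token of its own and refuses any page Googlebot may not crawl.

```python
from urllib.robotparser import RobotFileParser

def can_reach(robots_txt: str, crawler_token: str, url: str,
              follows_googlebot: bool = False) -> bool:
    """Return True if the crawler may fetch url under robots_txt.

    follows_googlebot is an illustrative assumption: the crawler also
    skips any page that Googlebot is not allowed to crawl.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(crawler_token, url)
    if follows_googlebot:
        allowed = allowed and parser.can_fetch("Googlebot", url)
    return allowed

# Hypothetical file: open to everyone, but Googlebot is kept out of inventory.
ROBOTS = """\
User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /inventory/
"""

url = "https://yourdealer.example/inventory/used/"
print(can_reach(ROBOTS, "SomeOtherBot", url))                          # True
print(can_reach(ROBOTS, "SomeOtherBot", url, follows_googlebot=True))  # False
```

The wildcard group says the second crawler is welcome, yet the partial Googlebot rule is what decides its access. That is why a Disallow aimed only at Googlebot can reach further than intended.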
A safe, simple robots.txt for a dealer
For a public inventory site, an allow-all posture that names the major bots is both legible and safe. The essentials look like this:
User-agent: *
Allow: /
# (optionally name the major bots for clarity)
User-agent: Googlebot
Allow: /
User-agent: GPTBot
Allow: /
Sitemap: https://yourdealer.example/sitemap.xml
Keep private areas (admin, staff logins) on separate paths or hosts rather than relying on a broad Disallow that might catch a bot you need. And always point to your sitemap so crawlers can find every listing.
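If you maintain the file by hand, generating it from a short list of bot names reduces the chance of a stray Disallow. Here is a minimal sketch, assuming an allow-all policy; `build_robots_txt` is a hypothetical helper, and the parse at the end is only a sanity check using Python's standard library (`site_maps()` needs Python 3.8 or later).

```python
from urllib.robotparser import RobotFileParser

def build_robots_txt(named_bots: list[str], sitemap_url: str) -> str:
    """Build an allow-all robots.txt that names each bot and lists the sitemap."""
    groups = ["User-agent: *\nAllow: /"]
    groups += [f"User-agent: {bot}\nAllow: /" for bot in named_bots]
    groups.append(f"Sitemap: {sitemap_url}")
    # Blank lines separate the groups, as in a hand-written file.
    return "\n\n".join(groups) + "\n"

text = build_robots_txt(
    ["Googlebot", "GPTBot"],
    "https://yourdealer.example/sitemap.xml",
)
print(text)

# Sanity check: parse the generated file back and confirm the essentials.
parser = RobotFileParser()
parser.parse(text.splitlines())
assert parser.can_fetch("Googlebot", "https://yourdealer.example/inventory/")
assert parser.site_maps() == ["https://yourdealer.example/sitemap.xml"]
```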
Allowing a crawler is not the same as being readable
A crawler you allow still has to be able to read the page once it arrives. If your inventory loads via JavaScript, a permitted-but-non-rendering bot still sees nothing. Access and readability are two separate gates, and you need both. We cover the readability side in “Can AI see your inventory?”
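A rough way to test the second gate is to look at the HTML your server returns before any JavaScript runs and see whether vehicle data is already in it. The sketch below is a heuristic under stated assumptions: it looks for 17-character VIN-shaped strings, the two HTML snippets are invented, and `vins_in_raw_html` is a hypothetical helper. A match suggests the listing is readable without rendering; it does not prove a particular crawler will index it.

```python
import re

# VIN-shaped token: 17 letters or digits, excluding I, O and Q.
# This matches the shape only; it does not validate the check digit.
VIN_PATTERN = re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b")

def vins_in_raw_html(html: str) -> set[str]:
    """Return VIN-shaped strings found in HTML as delivered, with no JS executed."""
    return set(VIN_PATTERN.findall(html))

# Invented examples: one page ships the listing in its HTML, one relies on a script.
SERVER_RENDERED = '<li class="vehicle" data-vin="1HGCM82633A004352">2003 Honda Accord</li>'
JS_ONLY = '<div id="inventory"></div><script src="/app.js"></script>'

print(vins_in_raw_html(SERVER_RENDERED))  # {'1HGCM82633A004352'}
print(vins_in_raw_html(JS_ONLY))          # set()
```

To try it on your own site, fetch a listing page with a plain HTTP client (one that does not run scripts) and pass the response body to the function.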
How VIN Index handles it
VIN Index generates a correct robots.txt for every dealer surface automatically: allow-all, every major AI and search bot named, Googlebot never disallowed, and your sitemap advertised, on both subdomains and custom domains. You do not maintain it. To see how your current site reads to these crawlers, run the free Analyzer.