Back to documentation

OpenAI user agents, explained.

Three crawlers, three jobs. Understand how ChatGPT accesses your site, how to control what it sees, and how to earn citations.

OpenAI · OfficialUpdated Apr 2026~8 min read3 user agents
TL;DR

OpenAI uses three main crawlers — GPTBot for AI training, OAI-SearchBot for ChatGPT search, and ChatGPT-User for direct user requests. Each serves a distinct purpose and can be controlled independently via robots.txt.

Overview

OpenAI uses several different user agents and web crawlers to interact with web content — from training AI models to serving search results inside ChatGPT. Understanding these agents is essential if you want to optimize for OpenAI's systems or control how your content is accessed.

How to identify the agents

3 crawlers

OpenAI identifies itself with specific user-agent strings. Verify each through a published IP range.

GB

GPTBot

AI training · crawler
Training
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot

Used for crawling content that may be used in training OpenAI's generative AI foundation models. This is the agent you block if you don't want your content used for model training.

Verifiablegptbot.json
SB

OAI-SearchBot

ChatGPT search · indexer
Search index
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot

Used for the search functionality in ChatGPT. Indexes content to return citations in ChatGPT's search features. Not used to crawl content for training models — these are separate systems.

Verifiablesearchbot.json
CU

ChatGPT-User

Direct user requests · on-demand
User-triggered
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

Used when users ask ChatGPT or a Custom GPT to visit a specific URL. Not used for automatic crawling or AI training — only when a human explicitly asks ChatGPT to read a page.

How OpenAI accesses your content

3 access patterns, mapped to the agents above.

1

Training

GPTBot crawls content that may be used to train generative AI models.
2

Search

OAI-SearchBot indexes content to provide ChatGPT search citations.
3

On-demand

ChatGPT-User fetches a URL when a human asks ChatGPT to read it.
!
Heads up. For search results, expect roughly 24 hours between a robots.txt change and OpenAI's systems reflecting it.

Control OpenAI's access

robots.txt

You control each agent independently. Common configurations below.

Allow search · block trainingRecommended

Get cited, stay out of training data.
# Allow ChatGPT search, block training User-agent: GPTBot Disallow: / User-agent: OAI-SearchBot Allow: / User-agent: ChatGPT-User Allow: /

Block all OpenAI accessRestrictive

Keeps OpenAI out of everything — no citations, no training.
# Block every OpenAI crawler User-agent: GPTBot Disallow: / User-agent: OAI-SearchBot Disallow: / User-agent: ChatGPT-User Disallow: /

Optimize content for OpenAI.

Five rules that move the needle when ChatGPT decides whether to cite you.

Use clear, well-structured HTML. Proper semantic markup — headings, lists, tables — so LLMs can parse meaning without guessing.
Don't rely solely on JavaScript. Content should render in raw HTML. If it needs JS to appear, most crawlers won't see it.
Be comprehensive and factually accurate. ChatGPT cites sources that answer the question fully. Partial answers lose to complete ones.
Include metadata and schema markup. Article, FAQPage, and HowTo structured data raise citation odds.
Decide training-vs-search intentionally. Some pages belong in ChatGPT search; others shouldn't feed training. Split the rules.

Track OpenAI visits with xSeek.

See every OpenAI visit in real time.

Monitor GPTBot, OAI-SearchBot, and ChatGPT-User. Track which URLs they hit. Watch how your content surfaces in ChatGPT responses. Get notified when patterns shift.

Start free

Frequently asked questions

OpenAI uses three: GPTBot (GPTBot/1.1) for training models, OAI-SearchBot (OAI-SearchBot/1.0) for ChatGPT search, and ChatGPT-User (ChatGPT-User/1.0) for direct user-triggered fetches. All three are verifiable via their published IP ranges: openai.com/gptbot.json, openai.com/searchbot.json, and openai.com/chatgpt-user.json.
Source: information in this guide is drawn from OpenAI's official documentation.