Robots.txt Generator Online
Build a robots.txt file for your website in seconds. Set user-agent rules, specify disallowed and allowed paths, and add your sitemap URL. No signup, runs entirely in your browser.
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
How the Robots.txt Generator Works
1. Enter your sitemap URL (optional but recommended for SEO).
2. Click Add Rule and choose a user-agent: use * for all bots, or a specific name like Googlebot.
3. Enter the paths you want to disallow or allow. Use the quick-path buttons to insert common values like /admin/ or /api/.
4. Click Copy to copy the generated robots.txt, then save it in your site's root directory as robots.txt (see the example below).
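For illustration, a file produced by the steps above might look like this (the domain and paths are placeholders):
# Allow all bots except for a few private areas
User-agent: *
Disallow: /admin/
Disallow: /api/
Sitemap: https://example.com/sitemap.xml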
robots.txt Syntax Reference
The Robots Exclusion Protocol supports a small set of directives. Each user-agent block starts with User-agent: followed by the bot name, then one or more Disallow: or Allow: lines. An empty Disallow: line means allow everything for that user-agent. Paths are case-sensitive and must start with /.
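For example, here is a small file that uses each directive mentioned above (the domain and paths are illustrative):
# Googlebot may crawl everything: an empty Disallow allows all paths
User-agent: Googlebot
Disallow:

# All other bots are kept out of /private/, except one sub-path
User-agent: *
Disallow: /private/
Allow: /private/press-kit/

Sitemap: https://example.com/sitemap.xml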
Robots.txt Best Practices
Always Include a Sitemap
The Sitemap: directive tells search engines where to find your XML sitemap, helping them discover all your pages faster.
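The directive takes an absolute URL and can appear anywhere in the file, for example (swap in your own sitemap location):
Sitemap: https://example.com/sitemap.xml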
Block AI Crawlers Selectively
Add separate rules for GPTBot, ClaudeBot, and CCBot to opt out of AI training data collection without affecting Googlebot.
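For example, these blocks opt out of the three AI crawlers named above while leaving every other bot unrestricted (a sketch; adjust to the crawlers you actually want to exclude):
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

# Everything else, including Googlebot, may still crawl the whole site
User-agent: *
Disallow: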
Disallow Admin Paths
Block /admin/, /login, and /dashboard to save crawl budget and keep internal pages out of search indices.
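A typical block for this looks like the following (the exact paths depend on your site):
User-agent: *
Disallow: /admin/
Disallow: /login
Disallow: /dashboard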
Use Wildcards Carefully
The * wildcard matches any sequence of characters, and $ anchors a pattern to the end of the URL. Disallow: /*.json$ blocks all URLs ending in .json. Test with Google Search Console after deploying.
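For example (these * and $ patterns are extensions supported by major crawlers such as Google and Bing, not part of the original protocol):
User-agent: *
# Block any URL ending in .json
Disallow: /*.json$
# Block any URL under /search/ that carries a query string
Disallow: /search/*?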
Validate After Deployment
Use Google Search Console's robots.txt tester to verify your rules work as expected before they affect crawling in production.
Don't Rely on It for Security
robots.txt is a courtesy document. Malicious bots ignore it. Protect sensitive content with authentication, not just a Disallow directive.
Frequently Asked Questions
What is a robots.txt file?
A robots.txt file is a plain-text file placed at the root of your website (e.g. https://example.com/robots.txt) that instructs web crawlers which pages or directories they are allowed or not allowed to access. It follows the Robots Exclusion Protocol.
Does robots.txt prevent pages from appearing in search results?
No. Disallowing a URL in robots.txt prevents crawlers from fetching that page, but it does not prevent the URL from appearing in search results if other pages link to it. To remove a URL from search results entirely, use a noindex meta tag or the X-Robots-Tag HTTP header on the page itself.
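For reference, the two mechanisms look like this; the meta tag goes in the page's <head>, while the header is sent in the HTTP response for the page:
<meta name="robots" content="noindex">
X-Robots-Tag: noindex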
What is the difference between Disallow and Allow?
Disallow tells the crawler not to access the specified path. Allow overrides a broader Disallow rule for a specific sub-path. For example, you can disallow /private/ while adding an Allow rule for /private/public-file.pdf within the same user-agent block.
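Written out as a robots.txt block, that example looks like this:
User-agent: *
Disallow: /private/
Allow: /private/public-file.pdf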
What does "User-agent: *" mean?
"User-agent: *" applies the rules to all web crawlers that honour the Robots Exclusion Protocol. You can add specific rules for individual bots — such as Googlebot, Bingbot, or GPTBot — by using their specific user-agent name.
Will blocking a bot with robots.txt stop it from crawling?
Well-behaved crawlers (Googlebot, Bingbot, etc.) respect robots.txt voluntarily. Malicious bots or scrapers may ignore it entirely. For security-sensitive content, use authentication or server-level access controls, not robots.txt.
Where should I place my robots.txt file?
The file must be placed at the root of your domain — https://www.example.com/robots.txt. Subdirectory placement (e.g. /blog/robots.txt) is not supported by the protocol and will be ignored.
Is my data private when using this tool?
Yes. Everything runs in your browser. No data is sent to any server. The tool generates the robots.txt content locally from your inputs.