Free Robots.txt Generator

Build a valid robots.txt file in seconds. Set crawl rules for any bot, add your sitemap, and copy the output. Free, no sign-up required.

Configure Your Robots.txt

Leave the path empty to default to "Allow: /" (allow all). Use "/" as the path to apply a rule to the entire site: "Disallow: /" blocks everything, "Allow: /" permits everything.

Strongly recommended. Helps all search engines find your sitemap.

Note: Googlebot ignores Crawl-delay. Only use if non-Google bots are overloading your server.

Explore More Free SEO Tools

More free tools to cover every aspect of your technical and on-page SEO.

How It Works

How to use this free robots.txt generator

No account needed, no sign-up required. Completely free. Configure your rules, add your sitemap, and copy a valid robots.txt file in seconds.

1

Select your user-agent

Choose which bot to target: all bots (*), Googlebot, Bingbot, or type a custom bot name. You can create multiple robots.txt blocks by running the tool again with different user-agents and combining the outputs.

2

Add Allow and Disallow rules

Add as many path rules as you need. Use Disallow to block bots from specific folders or files. Use Allow to grant access to paths within a blocked section. Leave paths blank to default to "Allow: /" which permits full access.

3

Copy and upload to your root

Copy the generated content and save it as a plain text file named exactly "robots.txt". Upload it to your website's root directory so it is accessible at yourdomain.com/robots.txt. Verify access after uploading. Free, no account required.
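A typical file produced by the steps above looks like this (the domain and paths are illustrative placeholders):

```txt
User-agent: *
Disallow: /admin/
Allow: /admin/public-stats/

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
```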

The Syntax

How robots.txt files are structured

A robots.txt file uses a simple directive-based syntax. Each block begins with a User-agent and is followed by Allow or Disallow rules that apply to that bot.

Robots.txt Structure

User-agent: [bot]
Allow: [path]
Disallow: [path]
Sitemap: [url]

Example:

User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml

Each robots.txt block starts with a User-agent line that specifies which bot the rules apply to. The wildcard * targets all bots simultaneously. Named bots like Googlebot or Bingbot can have their own separate rule blocks.

Allow and Disallow directives take a URL path as their value. The path is relative to your domain root. For example, Disallow: /private/ tells the bot not to crawl any URL starting with yourdomain.com/private/.

When both Allow and Disallow rules match a URL, the most specific rule, meaning the one with the longest matching path, wins; Google resolves an exact tie in favor of Allow. This lets you block a folder like /admin/ but allow a specific public page within it, like /admin/public-stats/.
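You can sanity-check a rule set locally with Python's standard-library urllib.robotparser. One caveat: it evaluates rules in file order (first match wins) rather than Google's longest-match rule, so when testing with it, list the Allow exception before the broader Disallow, as in this sketch:

```python
from urllib import robotparser

# Hypothetical rule set: block /admin/ but allow one public page inside it.
# Allow is listed first because urllib.robotparser uses first-match order.
rules = """\
User-agent: *
Allow: /admin/public-stats/
Disallow: /admin/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://example.com/admin/public-stats/"))  # True
print(rp.can_fetch("*", "https://example.com/admin/secret/"))        # False
```

Under Google's longest-match semantics the same two URLs resolve the same way, since /admin/public-stats/ is the longer matching path.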

The Sitemap directive is conventionally placed at the end of the file, though crawlers accept it anywhere, and it points to the full absolute URL of your XML sitemap. Multiple Sitemap directives are allowed, one per line, for sites with multiple sitemaps.

Directive Reference

Robots.txt directives and what they do

Use this table as a quick reference when writing or auditing a robots.txt file.

Directive | Value | What It Does
User-agent | * or specific bot | Specifies which bot the following rules apply to. Use * for all bots or name a specific crawler.
Allow | /path/ | Explicitly permits access to a URL path even within a Disallowed section. More specific rules take precedence.
Disallow | /private/ | Tells the specified bot not to crawl URLs matching this path. An empty Disallow means allow everything.
Sitemap | Full URL | Points crawlers to your XML sitemap. Helps bots discover all your important pages faster.
Crawl-delay | Seconds | Asks bots to wait between requests. Googlebot ignores this. Useful for some non-Google crawlers that overload servers.

Sources: Google Search Central, Robots Exclusion Protocol, 2026.

Bot Behavior Reference

How major bots respond to robots.txt rules

Not all crawlers behave the same way. Use this table to understand how the most common bots respond to your robots.txt directives.

Bot | Respects Disallow | Respects Crawl-delay | Notes
Googlebot | Yes | No | Primary Google crawler. Set rules in robots.txt and control indexing via noindex tags.
Google-Extended | Yes | No | Token Google reads to control AI training use; not a separate crawler. Block with User-agent: Google-Extended + Disallow: / if needed.
Bingbot | Yes | Yes | Microsoft Bing crawler. Respects Crawl-delay. Manageable via Bing Webmaster Tools.
AhrefsBot | Yes | No | SEO research crawler. Block if you want to reduce crawl footprint from third-party tools.
SemrushBot | Yes | No | Semrush research crawler. Block with User-agent: SemrushBot + Disallow: /.
GPTBot | Yes | No | OpenAI training crawler. Block with User-agent: GPTBot + Disallow: / to opt out of training data.
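For example, a file that opts out of the AI and SEO crawlers listed above while leaving normal search bots untouched might look like this (adjust the list to your needs):

```txt
# Opt out of AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Reduce third-party SEO crawler load
User-agent: AhrefsBot
Disallow: /

# Everything else may crawl normally
User-agent: *
Disallow:
```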

Bot behavior based on official documentation and technical SEO research, 2026.

What Kills Your Crawl Strategy

Six robots.txt mistakes that damage your SEO

These mistakes range from catastrophic to quietly corrosive. All of them are completely preventable with the right knowledge.

🚫

Blocking your entire site accidentally

Adding "Disallow: /" for all user-agents is the most catastrophic robots.txt error. It tells every search engine to stop crawling your site. Pages drop out of search results within weeks. Always double-check your rules before uploading, especially on production servers.

Always test with Google Search Console after updating
🔐

Relying on robots.txt for security

Robots.txt is a public file. Anyone can read it. Blocking a URL in robots.txt does not hide sensitive content. It actually advertises the existence of those URLs to anyone who checks your robots.txt. Use server-level authentication or access controls to protect sensitive pages.

robots.txt is public: never list sensitive paths
🔗

Blocking CSS and JavaScript files

Google needs to render your pages to evaluate their content and user experience. Blocking CSS, JavaScript, or font files in robots.txt prevents Google from seeing what your page looks like. This can hurt your Core Web Vitals scores and ranking potential significantly.

Allow all CSS, JS, and font resources for rendering
📄

Forgetting to add your sitemap

Your robots.txt file is one of the fastest ways to surface your sitemap to all search engines simultaneously. Skipping the Sitemap directive means crawlers have to find your sitemap through other means, which can delay indexing of new content by days or weeks.

Add Sitemap: URL to every robots.txt file
⚠️

Using Disallow to remove indexed pages

Disallow prevents crawling, not de-indexing. If a page is already in Google's index and you add a Disallow rule, Google may still show it in search results. To remove an indexed page, you need a noindex meta tag or the Google Search Console URL removal tool, not robots.txt.

Use noindex to de-index, not Disallow
🧪

Never testing the file after changes

A single syntax error can silently break your entire robots.txt file, causing bots to fall back to default behavior. After every change, check the file in Google Search Console's robots.txt report, verify it is accessible at /robots.txt, and watch for crawl errors in Search Console over the following 48 hours.

Test every change in Google Search Console
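A low-effort local regression check can catch rule mistakes before they reach production. This sketch uses Python's standard-library urllib.robotparser; the file content and URL lists are placeholders for your own:

```python
from urllib import robotparser

# Hypothetical robots.txt content; in practice, read your deployed file
# (e.g. the response body of https://yourdomain.com/robots.txt).
robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

# URLs that must stay crawlable / blocked after every edit.
must_allow = ["https://example.com/", "https://example.com/blog/post-1"]
must_block = ["https://example.com/admin/settings"]

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

for url in must_allow:
    assert rp.can_fetch("*", url), f"unexpectedly blocked: {url}"
for url in must_block:
    assert not rp.can_fetch("*", url), f"unexpectedly allowed: {url}"
print("robots.txt checks passed")
```

Running this in CI after each robots.txt edit turns the "always test" advice into an automatic gate.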

Optimize Your Crawl Strategy

8 tips to get more from your robots.txt and crawl budget

A good robots.txt file protects your crawl budget and gives search engines a clear path to your best content. All CommonNinja widgets are free to start.

01

Block resource-wasting bot traffic with targeted rules

Third-party SEO bots, content scrapers, and AI training crawlers can consume significant server bandwidth. Add specific User-agent blocks for bots like AhrefsBot, SemrushBot, or GPTBot with Disallow: / if you do not want them crawling your site.

02

Block admin, checkout, and login paths

Paths like /admin/, /checkout/, /cart/, /login/, and /account/ should always be blocked for all bots. These pages add no SEO value, waste crawl budget, and expose internal functionality. Block them globally with User-agent: * and Disallow rules.
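A global block for those paths looks like this:

```txt
User-agent: *
Disallow: /admin/
Disallow: /checkout/
Disallow: /cart/
Disallow: /login/
Disallow: /account/
```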

03

Use accordion FAQs to maximize indexed content

Accordion FAQ sections pack keyword-rich content into collapsible panels that search engines can fully index when JavaScript rendering is allowed. Pair open crawl rules for your main content with accordion widgets to build topical authority.

Try Accordion widget
04

Organize crawlable content with tabs

Tab widgets keep multiple content sections on a single URL that crawlers can index as one page. This reduces URL sprawl, keeps your crawl budget focused, and gives search engines more content per URL without creating unnecessary duplicate pages.

Try Tabs widget
05

Block URL parameters that create duplicate content

E-commerce filter parameters like ?sort=price, ?color=red, and ?page=2 often create thousands of near-duplicate URLs. Block these in robots.txt with Disallow: /*?* or handle them with canonical tags to prevent crawl budget waste on low-value parameterized pages.
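Two ways to express this in robots.txt (note that the * wildcard is supported by Google and Bing but not guaranteed for every crawler):

```txt
User-agent: *
# Block every parameterized URL in one rule
Disallow: /*?*

# Or, instead, target only the parameters that create duplicates
# Disallow: /*?sort=
# Disallow: /*?color=
```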

06

Build comparison pages bots can fully index

Comparison table widgets create structured, keyword-rich pages that earn featured snippets and high-intent organic traffic. Make sure comparison page URLs are in your allowed crawl paths and included in your sitemap for fast indexing.

Try Comparison Tables widget
07

Keep fresh content crawlable with dynamic feeds

Content feed widgets generate new pages regularly. Ensure feed URLs follow a consistent pattern that your robots.txt allows. Add the feed index pages to your sitemap so search engines discover new content as it is published, not weeks later.

Try Feeds widget
08

Audit robots.txt after every major site change

Site redesigns, CMS migrations, and new feature launches often introduce new URL structures. After any major change, review your robots.txt to ensure new paths are correctly allowed or blocked. A stale robots.txt that blocks new pages kills rankings before they ever start.

Technical SEO Glossary

Key crawl and indexing terms explained

Robots.txt is part of a broader technical SEO ecosystem. Here is how the key concepts relate and when each one matters.

Term | Definition | Format / Syntax | When to Use
robots.txt | A plain text file at your site root that communicates crawl permissions to bots. It follows the Robots Exclusion Protocol. Every website should have one. | yourdomain.com/robots.txt | Controlling bot crawl access across your entire site
Crawl Budget | The number of pages Googlebot will crawl on your site within a given timeframe. Sites with large crawl budgets get new content indexed faster. Wasted crawl budget on low-value pages delays indexing of important content. | Crawl capacity × Crawl demand | Optimizing which pages get crawled on large or complex sites
noindex | An HTML meta tag or HTTP header that tells search engines not to include a page in their index. Unlike Disallow, noindex lets bots crawl the page but prevents it from appearing in search results. | <meta name="robots" content="noindex"> | Removing specific pages from search results while allowing crawling
XML Sitemap | An XML file listing all the important URLs on your site. It helps search engines discover content faster, especially for large sites or pages with few internal links. | yourdomain.com/sitemap.xml | Ensuring all important pages are discoverable and crawled regularly
Crawl Depth | The number of clicks from the homepage required to reach a given page. Pages buried at depth 4 or deeper receive less crawl frequency than pages near the homepage. | Click depth from homepage | Improving crawl equity distribution by flattening site architecture

FAQ

What is a robots.txt file?
A robots.txt file is a plain text file placed at the root of your website that tells search engine crawlers which pages or sections they are allowed to crawl. It follows the Robots Exclusion Protocol and is read by bots before they crawl your site.

Does robots.txt remove pages from Google search results?
Robots.txt prevents crawling, but it does not guarantee pages will be removed from Google search results. If a blocked page has external backlinks, Google may still index it without crawling the content. To remove a page from search results entirely, use a noindex meta tag or the Google Search Console URL removal tool.

What is the difference between Disallow and noindex?
Disallow in robots.txt tells crawlers not to visit a URL. The noindex meta tag tells crawlers not to include a page in search results even if they do visit it. For pages you want completely hidden from search, use noindex. Disallow alone is not enough to guarantee de-indexing.

What happens if I block Googlebot with "Disallow: /"?
If you add "Disallow: /" for Googlebot, Google cannot crawl your site, and your pages will eventually drop from search results entirely. This is useful for staging sites or internal tools, but it should never happen accidentally on a live production website.

Should I add my sitemap to robots.txt?
Yes. Adding your sitemap URL to robots.txt is a widely recommended best practice. It helps search engine crawlers find your sitemap automatically, even if you have not submitted it manually through Google Search Console or Bing Webmaster Tools.

Is this robots.txt generator free?
Yes, completely free. No account, no sign-up, no limits. Configure your rules and copy the output instantly.

What does Crawl-delay do?
Crawl-delay tells a bot how many seconds to wait between requests. Googlebot ignores this directive, but other bots like Bingbot may respect it. Use it only if your server is struggling under crawler load; setting it too high can slow down indexing of your content.
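For bots that honor it, the directive goes inside that bot's own block; the value of 10 seconds here is purely illustrative:

```txt
User-agent: Bingbot
Crawl-delay: 10
```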
