Build a valid robots.txt file in seconds. Set crawl rules for any bot, add your sitemap, and copy the output. Free, no sign-up required.
Leave the path empty to default to "Allow: /" (allow all). Use "/" as the path to block or allow the entire site.
Strongly recommended. Helps all search engines find your sitemap.
Note: Googlebot ignores Crawl-delay. Only use if non-Google bots are overloading your server.
More free tools to cover every aspect of your technical and on-page SEO.
Paste your content and check keyword density instantly. Optimize without over-stuffing. Use Tool →
Analyze your content readability with Flesch Reading Ease and grade-level scores. Use Tool →
Check if your page title fits Google SERP limits and preview how it looks in search results. Use Tool →
Generate perfectly formatted title and meta description tags for any page. Use Tool →
Build valid FAQ JSON-LD schema markup to unlock rich results in Google search. Use Tool →
Create Open Graph and Twitter Card meta tags to control how your pages look when shared. Use Tool →
Turn any title into a clean, SEO-friendly URL slug in seconds. Use Tool →
Analyze your heading hierarchy, catch missing H1s, and fix structural SEO issues. Use Tool →
Count words, characters, sentences, paragraphs, and get reading time estimates instantly. Use Tool →
How It Works
No account needed, no sign-up required. Completely free. Configure your rules, add your sitemap, and copy a valid robots.txt file in seconds.
Choose which bot to target: all bots (*), Googlebot, Bingbot, or type a custom bot name. You can create multiple robots.txt blocks by running the tool again with different user-agents and combining the outputs.
Add as many path rules as you need. Use Disallow to block bots from specific folders or files. Use Allow to grant access to paths within a blocked section. Leave paths blank to default to "Allow: /" which permits full access.
Copy the generated output and save it as a plain text file named exactly "robots.txt". Upload it to your website's root directory so it is accessible at yourdomain.com/robots.txt, and verify access after uploading.
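For reference, a typical generated file looks like this (the /admin/ paths and domain are illustrative placeholders, not values the tool produces for you):

```text
# Save as robots.txt in your site root: https://yourdomain.com/robots.txt
User-agent: *
Disallow: /admin/
Allow: /admin/public-stats/

Sitemap: https://yourdomain.com/sitemap.xml
```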
The Syntax
A robots.txt file uses a simple directive-based syntax. Each block begins with a User-agent and is followed by Allow or Disallow rules that apply to that bot.
Robots.txt Structure
User-agent: [bot] → Allow: [path] → Disallow: [path] → Sitemap: [url]
Example (one directive per line): User-agent: * → Disallow: /admin/ → Sitemap: https://example.com/sitemap.xml
Each robots.txt block starts with a User-agent line that specifies which bot the rules apply to. The wildcard * targets all bots simultaneously. Named bots like Googlebot or Bingbot can have their own separate rule blocks.
Allow and Disallow directives take a URL path as their value. The path is relative to your domain root. For example, Disallow: /private/ tells the bot not to crawl any URL starting with yourdomain.com/private/.
When both Allow and Disallow rules match a URL, the longest (most specific) matching rule wins; if the matching rules are equally specific, the less restrictive Allow rule applies. This lets you block a folder like /admin/ but allow a specific public page within it, like /admin/public-stats/.
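The longest-match rule can be sketched in a few lines of Python. This is an illustrative model, not a full Robots Exclusion Protocol parser (no wildcard or $ support); the paths mirror the example above.

```python
def longest_match_allowed(path, rules):
    """Decide whether `path` may be crawled under longest-match
    precedence, as documented by Google: among all rules whose
    prefix matches the path, the longest prefix wins, and a tie
    goes to Allow. `rules` is a list of ("allow"|"disallow", prefix)
    tuples rather than a parsed robots.txt file.
    """
    best_kind, best_prefix = "allow", ""  # default: everything allowed
    for kind, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > len(best_prefix) or (
                len(prefix) == len(best_prefix) and kind == "allow"
            ):
                best_kind, best_prefix = kind, prefix
    return best_kind == "allow"

rules = [("disallow", "/admin/"), ("allow", "/admin/public-stats/")]
print(longest_match_allowed("/admin/settings", rules))            # False
print(longest_match_allowed("/admin/public-stats/today", rules))  # True
print(longest_match_allowed("/blog/post", rules))                 # True
```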
The Sitemap directive points to the full URL of your XML sitemap. It is independent of any User-agent block, so it can appear anywhere in the file, though it is conventionally placed at the end. Multiple Sitemap directives are allowed, one per line, for sites with multiple sitemaps.
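For example, a site that splits its sitemap (the URLs here are hypothetical) can list every part:

```text
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-products.xml
```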
Directive Reference
Use this table as a quick reference when writing or auditing a robots.txt file.
| Directive | Value | What It Does |
|---|---|---|
| User-agent | * or specific bot | Specifies which bot the following rules apply to. Use * for all bots or name a specific crawler. |
| Allow | /path/ | Explicitly permits access to a URL path even within a Disallowed section. More specific rules take precedence. |
| Disallow | /private/ | Tells the specified bot not to crawl URLs matching this path. An empty Disallow means allow everything. |
| Sitemap | Full URL | Points crawlers to your XML sitemap. Helps bots discover all your important pages faster. |
| Crawl-delay | Seconds | Asks bots to wait between requests. Googlebot ignores this. Useful for some non-Google crawlers that overload servers. |
Sources: Google Search Central, Robots Exclusion Protocol, 2026.
Bot Behavior Reference
Not all crawlers behave the same way. Use this table to understand how the most common bots respond to your robots.txt directives.
| Bot | Respects Disallow | Respects Crawl-delay | Notes |
|---|---|---|---|
| Googlebot | Yes | No | Primary Google crawler. Set rules in robots.txt and control indexing via noindex tags. |
| Google-Extended | Yes | No | Google AI training crawler. Block with User-agent: Google-Extended + Disallow: / if needed. |
| Bingbot | Yes | Yes | Microsoft Bing crawler. Respects Crawl-delay. Manageable via Bing Webmaster Tools. |
| AhrefsBot | Yes | No | SEO research crawler. Block if you want to reduce crawl footprint from third-party tools. |
| SemrushBot | Yes | No | Semrush research crawler. Block with User-agent: SemrushBot + Disallow: /. |
| GPTBot | Yes | No | OpenAI training crawler. Block with User-agent: GPTBot + Disallow: / to opt out of training data. |
Bot behavior based on official documentation and technical SEO research, 2026.
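Putting the table into practice, a file that stays open to search crawlers while opting out of the AI-training and SEO-research bots above could look like this sketch (each bot gets its own block; adjust to taste):

```text
# Allow everything for all other bots
User-agent: *
Disallow:

# Opt out of AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Reduce third-party SEO tool crawling
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /
```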
What Kills Your Crawl Strategy
These mistakes range from catastrophic to quietly corrosive. All of them are completely preventable with the right knowledge.
Adding "Disallow: /" for all user-agents is the most catastrophic robots.txt error. It tells every search engine to stop crawling your site. Pages drop out of search results within weeks. Always double-check your rules before uploading, especially on production servers.
Always test with Google Search Console after updating.
Robots.txt is a public file. Anyone can read it. Blocking a URL in robots.txt does not hide sensitive content; it actually advertises the existence of those URLs to anyone who checks the file. Use server-level authentication or access controls to protect sensitive pages.
Robots.txt is public: never list sensitive paths.
Google needs to render your pages to evaluate their content and user experience. Blocking CSS, JavaScript, or font files in robots.txt prevents Google from seeing what your page looks like, which can hurt your Core Web Vitals scores and ranking potential.
Allow all CSS, JS, and font resources for rendering.
Your robots.txt file is one of the fastest ways to surface your sitemap to all search engines simultaneously. Skipping the Sitemap directive means crawlers have to find your sitemap through other means, which can delay indexing of new content by days or weeks.
Add a Sitemap: URL line to every robots.txt file.
Disallow prevents crawling, not de-indexing. If a page is already in Google's index and you add a Disallow rule, Google may still show it in search results. To remove an indexed page, you need a noindex meta tag or the Google Search Console URL removal tool, not robots.txt.
Use noindex to de-index, not Disallow.
A single syntax error can silently break your entire robots.txt file, causing bots to fall back to default behavior. After every change, test your file with the Google Search Console robots.txt report, verify it is accessible at /robots.txt, and check for crawl errors in Search Console within 48 hours.
Test every change in Google Search Console.
Optimize Your Crawl Strategy
A good robots.txt file protects your crawl budget and gives search engines a clear path to your best content. All CommonNinja widgets are free to start.
Third-party SEO bots, content scrapers, and AI training crawlers can consume significant server bandwidth. Add specific User-agent blocks for bots like AhrefsBot, SemrushBot, or GPTBot with Disallow: / if you do not want them crawling your site.
Paths like /admin/, /checkout/, /cart/, /login/, and /account/ should always be blocked for all bots. These pages add no SEO value, waste crawl budget, and expose internal functionality. Block them globally with User-agent: * and Disallow rules.
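A minimal sketch of that global block, using the paths listed above (adjust to match your site's actual URL structure):

```text
User-agent: *
Disallow: /admin/
Disallow: /checkout/
Disallow: /cart/
Disallow: /login/
Disallow: /account/
```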
Accordion FAQ sections pack keyword-rich content into collapsible panels that search engines can fully index when JavaScript rendering is allowed. Pair open crawl rules for your main content with accordion widgets to build topical authority.
Try Accordion widget →
Tab widgets keep multiple content sections on a single URL that crawlers can index as one page. This reduces URL sprawl, keeps your crawl budget focused, and gives search engines more content per URL without creating unnecessary duplicate pages.
Try Tabs widget →
E-commerce filter parameters like ?sort=price, ?color=red, and ?page=2 often create thousands of near-duplicate URLs. Block these in robots.txt with Disallow: /*?* or handle them with canonical tags to prevent crawl budget waste on low-value parameterized pages.
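A sketch of the parameter block described above. Note that the * wildcard is supported by major crawlers such as Googlebot and Bingbot but is not part of the original standard, and this rule blocks every URL containing a query string, so make sure no important pages rely on parameters before using it:

```text
User-agent: *
Disallow: /*?*
```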
Comparison table widgets create structured, keyword-rich pages that earn featured snippets and high-intent organic traffic. Make sure comparison page URLs are in your allowed crawl paths and included in your sitemap for fast indexing.
Try Comparison Tables widget →
Content feed widgets generate new pages regularly. Ensure feed URLs follow a consistent pattern that your robots.txt allows. Add the feed index pages to your sitemap so search engines discover new content as it is published, not weeks later.
Try Feeds widget →
Site redesigns, CMS migrations, and new feature launches often introduce new URL structures. After any major change, review your robots.txt to make sure new paths are correctly allowed or blocked. A stale robots.txt that blocks new pages kills rankings before they ever start.
Technical SEO Glossary
Robots.txt is part of a broader technical SEO ecosystem. Here is how the key concepts relate and when each one matters.
| Term | Definition | Format / Syntax | When to Use |
|---|---|---|---|
| robots.txt | A plain text file at your site root that communicates crawl permissions to bots. It follows the Robots Exclusion Protocol. Every website should have one. | yourdomain.com/robots.txt | Controlling bot crawl access across your entire site |
| Crawl Budget | The number of pages Googlebot will crawl on your site within a given timeframe. Sites with large crawl budgets get new content indexed faster. Wasted crawl budget on low-value pages delays indexing of important content. | Crawl capacity × crawl demand | Optimizing which pages get crawled on large or complex sites |
| noindex | An HTML meta tag or HTTP header that tells search engines not to include a page in their index. Unlike Disallow, noindex lets bots crawl the page but prevents it from appearing in search results. | <meta name="robots" content="noindex"> | Removing specific pages from search results while allowing crawling |
| XML Sitemap | An XML file listing all the important URLs on your site. It helps search engines discover content faster, especially for large sites or pages with few internal links. | yourdomain.com/sitemap.xml | Ensuring all important pages are discoverable and crawled regularly |
| Crawl Depth | The number of clicks from the homepage required to reach a given page. Pages buried at depth 4 or deeper receive less crawl frequency than pages near the homepage. | Click depth from homepage | Improving crawl equity distribution by flattening site architecture |
From the Blog
Dig deeper into robots.txt configuration, crawl budget management, and technical SEO best practices.
In this article, we are going to discuss SEO, explain why it’s important to the success of a website, and suggest ways t...
Read article →
In this article, we will look at some important SEO factors to consider when building a website, for the purpose of incr...
Read article →
In this article, we discuss the importance of readability in SEO, highlighting its impact on search engine rankings, use...
Read article →
In this article, we explore how SEO enhances Instagram visibility, focusing on optimizing profiles with keywords, strate...
Read article →
In this article, we discuss image compression's benefits for web performance and SEO, highlighting faster load times and...
Read article →
Drive more traffic to your Squarespace website with proven SEO strategies that increase visibility and improve search ra...
Read article →