Calcorithm: every number has an answer

Robots.txt Generator

Generate a robots.txt file to control search engine crawling.

Inputs: user-agent, crawl delay (seconds), disallow paths (one per line), allow paths (one per line), sitemap URL.

Example output (robots.txt):
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/
Disallow: /api/
Sitemap: https://example.com/sitemap.xml
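
A minimal Python sketch of how a file like the one above could be assembled from those inputs. The function name build_robots_txt and its parameters are hypothetical and chosen for illustration; this is not the generator's actual implementation.

```python
from typing import Iterable, List, Optional


def _clean(paths: Iterable[str]) -> List[str]:
    """Strip whitespace and drop empty entries (e.g. blank lines from a text area)."""
    return [p.strip() for p in paths if p.strip()]


def build_robots_txt(
    user_agent: str = "*",
    allow: Iterable[str] = (),
    disallow: Iterable[str] = (),
    crawl_delay: Optional[int] = None,
    sitemap: Optional[str] = None,
) -> str:
    """Assemble a single-group robots.txt file as a string."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in _clean(allow)]
    lines += [f"Disallow: {path}" for path in _clean(disallow)]
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


print(build_robots_txt(
    allow=["/"],
    disallow=["/admin/", "/private/", "/api/"],
    sitemap="https://example.com/sitemap.xml",
))
```

With these arguments the printed text matches the example output shown above.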

What is robots.txt?

The robots.txt file is a plain text file placed at the root of a website (e.g. https://example.com/robots.txt) that tells search engine crawlers which pages or sections they may or may not access. It is part of the Robots Exclusion Protocol, introduced in 1994. Major search engines (Google, Bing, Yandex, and others) respect robots.txt by convention, though compliance is voluntary and nothing technically forces a crawler to obey it.

Robots.txt syntax and directives

User-agent: *                               ← applies to all bots (* = wildcard)
Disallow: /admin/                           ← block this path for all bots
Allow: /                                    ← allow everything else (on conflict, the longer, more specific rule wins)
Crawl-delay: 10                             ← ask bots to wait 10 seconds between requests (Google ignores this directive)
Sitemap: https://example.com/sitemap.xml    ← tell bots where the sitemap is

User-agent: Googlebot                       ← rules just for Google's crawler
Disallow: /private/                         ← block Google from /private/

User-agent: GPTBot                          ← OpenAI's training crawler
Disallow: /                                 ← block GPTBot from the whole site
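
The per-crawler groups can be checked with Python's standard urllib.robotparser module. The sketch below is illustrative only: the bot name "SomeBot" and the example URLs are assumptions. Note that this module applies rules in file order rather than by longest match (at least in current CPython releases), so the sketch leaves out the Allow: / line to avoid a misleading result.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 10

User-agent: Googlebot
Disallow: /private/

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "SomeBot" (hypothetical) has no group of its own, so the * group applies.
print(parser.can_fetch("SomeBot", "https://example.com/admin/users"))    # False
print(parser.can_fetch("SomeBot", "https://example.com/blog/"))          # True

# A crawler with its own group uses only that group, not the * group.
print(parser.can_fetch("Googlebot", "https://example.com/private/a"))    # False
print(parser.can_fetch("Googlebot", "https://example.com/admin/users"))  # True

# GPTBot is blocked from the whole site.
print(parser.can_fetch("GPTBot", "https://example.com/"))                # False

print(parser.crawl_delay("SomeBot"))  # 10
print(parser.site_maps())             # requires Python 3.8+
```

The Googlebot result is worth noticing: because Googlebot has its own group, the Disallow: /admin/ rule in the * group no longer applies to it. Rules meant for every crawler need to be repeated in each named group.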

Common paths to block

/admin/
Admin panels should never be indexed or crawled
/wp-admin/
WordPress admin area; block for all crawlers
/private/
Private user data or member-only content
/api/
API endpoints return JSON, not useful for search
/search/
Duplicate search results pages create index bloat
/*.pdf$
Large PDFs can consume crawl budget; block them only if they have no search value
/cart/
Shopping cart pages are not indexable content
/checkout/
Transaction pages have no search value and should not be crawled
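
Patterns such as /*.pdf$ rely on two special characters that major crawlers support: * matches any sequence of characters and a trailing $ anchors the end of the URL. A rough Python sketch of that matching follows. The helper names are hypothetical, and Python's own urllib.robotparser does not interpret these wildcards, so the sketch translates the pattern to a regular expression instead.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each * into "match anything".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"  # the URL must end here
    return re.compile(regex)


def matches(pattern: str, path: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    return robots_pattern_to_regex(pattern).match(path) is not None


print(matches("/*.pdf$", "/docs/report.pdf"))             # True
print(matches("/*.pdf$", "/docs/report.pdf?download=1"))  # False
print(matches("/admin/", "/admin/users"))                 # True
print(matches("/admin/", "/blog/admin/"))                 # False
```

Without the trailing $, a rule is a prefix match, which is why /admin/ also covers /admin/users.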

Critical mistakes to avoid

  • Never block CSS and JavaScript. Blocking /wp-content/ or /assets/ prevents Google from rendering your pages correctly, often causing ranking drops.
  • Robots.txt does not prevent indexing; it prevents crawling. A page blocked in robots.txt can still be indexed if other sites link to it. To prevent indexing, use <meta name="robots" content="noindex"> and leave the page crawlable so the crawler can see the tag.
  • robots.txt is publicly visible. It reveals your site structure to anyone who looks. Do not use it as a security measure β€” protected pages need proper authentication.
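  • Test before deploying. An accidental Disallow: / blocks all crawlers from your entire site. Validate the file before uploading changes, and confirm the result afterwards in Google Search Console's robots.txt report (the older standalone robots.txt Tester has been retired).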
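
Two of the mistakes above can be caught automatically before a file is uploaded. The following Python sketch is one possible pre-deploy check, not an official validator. The function name lint_robots_txt and the RENDER_PATHS tuple are hypothetical, and the paths are simply the two examples named in the list.

```python
from typing import List

# Directories whose blocking tends to break rendering (examples from the list above).
RENDER_PATHS = ("/wp-content/", "/assets/")


def lint_robots_txt(text: str) -> List[str]:
    """Return warnings for a site-wide block or blocked CSS/JS directories."""
    warnings: List[str] = []
    agents: List[str] = []
    in_rules = False  # True once the current group has at least one rule
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents = []
                in_rules = False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow":
                if value == "/" and "*" in agents:
                    warnings.append("Disallow: / under User-agent: * blocks the entire site")
                if value.rstrip("/") + "/" in RENDER_PATHS:
                    warnings.append(f"Disallow: {value} may block CSS and JavaScript")
    return warnings


print(lint_robots_txt("User-agent: *\nDisallow: /\n"))
```

A Disallow: / inside a named group such as GPTBot is not flagged, since blocking a single crawler is often intentional.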

Frequently asked questions

What is a robots.txt file?

robots.txt is a plain-text file at the root of your site that tells search-engine crawlers which paths they may or may not request. It is the first file most crawlers check.

Does robots.txt keep a page out of Google?

Not reliably. Disallow stops crawling, but a blocked URL can still be indexed if other sites link to it. To keep a page out of search results, use a noindex meta tag instead, and do not block that page in robots.txt, because the crawler must fetch the page to see the tag.

What does "Disallow: /" mean?

It tells the specified user-agent not to crawl any URL on the site. An empty Disallow (or no Disallow) allows everything.

Should I link my sitemap in robots.txt?

Yes. Adding a Sitemap: line with the full URL of your XML sitemap helps crawlers discover all your pages efficiently.

Formula / How it works

robots.txt tells search engines which pages to crawl and which to skip. User-agent: * means all bots. Disallow: /admin/ blocks that path. Allow: / allows access. Sitemap: points to your sitemap URL.
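
When an Allow and a Disallow rule both match a URL, the Robots Exclusion Protocol (RFC 9309) resolves the conflict by taking the longest matching path, with Allow winning a tie. The Python sketch below illustrates that rule for plain prefix paths only (no * or $ wildcards). The function name is_allowed and the tuple format for rules are assumptions made for the example.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, path), e.g. ("Disallow", "/admin/")


def is_allowed(rules: List[Rule], path: str) -> bool:
    """Apply longest-match precedence to one group's rules."""
    best_len = -1
    allowed = True  # no matching rule means the URL may be crawled
    for directive, rule_path in rules:
        # An empty path (e.g. "Disallow:") matches nothing.
        if not rule_path or not path.startswith(rule_path):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(rule_path) > best_len
        tie_goes_to_allow = len(rule_path) == best_len and is_allow
        if longer or tie_goes_to_allow:
            best_len = len(rule_path)
            allowed = is_allow
    return allowed


rules = [("Allow", "/"), ("Disallow", "/admin/")]
print(is_allowed(rules, "/admin/users"))  # False: /admin/ is longer than /
print(is_allowed(rules, "/blog/post"))    # True: only Allow: / matches
```

This is why Allow: / and Disallow: /admin/ can sit in the same group: the order of the lines does not matter, only the length of the matching path.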
