WordPress robots.txt File -- ALL REVEALED by RankYa

WordPress robots.txt File

Are you using a robots.txt file on your WordPress site? If not, you should. Here’s all you need to know about using a robots.txt file with the WordPress CMS.

Before Google indexes your website, it CRAWLS it using a user-agent called Googlebot.

The terms user-agent, search crawler and web crawler all mean the same thing here: a program that requests a URL to see what is on that URL.

WordPress site owners can use the robots.txt file to give Googlebot further instructions about crawling their website. This is called the Robots Exclusion Protocol.

How to Create & Use robots.txt on WordPress (Video by RankYa)

WordPress robots.txt File Location

By default, a fresh WordPress installation has NO physical robots.txt file. Instead, WordPress serves a virtual robots.txt file whenever /robots.txt is requested. Hence the confusion when you look for the file on your web server. Usually you would look here:

File Manager > public_html > robots.txt file

You won’t see anything there because there is NO physical robots.txt file. That means you need to create one, like this:

WordPress robots.txt File Example 1

User-agent: *
Disallow:

Create a plain text file named robots.txt, paste in the code above, and upload it to the folder on your web server where WordPress is installed. The empty Disallow line blocks nothing, so every crawler may crawl every URL.
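As a quick check of what Example 1 means, the sketch below feeds it to Python's standard urllib.robotparser module. The example.com URLs are placeholders, the "Disallow: /" file is a made-up contrast case that is not part of this article's examples, and this parser only approximates how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(text):
    """Parse robots.txt content given as a string (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Example 1 from the article: the empty Disallow blocks nothing.
allow_all = parse_robots("User-agent: *\nDisallow:\n")

# Hypothetical contrast case: "Disallow: /" blocks every URL.
block_all = parse_robots("User-agent: *\nDisallow: /\n")

print(allow_all.can_fetch("Googlebot", "https://example.com/wp-admin/"))  # True
print(block_all.can_fetch("Googlebot", "https://example.com/any-page/"))  # False
```

Note that urllib.robotparser does plain prefix matching, so it is fine for a simple file like Example 1 but is not a reliable judge of rules that use * or $ wildcards.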

WordPress robots.txt File Example 2

User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /wp-admin/$
Disallow: */trackback/$
Disallow: /comments/feed*
Disallow: /wp-login.php?*
Allow: /*.js*
Allow: /*.css*
Allow: /wp-admin/admin-ajax.php
Allow: /wp-admin/admin-ajax.php?action=*
Allow: /wp-content/uploads/*

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/$
Disallow: */trackback/$
Disallow: /comments/feed*
Disallow: /wp-login.php?*
Allow: /*.js*
Allow: /*.css*
Allow: /wp-admin/admin-ajax.php
Allow: /wp-admin/admin-ajax.php?action=*
Allow: /wp-content/uploads/*

#Sitemap: https://CHANGE/page-sitemap.xml
#Sitemap: https://CHANGE/post-sitemap.xml
#Sitemap: https://CHANGE/product-sitemap.xml

In the robots.txt example above, the only thing you need to change is the location of your sitemaps. Replace CHANGE with your own site address, then remove the # at the start of each Sitemap line, because # marks the line as a comment. Once the sitemap locations are correct, create a plain text file named robots.txt, paste in the code, and upload it to the folder on your web server where WordPress is installed.
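To see how the rules in Example 2 interact, here is a rough sketch of the matching behaviour Google documents for robots.txt: * matches any run of characters, a trailing $ anchors the end of the URL, the longest matching rule wins, and Allow wins a tie. The function names are invented for illustration, and this is a simplified model, not Google's actual parser.

```python
import re

# A subset of the rules from Example 2, as (directive, pattern) pairs.
RULES = [
    ("Disallow", "/wp-admin/$"),
    ("Disallow", "*/trackback/$"),
    ("Disallow", "/comments/feed*"),
    ("Disallow", "/wp-login.php?*"),
    ("Allow", "/*.js*"),
    ("Allow", "/*.css*"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape everything, then turn the escaped '*' back into "match anything".
    regex = re.escape(body).replace(r"\*", ".*")
    return re.compile("^" + regex + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if the path may be crawled under the simplified model."""
    best_length, allowed = -1, True  # no matching rule means "allowed"
    for directive, pattern in rules:
        if not pattern or not pattern_to_regex(pattern).match(path):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(pattern) > best_length
        tie_won_by_allow = len(pattern) == best_length and is_allow
        if longer or tie_won_by_allow:
            best_length, allowed = len(pattern), is_allow
    return allowed


for path in ["/wp-admin/", "/wp-admin/admin-ajax.php",
             "/blog/post/trackback/", "/wp-content/themes/x/style.css"]:
    print(path, "allowed" if is_allowed(RULES, path) else "blocked")
```

Under this model, the trailing $ means Disallow: /wp-admin/$ matches only the exact URL /wp-admin/, so a deeper URL such as /wp-admin/options.php is not blocked by it.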

Download robots.txt File Example

robots.txt (.zip format, simply unzip it)

WordPress robots.txt Not Updating?

It’s usually because you are seeing a cached version of the file. Clear your internet browser’s cache, then reload robots.txt.

WordPress robots.txt Yoast SEO Plugin?

The Yoast SEO plugin for WordPress can create a robots.txt file for you and lets you edit it from your WordPress dashboard. Paste in one of the sample files above and save the changes to the robots.txt file.

WordPress robots.txt Plugin?

Do NOT use a plugin just to manage your robots.txt file.

URL Blocked by robots.txt WordPress

This means that when Google tries to crawl a page on your website, it finds a rule in your robots.txt file that says “Hey Google, you are NOT allowed (hence DISallow) to crawl that URL”. Google Search Console then reports the URL as blocked by robots.txt.

How to Unblock URL blocked by robots.txt in WordPress

Use WordPress robots.txt File Example 1 above. Its empty Disallow line blocks nothing, so replacing your current rules with it unblocks every URL.

Illustration: the search engine crawling and indexing process (a web page under a magnifying glass next to a box representing a search engine)

To avoid most Google Search Console issues with robots.txt blocking URLs on your own WordPress site, understand that Google’s CRAWLING and INDEXING are 2 completely different processes. Using a robots.txt file does NOT tell Google to keep certain parts of your website out of its index. To control which parts Google indexes, you need to use meta tags with the noindex option, and Google must be allowed to crawl a page in order to see its noindex tag.

<?php
if (is_front_page()) : ?>
<meta name="Googlebot" content="index">
<meta name="robots" content="index">
<!--this is front-->
<?php elseif (is_search()) : ?>
<meta name="Googlebot" content="noindex">
<meta name="robots" content="noindex">
<!--searchqueries now part of the main conditional logic-->
<?php elseif (is_page('samplePageNameYouWantToNOINDEX')) : ?>
<meta name="Googlebot" content="noindex">
<meta name="robots" content="noindex">
<?php elseif (is_single('sampleBlogPostNameYouWantToNOINDEX')) : ?>
<meta name="Googlebot" content="noindex">
<meta name="robots" content="noindex">
<?php endif; ?>

To find the file to edit, log in to your web hosting and go to File Manager > public_html > wp-content > themes > YourThemeName > header.php

In short: open header.php, find the <head> section, adapt the PHP code above to your own page and post names, paste it in, and save. This controls which parts of your WordPress site Google indexes.
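Once header.php is saved, you may want to confirm that a page really outputs the noindex tag. Here is a minimal sketch using only Python's standard library, assuming you have already copied the page's HTML source into a string. RobotsMetaParser and has_noindex are names invented for this example, and the two sample strings are stand-ins for real page source.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="Googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            for part in attributes.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if any robots or Googlebot meta tag contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Stand-in HTML for a search results page and the front page.
search_page = ('<head><meta name="Googlebot" content="noindex">'
               '<meta name="robots" content="noindex"></head>')
front_page = ('<head><meta name="Googlebot" content="index">'
              '<meta name="robots" content="index"></head>')

print(has_noindex(search_page))  # True
print(has_noindex(front_page))   # False
```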

RankYa WordPress SEO Tip

  • do NOT index tag archives: use is_tag()
  • do NOT index attachment URLs: use is_attachment()
  • do NOT index paginated content: use is_single() && is_paged()