With this post about robots.txt rules I'm starting a series of articles about my own WordPress setup, where I'll share my experience and insights. I hope it will be useful for people who use my WordPress themes and are finding their way into the world of WordPress web development.
Recommended robots.txt rules have kept changing along with the search engine ranking algorithms and the behavior of robots/spiders. Some very prominent programmers recommend letting robots/spiders decide what content they want to crawl; on the other hand, there are "old school" web developers who think it is better to restrict the content that can be crawled on your site. At the moment I side with the second group. There are many considerations and I will not cover everything here; just search the web for the topic and you'll find many great articles to read 😉
The robots.txt file is kept in the root folder of your website (and that's how I keep it); in most hosting environments this is the public_html folder. If you are using a plugin for this, you would copy and paste these rules into a plugin field.
Below is the content of the robots.txt file I'm using on all my sites. As you can see, several lines are commented out. That happened pretty recently and was prompted by the fact that Google changed its ranking algorithm, making the mobile readiness of a website an important factor in achieving a higher search rank. When I was testing my site, I noticed that the robots that decide whether a site is mobile ready needed access to the theme files in order to render the website view. I'm leaving those lines in the file as a reminder that they used to be there, so that I don't add them back later 🙂
User-agent: *
Sitemap: http://mtomas.com/sitemap.xml
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /wp-admin/
# commenting these out to let Google Mobile-Friendly Test robot get theme CSS
# Disallow: /wp-content/
# Disallow: /wp-includes/
# Disallow: /wp-*
# Disallow: /*?*
Disallow: /xmlrpc.php
Disallow: *?replytocom
Disallow: /comment-page-
Disallow: /archives/
Disallow: /author/
Disallow: /category/
Disallow: /date/
Disallow: /feed/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /tag/
Disallow: /trackback/
Disallow: /privacy-policy
Disallow: /terms-of-use

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /
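If you want to sanity-check rules like these before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. It is not part of my setup, just an illustration: the RULES string is an abbreviated copy of the first group above, example.com is a placeholder domain, the theme path is made up, and is_allowed is a hypothetical helper name. Note that this parser only does simple prefix matching, so I left out the wildcard rules (such as *?replytocom); real search engine robots may interpret those differently.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the "User-agent: *" group from the file above.
# The commented-out lines are ignored by the parser, just as robots ignore them.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
# Disallow: /wp-content/
# Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /tag/
"""


def is_allowed(path, agent="*"):
    """Return True if `agent` may fetch `path` under RULES (prefix matching only)."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    # example.com is a placeholder; only the path part is matched against the rules.
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    sample_paths = (
        "/",
        "/wp-content/themes/mytheme/style.css",  # hypothetical theme file
        "/wp-admin/options.php",
        "/tag/wordpress/",
        "/xmlrpc.php",
    )
    for path in sample_paths:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

With /wp-content/ commented out, the theme stylesheet comes back as allowed, which is what the mobile-friendliness test needs; /wp-admin/, /tag/ and /xmlrpc.php stay blocked.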