Web Marketing: How to Manage Search Engine Robots

By John Arnold, Michael Becker, Marty Dickinson, Ian Lurie, Elizabeth Marsten

In web marketing, it’s important to make sure your site is visible. Checking your robots.txt file and meta robots tags is a good place to start. Errors in either one can hide your site from search engines, and from the visitors who would otherwise find you through them.

Check your robots.txt file

Go to www.yoursiteaddress.com/robots.txt. You might get a Page Not Found error. That’s okay for your purposes: A missing robots.txt file means you’re not placing any broad limits on what search engines can and can’t crawl on your site. You might also see something like this:

User-agent: *
Disallow: /blog.htm

This file is called the robots.txt file. It tells search engine crawlers, also known as robots, what to do when they visit your website. In this example, it’s telling all search engines to ignore the blog.htm page. All other pages are searchable.
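If you’d rather not eyeball the file, you can ask a script the same question a crawler asks. Here is a minimal sketch using Python’s standard urllib.robotparser module. The is_allowed helper and the about.htm page are made up for illustration, and real search engines may read edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The example rules shown above.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /blog.htm
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the rules let the given crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    site = "http://www.yoursiteaddress.com"
    # about.htm is a hypothetical page used only for comparison.
    for page in ("/blog.htm", "/about.htm"):
        print(page, is_allowed(EXAMPLE_ROBOTS_TXT, site + page))
```

To check your live file instead of a pasted copy, the same parser can load it with its set_url() and read() methods.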

If you want to become a robots.txt geek, visit the Web Robots Page. You can find out everything you ever wanted to know about guiding robots around your site.

What you don’t want to see in your robots.txt file is this:

Disallow: /

This line tells a visiting search engine crawler to ignore every page on your website. A developer may add this line when he is building the site to prevent search engines from crawling it while it’s under construction. If it is left there by accident, your site is invisible to search engines.
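A leftover site-wide block is easy to test for. The sketch below, again using Python’s standard urllib.robotparser, asks whether a crawler may fetch the home page. The site_is_blocked name is hypothetical, and the sketch assumes the Disallow line sits under a User-agent line, as in the earlier example, because a rule is expected to belong to a User-agent group.

```python
from urllib.robotparser import RobotFileParser


def site_is_blocked(robots_txt: str, agent: str = "*") -> bool:
    """Return True if the rules keep the given crawler off the home page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # If even "/" is off limits, every page on the site is off limits too.
    return not parser.can_fetch(agent, "/")


if __name__ == "__main__":
    under_construction = "User-agent: *\nDisallow: /\n"
    if site_is_blocked(under_construction):
        print("Warning: robots.txt hides the whole site from crawlers.")
```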

If your robots.txt file has any Disallow commands in it, check with your webmaster or developer to make sure that a reason exists. Disallow can be used to hide pages that change a lot, hide duplicate content, or keep search engines out of stuff you just don’t want them crawling. Just make sure that you’re not accidentally hiding content they should see.
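Before that conversation with your developer, it helps to have every Disallow rule in one list. The following is a rough sketch, not a full robots.txt parser: list_disallow_rules is a hypothetical helper that pairs each Disallow path with the crawler it applies to, and the Googlebot and /drafts/ entries in the sample are invented.

```python
def list_disallow_rules(robots_txt: str) -> list[tuple[str, str]]:
    """Return (user-agent, path) pairs for every non-empty Disallow rule."""
    rules = []
    agents = []        # user-agents named by the group currently being read
    in_rules = False   # True once the current group has listed a rule

    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:           # a User-agent line after rules starts a new group
                agents = []
                in_rules = False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            # An empty Disallow value blocks nothing, so it is skipped.
            if field == "disallow" and value:
                rules.extend((agent, value) for agent in agents)
    return rules


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow: /blog.htm\n"
        "\n"
        "User-agent: Googlebot\n"   # hypothetical second group
        "Disallow: /drafts/\n"
    )
    for agent, path in list_disallow_rules(sample):
        print(f"{agent} is kept out of {path}")
```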

Check for meta robots tags

Using the meta robots tag is another way to hide pages from search engines. Go to any page on your website and view the source code. You don’t want to see this:

<meta name="robots" content="noindex,nofollow">

If the meta robots tag is there, and it contains noindex, nofollow, or both, find out why. The noindex value tells search engines to keep the page out of their results, and nofollow tells them not to follow the links on it. There are valid reasons to use this tag: You might want a search engine to ignore the page because it’s a duplicate; you might feel the information on the page is inappropriate for search results; or the developer might have hidden the page during development. If no one can give you a reason, delete the tag.
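Viewing the source of every page gets tedious, so a short script can do the looking for you. This sketch uses Python’s standard html.parser module to pull out any noindex or nofollow values from a page’s meta robots tag. The MetaRobotsFinder class and blocking_directives function are illustrative names, and the sketch assumes you already have the page’s HTML as a string.

```python
from html.parser import HTMLParser


class MetaRobotsFinder(HTMLParser):
    """Collect the values listed in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for value in content.split(","):
            value = value.strip().lower()
            if value:
                self.directives.append(value)


def blocking_directives(html: str) -> list[str]:
    """Return the noindex/nofollow values found in the page, if any."""
    finder = MetaRobotsFinder()
    finder.feed(html)
    return [d for d in finder.directives if d in ("noindex", "nofollow")]


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    print(blocking_directives(page))
```

An empty list means the page carries neither value; anything else is worth a question to your developer.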

Do not trust your developer to remove the meta robots tag. When he builds your site, he’s working hard, writing code so fast that his fingers smoke. Forgetting to remove that one little line of code is easy when you’re facing a tough deadline and still have 4,000 lines of code to write. Remind your developer!