Web Marketing: How to Avoid Duplicate Content - dummies

By John Arnold, Michael Becker, Marty Dickinson, Ian Lurie, Elizabeth Marsten

Duplicate content should be avoided in web marketing, and you can use Google searches to detect it. Nothing hurts a search engine’s quest for relevant content as much as finding the exact same words on two different pages. Duplication is bad for these reasons:

  • Duplication was once a tactic for fooling search engines. Webmasters would take one website and replicate it across many different domains, linking them all together. Early search engines saw what looked like many relevant, interlinked sites and artificially inflated their rankings. You don’t want to risk being associated with this tactic: penalties are rare but severe.

  • Duplicate content creates confusion. If a search engine finds the same content on two pages of one site, or on pages of two different sites, it has to guess which page should be ranked.

  • Other webmasters who link to your content might link to either version. Every link to your site is a vote. If half of all webmasters link to one page on your site and the other half link to the duplicate, you’ve split your vote and lost authority.

Most of the time, duplication is an accident. It’s created by inconsistent linking, bad pagination scripts, or other sloppy website-building practices.
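
To see how inconsistent linking produces accidental duplicates, here is a minimal Python sketch that groups links pointing at what is probably the same page. The rules in `normalize` (ignoring `www.`, the scheme, a trailing slash, and `index.html`) are assumptions chosen for illustration, not rules any search engine publishes, and `example.com` is a placeholder domain.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to one assumed canonical form (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    # Assumption: /about/index.html and /about/ serve the same page.
    if path.endswith("index.html"):
        path = path[: -len("index.html")]
    # Assumption: a trailing slash does not change the page.
    path = path.rstrip("/") or "/"
    # Keep the query string, drop the fragment, force one scheme.
    return urlunsplit(("https", host, path, parts.query, ""))


def group_duplicates(links):
    """Return {canonical_url: [variants]} for pages linked more than one way."""
    groups = defaultdict(list)
    for link in links:
        groups[normalize(link)].append(link)
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


if __name__ == "__main__":
    links = [
        "http://www.example.com/about/",
        "https://example.com/about",
        "https://Example.com/about/index.html",
        "https://example.com/contact",
    ]
    for canonical, variants in group_duplicates(links).items():
        print(canonical, "is linked as", variants)
```

Any group this reports is a page that your own links address in more than one way. Whether those variants really serve identical content depends on your server, so treat the output as a list of pages to check rather than a verdict.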

Google says that it can handle duplicate content for you. It’s true — Google will often remove duplicates from its index, and it doesn’t penalize you for it. The problem is, though, that Google still has to spend time crawling all of that duplicate stuff. That wastes what’s known as crawl budget, and that hurts your SEO.

You can find duplicate content on and off your website by using search engines. To do so, follow these steps:

  1. Go to Google and type site:www.yoursiteaddress.com.

    (Type your actual website address.) Doing this shows all pages from your site that are currently in the Google index.

  2. Click through all the result pages.

    If you reach a message that reads “In order to show you the most relevant results, we have omitted some entries very similar. . .”, you have pages that Google considers duplicates.

  3. Click Repeat the Search with the Omitted Results Included.

    The additional pages are your duplicates.
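
If you check several sites regularly, the search from Step 1 can be turned into a link you open in your browser. The sketch below only builds the URL; the helper name `site_search_url` is made up for this example, and the remaining steps (clicking through the results and the omitted-results link) are still done by hand.

```python
from urllib.parse import quote_plus


def site_search_url(domain):
    """Build a Google search URL for the query site:<domain>."""
    query = f"site:{domain}"
    # quote_plus escapes the colon and any other reserved characters.
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # Replace the placeholder with your actual website address.
    print(site_search_url("www.yoursiteaddress.com"))
```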

You can also find duplicate content with a more basic search. Copy one sentence from somewhere on your website. Make sure that it’s not a sentence that others are likely to use. On Google, search for that sentence, in quotation marks.

Figure: The results page for a Google search.

Google returns all pages in its index that include those words, in that order. This lets you find other sites that have copied your writing as well as pages on your own website that are duplicates of each other.
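
You can run a rough version of the same check on your own pages before you search. The sketch below is a local approximation, not how Google matches text: it assumes you already have each page's text in a dictionary (the `pages` mapping and its paths are hypothetical) and reports every page that contains the sentence's words in the same order, ignoring case, punctuation, and spacing.

```python
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def pages_containing(sentence, pages):
    """Return the keys of pages whose text has the sentence's words in order."""
    target = words(sentence)
    if not target:
        return []
    n = len(target)
    hits = []
    for url, text in pages.items():
        page_words = words(text)
        # Slide a window of n words across the page and compare.
        if any(page_words[i:i + n] == target
               for i in range(len(page_words) - n + 1)):
            hits.append(url)
    return hits


if __name__ == "__main__":
    pages = {
        "/a": "Our hand-stitched kites fly in light wind. Buy now.",
        "/b": "Welcome! Our hand-stitched   kites fly in light wind",
        "/c": "Kites fly in wind.",
    }
    print(pages_containing("hand-stitched kites fly in light wind", pages))
```

If more than one page comes back for a sentence that should be unique, those pages are worth comparing side by side.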

No matter how hard you try, you can never create a site that is 100 percent free of duplication. The important thing is to make sure that you don’t duplicate entire pages or sets of pages within your site or, even worse, from one website to another.