
When conducting a site audit, the low-hanging fruit for most good SEO folks is identifying duplicate content issues. On-page factors and link popularity can almost always use some improvement, but they’re not usually the kinds of things that do harm to your search engine rankings. Duplicate content, on the other hand, can hurt a web page that might otherwise have a good chance at a high ranking.
The Problem
Duplicate content issues arise when multiple URLs point to the same content. Consider the following URLs:
http://www.fec.gov/index.shtml
They all point to the home page of the Federal Election Commission, but they’re all different. So why’s that a problem?
A Political Analogy
Imagine an election where there are 1,000,000 registered voters – 500,000 Republicans and 500,000 Democrats. Three candidates – one Democrat, one Republican, and one right-leaning Independent – are running in the election. If you’re a partisan Republican, the obvious problem here is that the right-leaning Independent is likely going to split your party’s vote and the Democrat will have a better chance of winning the election.
The ranking of your web pages works in a similar way. Every time an external web page links to one of your web pages, it’s like a vote for that page. To rank high in search engines, you want as many votes for your page as possible. Since search engines view each URL as a separate page, the last thing you want to do is split the votes intended for a specific page among multiple URLs. This is essentially the problem of duplicate content: multiple URLs pointing to the same content.
Don’t let duplicate content split the vote for your pages!
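To put rough numbers on the analogy, here is a deliberately simplified sketch. The inbound links, the URL variations other than index.shtml, and the competitor's count are all made up for illustration, and real rankings depend on far more than a raw link count.

```python
from collections import Counter

# Hypothetical inbound links, all meant for the same home page but
# pointing at different URL variations of it.
links_to_your_page = [
    "http://www.fec.gov/",
    "http://www.fec.gov/index.shtml",
    "http://fec.gov/",
    "http://www.fec.gov/",
    "http://fec.gov/index.shtml",
]

# Hypothetical: a competing page that is reachable under a single URL.
competitor_votes = 3

# Search engines treat each URL as a separate page, so votes are tallied per URL.
votes_per_url = Counter(links_to_your_page)
best_single_url = max(votes_per_url.values())

# If every variation redirected to one canonical URL, the votes would pool.
combined_votes = sum(votes_per_url.values())

print("Strongest single URL:", best_single_url, "votes")
print("All variations combined:", combined_votes, "votes")
print("Beats competitor while split?", best_single_url > competitor_votes)
print("Beats competitor when combined?", combined_votes > competitor_votes)
```

In this toy tally the strongest variation loses to the competitor on its own, while the pooled total wins comfortably.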
The Fix: Canonical URLs
The non-technical explanation: decide on one standard URL for every page of your site and make sure that’s the one you always use. Also make sure any non-standard versions get redirected to the standard version.
The technical explanation: the standard URL you select is called a canonical URL. That’s a good term to whip out if you’re ever hiring an SEO person, but don’t ever use that term around anyone else or you’ll sound like an ass. Trust me.
Once you have selected a canonical URL for a given page of content, redirect every variation of that URL to the canonical version with a 301 (Moved Permanently) response.
For a great technical resource on how to do this, check out Steven Hargrove’s post, How to Redirect a Web Page, The Smart Way.
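If you would rather see the logic than the server configuration, here is a minimal Python sketch of the decision a server makes. The canonical scheme and host, the list of index file names, and the function names are my own assumptions for illustration; in practice you would usually do this in your web server or framework's redirect rules.

```python
from urllib.parse import urlsplit

# Assumptions for this sketch: the canonical scheme and host, and which
# file names count as the "default document" for a directory.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.fec.gov"
INDEX_FILES = ("index.shtml", "index.html")


def canonical_url(url):
    """Return the canonical form of a URL under the assumed rules above."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Strip a trailing index file so /index.shtml becomes /
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    canonical = f"{CANONICAL_SCHEME}://{CANONICAL_HOST}{path}"
    if parts.query:
        canonical += "?" + parts.query
    return canonical


def redirect_response(requested_url):
    """Return (status, headers) for a 301 redirect, or None if already canonical."""
    target = canonical_url(requested_url)
    if target == requested_url:
        return None
    return "301 Moved Permanently", [("Location", target)]


print(redirect_response("http://www.fec.gov/index.shtml"))
print(redirect_response("http://www.fec.gov/"))
```

The key point is that the canonical URL itself is served normally, while every other variation answers with a 301 and a Location header pointing at the canonical version.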
Google’s Easier Fix
Recently Google announced a new way to specify your canonical URL, which will be supported by most major search engines. Now you can simply add a <link> tag to the head section of your web pages to specify the exact URL you wish to have indexed for a given page. To fix the FEC’s home page above, they might simply add the following:
<link rel="canonical" href="http://www.fec.gov/" />
Although I’m sure this feature does a good job of getting rid of duplicate content issues, I’m skeptical about whether it will pass link value as well as the 301 redirect does. I’ll be testing this soon, so I look forward to reporting back on its effectiveness.
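If you add the tag, it is worth checking that your pages actually declare the canonical URL you expect. Here is a small sketch using only Python's standard library; the class and function names are hypothetical, and the sample page is a stand-in rather than the FEC's real markup.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Stand-in page for illustration only.
sample_page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.fec.gov/" />'
    "</head><body>Home</body></html>"
)

print(find_canonical(sample_page))
```

Running the same check against each URL variation of a page should give you the same canonical URL every time.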
Canonical Link Element Resources
If you’re interested in finding out more about the Canonical Link Element, you’ll find great articles at Conversation Marketing and SEOMoz. Also check out the video from Matt Cutts at SMX West.
