Duplicate Content & Google: 9 Tips to Help Marketers Speed Up Spider Crawls
SES New York 2011 was almost over at 2 p.m. on March 24. The Exhibit Hall was already dismantled. Hallways were nearly empty between the remaining sessions. Then conferees walked into a crowded session on Google SEM, "Duplicate Content and Multiple Site Issues."
Those lucky enough to find a seat endured elbow and leg bumps as everyone jostled to pull out notepads and pens to copy down the tips and tricks related by panelists and Moderator Anne F. Kennedy, an international search strategist with Portland, Maine-based search engine optimization consulting firm Beyond Ink.
Kennedy notes that she is a little disappointed in the session's popularity, considering the "messes" should already be gone from marketers' sites. After all, she says, "We've been talking about it for nine years."
Panelists for the session were:
- Eric Enge, president of Boston-based Stone Temple Consulting;
- Tiffany Oberoi, software engineer on the Search Quality Team at Google; and
- Brian Ussery, chief technical officer for Atlanta-based search engine marketing and Web analytics consultancy Search Discovery Inc.
They suggested marketers take the following steps to reduce duplicate content and multiple site issues in order to improve their search engine marketing results:
1. Understand that Google doesn't penalize duplicate content; it filters it. Oberoi says she wants to dispel the penalty myth. Addressing the audience, she says "the majority of you" probably have "innocent" questions about the subject. "Most people aren't aware that they're non-maliciously duplicating content," Ussery agrees, using slides from GoogleStore.com as an example.
What duplication does do, Ussery says, is thin PageRank and affect keywords by using them repeatedly. So, in addition to providing consumers with a bad user experience, duplicating content also slows search engine spider crawls.
2. Add value. Rather than using the manufacturer's language when describing a product, Oberoi says, resellers should differentiate themselves by providing added value to the content on their sites. "Otherwise, no one's going to go there."
Ussery illustrated the point with slides showing a product description from Nike, for which Nike.com ranked first and about 43,500 results followed—mostly from shoe resellers. Panelists agreed that Google is trying to ensure that a content author's words rank highest.
3. Differentiate pages. Enge suggests those resellers could differentiate their pages, for instance, by soliciting user-generated content.
4. Watch for tool-related duplicate content, Enge says, such as a shoe listing sorted by new arrivals vs. lowest price, where each sort option produces what is essentially the same page. Collapse those into one page, he says.
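One way to spot this kind of tool-related duplication is to strip presentation-only parameters from each URL and see which URLs collapse to the same address. The following is a minimal sketch, not a method the panelists described; the parameter names in PRESENTATION_PARAMS and the shop.example URLs are hypothetical, and a real site would need its own list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names that only change how a listing is displayed.
PRESENTATION_PARAMS = {"sort", "order", "view"}

def canonical_url(url):
    """Drop presentation-only query parameters and sort the rest."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRESENTATION_PARAMS
    ]
    kept.sort()  # stable ordering so ?a=1&b=2 and ?b=2&a=1 match
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def group_duplicates(urls):
    """Map each canonical URL to the list of crawled URLs that collapse to it."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return groups

crawled = [
    "https://shop.example/shoes?sort=new_arrivals",
    "https://shop.example/shoes?sort=lowest_price",
    "https://shop.example/shoes?brand=nike&sort=lowest_price",
]
groups = group_duplicates(crawled)
```

Any group with more than one URL is a candidate for collapsing into a single page.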
5. Avoid shingle and synonym duplicates. Database-driven sites can generate nearly identical pages, which search engines view as duplicates, says Enge. Shingles are groupings of identical words that differ only by small variations, such as "Boston" in one version and "New York" in another when describing wonderful cities. Synonym duplicates swap in equivalent words, calling New York "wonderful" in one version and "fabulous" in another.
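To see how close two templated pages are, a marketer could compare their word shingles. The sketch below assumes the common definition of a shingle as an overlapping run of k consecutive words and scores overlap with Jaccard similarity; both choices are assumptions for illustration, not a description of how Google detects duplicates.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word runs in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Two pages generated from the same template with the city swapped.
boston = "Boston is a wonderful city to visit in the spring"
new_york = "New York is a wonderful city to visit in the spring"

score = jaccard(shingles(boston), shingles(new_york))
```

A score near 1.0 suggests the pages differ only in the swapped terms. What threshold counts as "too similar" is a judgment call the source does not specify.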
6. Country matters. If international marketers have duplicate content, Oberoi says, Google does notice top level domain differentiation—such as .de and .cn. However, she says content providers can further differentiate the pages by using Webmaster Tools to set geographies.
7. Find your own duplicates. To determine if there's duplicate content on a site, visit pages by typing them in the browser with and without "www," Oberoi says. If both versions load and neither redirects to the other, there are two versions of the page, and one should be collapsed into the other.
Ussery also believes marketers should check the cached versions of their pages to see whether the URL shown is their own. If it is not, Google is being redirected to another URL.
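The "www" check can also be scripted. This is a rough sketch under several assumptions: the site answers HEAD requests over HTTPS, the function names are made up for illustration, and the verdict labels are ours rather than the panelists'. It requests a path on both host variants without following redirects and reports whether both serve the page or one points to the other.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 307, 308}

def fetch_status(host, path="/"):
    """Return (status, Location header) for a HEAD request; redirects are not followed."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

def compare_host_variants(bare_host, path="/", fetch=fetch_status):
    """Classify how the bare and www hosts respond for one path."""
    variants = [bare_host, "www." + bare_host]
    results = {host: fetch(host, path) for host in variants}
    serving = [host for host in variants if results[host][0] == 200]
    if len(serving) == 2:
        # Both hosts return the page directly: two live versions.
        return "duplicate", results
    if len(serving) == 1:
        other = variants[1] if serving[0] == variants[0] else variants[0]
        status, location = results[other]
        if status in REDIRECT_CODES and location and urlsplit(location).hostname == serving[0]:
            # One host redirects to the other: already collapsed.
            return "collapsed", results
    return "inconclusive", results
```

Calling compare_host_variants("example.com", "/shoes") would hit the network; the fetch parameter exists so the logic can be exercised with canned responses first.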
To collapse duplicate pages into one page, perform a 301 redirect for the old link, Oberoi says; the "link juice" will be transferred.
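For readers who have not set one up, a 301 is simply a response with status 301 and a Location header naming the surviving URL. The sketch below uses Python's built-in http.server to send every request that arrives on a non-preferred host to the same path on the preferred one. CANONICAL_HOST and the handler name are hypothetical, and in practice this is usually configured in the web server itself rather than in application code.

```python
from http.server import BaseHTTPRequestHandler

CANONICAL_HOST = "www.example.com"  # hypothetical preferred host

def redirect_location(host_header, path):
    """Return the URL to 301 to, or None if the request is already on the preferred host."""
    host = (host_header or "").split(":")[0].lower()
    if host == CANONICAL_HOST:
        return None
    return "https://" + CANONICAL_HOST + path

class CanonicalHostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_location(self.headers.get("Host"), self.path)
        if target is not None:
            # Permanent redirect: the old URL is collapsed into the new one.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
            return
        body = b"canonical page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To try it locally:
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), CanonicalHostHandler).serve_forever()
```

Keeping the path and query string in the redirect target sends each old page to its own replacement instead of to a single catch-all page.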
8. Use canonical tags. If there's no access to the server, use rel="canonical" tags instead of redirects, suggests Oberoi. The canonical page is the preferred version, often the original version, of a page. But avoid creating an infinite loop, pointing to a blank page, or redirecting all old pages to the homepage, she says. Performing these redirects will increase crawl efficiency by 56 percent to 60 percent.
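A quick audit can confirm which pages declare a canonical URL and flag the homepage catch-all Oberoi warns against. The sketch below is an assumption-laden illustration: it parses saved HTML with Python's standard html.parser, looks for link elements whose rel includes "canonical", and uses made-up function names and example URLs.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])

def find_canonical(html_text):
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.hrefs

def pages_pointing_home(pages, homepage_url):
    """Return URLs (other than the homepage) whose canonical is the homepage."""
    return [
        url
        for url, html_text in pages.items()
        if url != homepage_url and homepage_url in find_canonical(html_text)
    ]

pages = {
    "https://www.example.com/shoes?sort=new": (
        '<head><link rel="canonical" href="https://www.example.com/shoes"></head>'
    ),
    "https://www.example.com/old-page": (
        '<head><link rel="canonical" href="https://www.example.com/"></head>'
    ),
}
suspect = pages_pointing_home(pages, "https://www.example.com/")
```

Pages in the suspect list deserve a manual look; a page with more than one canonical link, or none, is also worth reviewing.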
9. Wait to syndicate. When syndicating content, Ussery suggests waiting three to six hours after placing it on the proprietary site before sending it out, or slightly changing the wording of the distributed content. Alternatively, he says, focus on editor buy-in: an editor who writes an original story doesn't create a duplicate content issue.