Google Dupe Content

Google’s dupe content filter isn’t as cut and dried as people would have you believe. I have seen it happen so many times: an article is released and then syndicated around the world a day or two later, but because Google’s spiders didn’t see the original article first (usually bad SEO ;) ), a syndicated site gets the golden original-content flag and the rest get the:

“repeat the search with the omitted results”

A good example is a Wired article on “The internet’s two largest search engines are begging to get hacked.” This article was written on July 02, 2005, but Google indexed it on July 04, 2005.

Then www.crime-research.org syndicated the story on July 03, 2005, and gave credit to Wired along with a nice little link.
But, and this is a big but, Google indexed them the same day ... way to go www.crime-research.org :)

So let’s see what happens ...

Everything looks OK: normal SERPs

But what about: quoted SERPs

And finally: the omitted SERPs

So it’s not just about getting content out there, it’s about getting Google to see it first ;)

DaveN

10 Comments

  1. Brian Turner | July 5th 2005 @ 9:59 pm

    Looks like IranKicks is now winning. :)

  2. Navito UK | July 5th 2005 @ 11:36 pm

    That’s why I run a blog with my site and try to write a new blog post and link to any new site pages I create.

  3. Kim @Neteffects | July 6th 2005 @ 4:02 am

    Why is IranKicks the leading ranking page now?

    It probably wasn’t first?

  4. Dan Thies | July 6th 2005 @ 4:21 am

    Dave, I didn’t know this was news until I saw it on Aaron’s blog… the stuff you think is common knowledge sometimes, eh? Anyway, nice illustration of the problem. Google’s dupe filter is no doubt a product of the same design mentality that created their scheme for handling 302 redirects.

  5. Johann | July 6th 2005 @ 5:26 pm

    That’s an excellent point, Kim! There’s more to it than being the first to be crawled.

  6. Toren | July 7th 2005 @ 9:57 pm

    Very interesting. If you look at the inbound links to the root URL of Irankicks vs. Crime-research, Irankicks has roughly 40,000 more links. My first thought is that it is another case of the rich getting richer in Google based on links. If Crime-research was first and its on-page SEO elements appear better, why would Irankicks usurp it in Google?

    However Wired, which of course should be the top listing, has millions more inbound links, so if it was merely a case of the highest ranking site getting the most credit, Wired should reign supreme. My question is: is Wired suffering from an internal dupe content penalty? They have three separate caches of the article in Google’s SERP. Their on-page SEO is not good, but neither is Irankicks’. Any ideas?

  7. Sally | July 13th 2005 @ 8:35 pm

    I must be on overload. I’ve read duplicate content is okay and I’ve read it’s not okay. Are there specific ‘ifs’ to the okay and not okay situations?

  8. caveman | July 14th 2005 @ 6:54 pm

    What Toren said. We’ve got a number of examples we’re looking at right now that involve this sort of filtering, not all related to articles.

    It’s not just about what gets found first. It’s almost more about backlinks and certain kinds of backlinks, IMO.

    And as revealed in this example, G’s selection for “winning site” can change in short order.

    They’ve got to get this sorted out. Loads of quality pages are getting hit by this lately.

  9. JetteroHeller | September 27th 2006 @ 7:47 pm

    I’ve definitely found empirically that Google will dupe out content for static pages, but I definitely haven’t gotten the beat yet for syndicated content. In their Enterprise search appliance documentation, they say that the duplicate filter in the search results will dupe out two results that have the same snippet. I’ve been able to see this empirically by doing one search, seeing a page missing in the SERPs, and then adding “&filter=0” to the query string and getting the page back. In this case, the pages that were similar had totally DIFFERENT content, but had identical meta description and meta keyword tags. So, Google duped them out.

  10. David Kirk | May 29th 2007 @ 9:18 pm

    When a measure becomes a target, it ceases to be a good measure. Isn’t there an argument to say that, if the aim of search engines with their great yet imperfect algorithms is to reward fresh, relevant, useful content, then the best long-term strategy would be to continue to write and publish fresh, relevant and useful content?

    No quick wins, and with less of the alchemy involved SEO companies wouldn’t get as many customers, but why isn’t this the best advice for long term traffic and search engine ranking?
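
For anyone who wants to repeat the “&filter=0” check from comment 9, here is a rough Python sketch. It only builds the normal, quoted and unfiltered search URLs and compares two lists of result URLs that you have collected yourself; it does not fetch anything from Google. The function names `search_url` and `omitted_results` and the base URL constant are made up for illustration, and whether “filter=0” still behaves the way the comment describes is an assumption.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration; the post itself does not give one.
GOOGLE_SEARCH = "https://www.google.com/search"


def search_url(query, quoted=False, unfiltered=False):
    """Build a search URL for the query.

    quoted=True wraps the query in double quotes (the "quoted SERPs" case).
    unfiltered=True appends filter=0, the parameter described in comment 9
    for repeating the search with the omitted results included.
    """
    q = f'"{query}"' if quoted else query
    params = {"q": q}
    if unfiltered:
        params["filter"] = "0"
    return GOOGLE_SEARCH + "?" + urlencode(params)


def omitted_results(filtered, unfiltered):
    """Return URLs that appear only in the unfiltered result list.

    Both arguments are lists of result URLs gathered by hand or by
    whatever means you prefer; order of the unfiltered list is kept.
    """
    seen = set(filtered)
    return [url for url in unfiltered if url not in seen]
```

Run the same phrase through `search_url` with and without `unfiltered=True`, note the result URLs from each page, and pass both lists to `omitted_results` to see which pages the filter dropped.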
