Google Dupe Content
- 7th Oct 2005
I have seen a lot of sites get washed recently due to Google’s dupe content filters. The most common cause is mod_rewrite, yeah, go figure: the old dynamic URLs and the new rewritten URLs both serve the same page. The number of times I have told people to use mod_rewrite.. was it bad advice? Well, it didn’t used to be. So now I have added this statement to my “use mod_rewrite” advice..
OK Mr Matt Cutts.. first, what you need to do is get rid of those URLs that look like this:
http://www.mattcutts.com/blog/?p=16 and replace them with
http://www.mattcutts.com/blog/up-up-up-up-up/ .. but Matt.. what you must also do is have a robots.txt, and in that robots.txt file add these lines so the old URLs stop getting crawled..
User-agent: *
Disallow: /blog/?p
I think that’s right … I have never really wanted to stop the SEs’ spiders getting in before.. Also watch out for “Print Article” links if they go to the same content with a different CSS.. you could get into trouble with that too.. oh hum
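A quick way to sanity-check a Disallow line is to remember that robots.txt rules are matched as a prefix of the URL path, starting from the site root, so the prefix has to include the /blog/ directory to catch the ?p= URLs above. The sketch below is a simplified illustration only: the function names are made up for this example, and it ignores wildcards, Allow lines and per-bot sections that real crawlers may also apply.

```python
from urllib.parse import urlsplit


def path_with_query(url):
    """Return the path plus query string, the part robots.txt prefixes are compared against."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def is_disallowed(url, disallow_prefixes):
    """True if any non-empty Disallow prefix matches the start of the URL path (hypothetical helper)."""
    target = path_with_query(url)
    return any(prefix and target.startswith(prefix) for prefix in disallow_prefixes)


old_url = "http://www.mattcutts.com/blog/?p=16"
new_url = "http://www.mattcutts.com/blog/up-up-up-up-up/"

# A prefix of "/?p" only covers URLs at the site root, not under /blog/.
print(is_disallowed(old_url, ["/?p"]))       # False
# Including the directory blocks the old dynamic URL...
print(is_disallowed(old_url, ["/blog/?p"]))  # True
# ...while the rewritten URL stays crawlable.
print(is_disallowed(new_url, ["/blog/?p"]))  # False
```

The same check can be run against the “Print Article” URLs to see whether a given prefix would keep the spiders out of those too.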
DaveN

5 Comments
I *want* to see how well engines handle it, and how to fix it when I notice a problem. :)
IMHO, this is a huge problem right now. There are TONS of scraper sites out there, most of the time targeting the top SERPs, so it may be that the real solution is to cloak pages and give GBot and yBot and msBot different content than the one we give the user …
> so it may be that the real solution is to cloak pages and give GBot and yBot and msBot different content than the one we give the user …
That’s called “personalization” and is very common these days :)
Matt! Throw me a bone on fixing OHWY.com, which appears to have been killed by (unfair!) dupe filtering, or we’ll have to hire … DAVE!
JoeDuck/Joe Hunkins
Go get ’em Dave ;)
This continues to be an issue with blogging software, and needs to be addressed.