I've been researching issues with duplicate content on a single domain. It's easy to put up a spammy, keyword-replacement site where each page is pretty much the same. For example, one page would say "Get an American Express Card Today", and another would have the same text but would instead say "Get Your Visa Card Today". But what do the search engines think of that?
They hate it. You could get away with this type of web site a couple of years ago, but no longer. You have to be a little sneakier now, adding more dynamic content pulled from RSS feeds, newsgroups, re-worded and scraped materials, etc. And you might still be found...
Most people agree that onsite duplicate content will cause G to filter out all the duplicated pages, showing none of them. Some have mentioned that if a certain percentage of a site is considered duplicate content, G may begin to apply a penalty to the whole domain. I've read that each page must have at least 12% original content to not be considered spam. I've also heard that the threshold is currently higher than that, but I haven't been able to get specifics.
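To get a rough feel for what an "original content" percentage could mean, here is a minimal Python sketch that compares two pages by their overlapping three-word phrases (shingles). It is only an illustration: the search engines don't publish how they measure duplication, the function names `shingles` and `original_share` are made up for this example, and the sample page text after the two headlines is invented.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short to form a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def original_share(page, other, size=3):
    """Fraction of page's shingles that do not appear in other."""
    page_shingles = shingles(page, size)
    other_shingles = shingles(other, size)
    if not page_shingles:
        return 0.0
    return len(page_shingles - other_shingles) / len(page_shingles)

# Hypothetical page bodies built on the two example headlines.
amex = ("Get an American Express Card Today and start earning "
        "rewards on every purchase you make")
visa = ("Get Your Visa Card Today and start earning "
        "rewards on every purchase you make")

print(f"{original_share(amex, visa):.0%} of the first page is original")
```

On real keyword-replacement pages the shared boilerplate is much longer than in this toy example, so the original share drops quickly toward zero. Whatever threshold G actually uses, and however it measures it, is an unknown; the 12% figure above is hearsay, not something this sketch can confirm.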
It's hard to find a consensus on anything SEO-related, but I've found recurring themes from respected sources.
Posted by tedster - a WebMasterWorld administrator on Jan 3, 2006
"And if the algorithm sees many pages on the same domain that are essentially duplicate, the algo might well smell an attempt at spamming and decide not to show any of those pages."
Posted by caveman - a WebMasterWorld moderator on Sept 25, 2005
"Within a single site, when pages are deemed too similar, G is not throwing out the dups - they're throwing out ALL the similar pages."
If you want to beat the search engines at their own game, you've got to be very smart these days... and willing to experiment until you find the secret sauce. Once you find the secret sauce - it's just a matter of time until you're on their radar and they de-sauce you. But, hopefully you made a million dollars before that happened... ha!
Good luck,
Travis