Google in search of solution for ‘web spam’ problem

By Tuan C. Nguyen | February 3, 2011, 10:52 AM PST

Lately, it seems like Google’s search engine has been racking up more complaints than precise search results.

There was a time when the term “googling” was synonymous with finding. But as some companies have figured out how to game Google’s not-so-mysterious-anymore search algorithm, users have increasingly found themselves sifting through a growing number of junk sites to locate what they were looking for.

Many of these low-quality junk sites, labeled as “webspam,” are published by content farms such as Demand Media, which publishes the ubiquitous eHow learning pages. The sites are designed to appear higher in Google’s page rankings by exploiting the search engine’s penchant for certain keyword, phrase, and linking patterns that once helped it fetch more satisfactory results than other search portals. Some sites outright plagiarize content in hopes of having searchers land on their page rather than the original source.

On TechCrunch, Vivek Wadhwa, the Director of Research at the Center for Entrepreneurship and Research Commercialization at Duke University, describes how searching for information on Google has become ever more frustrating.

“Google has become a jungle: a tropical paradise for spammers and marketers. Almost every search takes you to websites that want you to click on links that make them money, or to sponsored sites that make Google money.”

In a recent blog post, Google’s principal search engineer Matt Cutts acknowledged the discontent in cyberland and said that steps are being taken to prevent junk sites from cluttering up searches.

Here are some of the interventions he mentioned that are being integrated into the Google search engine:

  • A redesigned document-level classifier to keep spam from showing up higher in search results by detecting repetitive “spammy” words and phrases that are often automatically generated to drive more traffic.
  • Enhanced detection of hacked sites that led to last year’s rise in spam sites.
  • Additional improvements that can better detect sites that copy content and sites with low levels of original content.
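
To make the first and third items more concrete, here is a minimal Python sketch of the general idea. It is not Google's classifier, whose details are not public. The function names (repetition_score, shingles, jaccard_similarity) and the n-gram sizes are assumptions chosen for illustration. The first function measures how often word pairs repeat within one page, and the other two estimate how much text two pages share.

```python
from collections import Counter


def repetition_score(text: str, n: int = 2) -> float:
    """Return the fraction of word n-grams that repeat an earlier n-gram.

    Hypothetical heuristic: auto-generated keyword stuffing tends to
    reuse the same short phrases, which pushes this ratio toward 1.
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first one counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


def shingles(text: str, n: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a: str, b: str, n: int = 3) -> float:
    """Return the Jaccard similarity of the shingle sets of two texts.

    Values near 1 suggest one page largely copies the other.
    """
    set_a, set_b = shingles(a, n), shingles(b, n)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    stuffed = "cheap pills cheap pills cheap pills"
    original = "a b c d e"
    copy = "a b c d f"
    print(repetition_score(stuffed))
    print(jaccard_similarity(original, copy))
```

A real system would combine many such signals with learned weights and thresholds. Any cutoff for calling a page "spammy" or "copied" would have to be tuned on labeled examples.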

Some have suggested that Google’s algorithm be tweaked to favor sites that reliably publish higher-quality content. Content farms, however, present a unique dilemma in that some of the content they produce doesn’t always fall neatly into the category of “spam.” For instance, Demand Media floods the internet with articles, videos and other content created by low-paid authors who may or may not be providing useful and original material.

Also, favoring particular sites would likely tilt page rankings toward established media outlets over the little guy who just might have something of value to offer. And with search engine optimization turning into such a competitive sport, many dot-coms now have a dedicated SEO team to help modify web pages to climb up Google’s page rankings.

Which raises the question: Can programmed search algorithms still be trusted to help us do more finding and less searching?

Tuan C. Nguyen

About Tuan C. Nguyen

Tuan C. Nguyen was a contributing editor for SmartPlanet from 2011 to 2013.

Contributing Editor

Tuan C. Nguyen is a freelance science journalist based in New York City. He has written for the U.S. News and World Report, Fox News, MSNBC, ABC News, AOL, Yahoo! News and LiveScience. Formerly, he was reporter and producer for the technology section of ABCNews.com. He holds degrees from the University of California Los Angeles and the City University of New York's Graduate School of Journalism.

Tuan C. Nguyen does not hold any investments in the technology companies he covers.

He writes for SmartPlanet and is not an employee of CBS.

13 Comments

+1 Vote
RE: Google in search of solution for 'web spam' problem
An algorithm to find things means that an algorithm can be made to circumvent the first one. Statistical analysis works in both directions, which is why, out of 30 results, at least 15 (usually the first 5) of the sites seen on a SERP for even a 5-word query are link-farmed pages of no value.
Posted by TAPhilo
4th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
Why bother with Google? I've been using Copernic for years with no real problems.
Nothing good comes free; "you get what you pay for" is a very apt adage!
Posted by mike@...
4th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
@Mike Yes, that is very true. People always end up using what works. I bounce around 7 different engines depending on what I am looking for.
Posted by TAPhilo
4th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
I hate to say it, but often the webspam sites provide me with the answer I was looking for anyway. It's the non-sponsored commercial sites that want to sell me software when I'm looking for freeware or general advice on how to do something that get to the top of the page and mess up my searches.
Posted by zackers
5th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
My personal algorithm is to skip the first 3 Google hits. That screens most of the spam sites without reducing search quality by much.
Posted by leber70@...
5th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
One thing they could do is create a ranking penalty for sites like ehow.
Posted by donnert
6th Feb 2011
-1 Votes
RE: Google in search of solution for 'web spam' problem
I haven't faced such a problem at all with Google ever!
Posted by aditi.sharma
8th Feb 2011
-1 Votes
RE: Google in search of solution for 'web spam' problem
When people learn how to beat the Google system, Google rewards them with high rankings, and those people make money based on the results. Why should those people stop?

People change their behavior based upon simple reward/penalty systems. I'm surprised it took Google this long to figure that out. Well, I'm not *really* surprised.
Posted by bb_apptix
8th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
I bypass eHow hits, having found them to be written by authors who browse the web rather than by authors who actually know the material.
Posted by Bellhop
8th Feb 2011
+1 Vote
RE: Google in search of solution for 'web spam' problem
If anyone is interested in learning more about this issue, read my latest post about companies that engage in black-hat SEO schemes.
http://www.smartplanet.com/technology/blog/thinking-tech/did-jc-penney-dupe-googles-search-engine/6273/?tag=content;col1
Posted by ReporterTuan
14th Feb 2011
0 Votes
Two days after Penguin update was released
Two days after the Penguin update was released, Google prepared a feedback form designed for two categories of users: those who want to report web spam that still ranks highly after the search algorithm change, and those who think that their site got unfairly hit by the update. SEO Lancashire
Posted by georgewallace25
13th Sep
0 Votes
Why does web spam exist so widely? It appears because it deserves to.
It is hard to say, as you know, why there are so many useless links or pieces of information (at least as most people see them) in Google search results. It is because there is huge demand. A person wants to look for a good source, the way a purchaser wants to find a good supplier, and this makes it possible to find a potentially good supplier. For example, www.sangongvalve.com, an industrial valve website, might not seem like information you need if it appears when you are looking for delicious food in China or for a clothes factory. But why do some people spend time looking for it? It means its economic value is larger than your search purpose, though its social value may not be good. Nowadays, we calculate everything by economic value.
Posted by lindanlin
Updated - 16th Oct
0 Votes
RE: Google in search of solution for 'web spam' problem
Very inspiring: the lowest competition but the highest chance in search engines. A great post. Thanks.

effective seo
Posted by janecarol19
Updated - 24th Oct