Thinking Tech

Discovering who controls the other half of the Internet

In 2009, Arbor Networks identified 150 companies that control 50% of Internet traffic. Now computer scientist Craig Labovitz is shining a light on the other half of the web.

In 2009 Arbor Networks coined the term “hyper giants” to define the 150 companies that control 50% of all traffic on the web. But what about the Internet’s other half?

Craig Labovitz of DeepField Networks, formerly chief scientist at Arbor Networks, decided to investigate and discovered consolidation at another level. According to his initial findings, most of the other 50% of Internet traffic is handled by a different but similarly shrinking set of hosting, colocation and other cloud service providers. The consolidation trend is everywhere, but there’s also a distinct dividing line. One set of hyper giants supports mainstream Internet activity, and another enables a majority of traffic from adult sites, P2P users and file sharing hosts.

It was commonly assumed that traffic outside the mainstream was widely distributed across the Internet, but Labovitz says that economics is now driving engineering. The Internet is growing into more of a utility, prices are dropping, and cloud providers are becoming increasingly global. Of the 50% of traffic outside the defined hyper giants, roughly 20% of all traffic still comes from a wide range of smaller sources, while a full 30% is carried by a small number of players.

For example, ten sites contribute 70% of all Internet file sharing traffic. And four colocation/hosting companies are responsible for 85% of all file sharing activity.

It’s hard to define this secondary layer of Internet traffic. Labovitz is quick to avoid any moral judgments, pointing out that some of this activity is perfectly legitimate. However, its sources are far more difficult to ascertain than those behind mainstream traffic. In his investigation, Labovitz couldn’t rely on standard DNS lookups and Whois queries because a lot of information was being deliberately concealed. He doesn’t go into great detail about his discovery process (a complete research paper is in the works), but he does say he did a lot of number crunching with big data sets.

Labovitz also notes that intentional obfuscation isn’t the only thing going on with web traffic today. As Internet infrastructure continues to consolidate, it oddly becomes harder and harder to pinpoint dependencies, or to understand an entire cyber supply chain. Labovitz calls it the fog of virtualization. When storage, computing and delivery resources are all virtualized and global, companies may end up relying on providers without even being aware of the association.

For the moment, however, Labovitz is focusing on supply chains that are visible. He sees parallel ecosystems growing up between the mainstream Internet and the web sphere of adult, P2P and file sharing activity. This is a digital divide of a different kind – an Internet split into two distinct realms of network infrastructure.

Mari Silbey

Contributing Editor

Mari Silbey is an independent tech writer based in Washington, D.C. With a background in cable and telecom, she's a contributor to several trade publications, and part of the GigaOM analyst network. She also writes for the long-running digital media blog Zatz Not Funny, and has written for both corporate and association clients focused on broadband networks, mobile apps, and video delivery. She's a graduate of Duke University.