MOUNTAIN VIEW, Calif.-- Last week, I stopped by Alon Halevy's office to learn how to visualize large data sets on a map.
Halevy, who heads the Data Group at Google Research, walked me through the data platform:
The problem with data is that it's ugly to look at. Nobody wants to stare at an Excel spreadsheet to figure out the trends. And even in the age of Google Docs and Facebook-style sharing, collaborating with other people on big data sets isn't easy.
That's the idea behind Google Fusion Tables, a cloud-based data management system launched in June 2009. It was originally designed for organizations that wanted to make their data available online, so companies could share data internally or externally. Users can upload data sets of up to 100MB in various formats, such as spreadsheets or comma-separated values (CSV) files.
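For readers preparing a file, the 100MB limit can be checked locally before uploading. The following is a minimal Python sketch and not part of Fusion Tables itself: the helper name `preflight_csv` is hypothetical, and it assumes that 100MB means 100 × 1024 × 1024 bytes and that the file is UTF-8 encoded, neither of which is specified here.

```python
import csv
import os

# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def preflight_csv(path, max_bytes=MAX_UPLOAD_BYTES):
    """Check a local CSV file before upload and return (ok, message).

    Hypothetical helper for illustration; it does not call any Google API.
    """
    size = os.path.getsize(path)
    if size > max_bytes:
        return False, f"file is {size} bytes, over the {max_bytes}-byte limit"

    # Read only the first row, so large files are not loaded into memory.
    with open(path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), None)

    if not header:
        return False, "file has no header row"
    return True, f"{len(header)} columns, {size} bytes"
```

Calling `preflight_csv("specimens.csv")` (a made-up file name) returns a pair: whether the file passes, and a short message giving the column count and size or the reason it failed.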
Here are some applications of the tool:
- ecologists in Costa Rica who want to maintain records of specimens and include genetic information produced by a lab in Canada
- a nonprofit that wants to publish data about water resources
- the International Coffee Organization, which collects and shares information about coffee exports and imports
- an epidemiologist who wants to illustrate disease trends
Halevy explains how to use the tool and encourages more people and corporations to share their data.
The Guardian's Simon Rogers is a power user who has done a great job of using data visualizations to tell stories.
New Scientist's Peter Aldhous used Google Fusion Tables to map hydrothermal vent fields:
Aldhous also used Google Fusion Tables and Tableau to get a better idea of what is going on with tornadoes. Compared with the historical record, it's clear that 2011 has been a deadly year.
Google Fusion Tables is good for mapping and for collaborating with others. But if a customer or business wants more help, a service like Tableau Software would provide a more hands-on experience. Flowing Data is a blog that features visualizations from around the web.
Chris Anderson wrote in Wired about the end of theory, saying that sensors and access to the cloud will give us access to data on the petabyte scale.
True. The amount of data is exploding. "15 out of 17 sectors in the United States have more data stored per company than the US Library of Congress," according to a recent McKinsey report. "Companies and policy makers must tackle significant hurdles to fully capture big data's potential."
It's what you do with the data that matters. As tools to handle and make sense of the data improve, it will be easier to spot disease trends, identify business trends and clarify current affairs, for example by mapping 186 medical marijuana dispensary locations in Los Angeles.