Biocomputing startup DNAnexus announced on Wednesday that it raised $15 million in funding from Google Ventures and TPG Biotech. DNAnexus gives scientists and others the power to run their data sets and services on the cloud, which will play a role in dramatically driving down costs.
Ten years ago, it cost $3 billion to sequence the human genome. Right now, it costs several thousand dollars, and eventually it will cost about a thousand. When genetic sequencing gets to be the price of a laptop, it will be accessible to a wider audience and could be used to detect diseases early.
Maintaining DNA databases isn't as expensive as it was before, thanks to cloud services. In February, the US National Center for Biotechnology Information was hit by a budget cut, so a copy of its massive library of genetic experiments and data was given to DNAnexus.
DNAnexus wants to take that data and bring DNA databases to the browser. It will do this through a partnership with Google Cloud Storage, which aims to archive the next-generation sequence data on the Sequence Read Archive (SRA). In early 2007, Jim Watson's 454 sequence reads were uploaded to the SRA. The archive provides the infrastructure the genetics community needs to store data sets from thousands of different organisms.
The concept of genetic testing has certainly gotten more mainstream since I had mine done by 23andMe, Navigenics, and deCODEme. When I was tested, though, only a portion of my DNA was revealed. If companies begin to sequence the entire human genome, we're talking hundreds of gigabytes per person.
"I suspect everyone who wants to have their genome sequenced could have it done in the next few years. It depends on the complexity of the social interactions. Even if the price is right, it depends on who else is getting their genome sequenced. If celebrities do it, then it will become a fad and will be accepted more quickly," famed Harvard geneticist George Church previously told me.
If every person gets their genome sequenced, and it becomes part of everyone's medical record, it will become a hundred-billion-dollar business, according to GigaOm.
The problem isn't the sequencing. It's storing and managing the data. The first million human genomes sequenced will require hundreds of petabytes of storage and hundreds of thousands of CPUs to process the data. The data has to be readable and accessible.
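To see where a figure like that comes from, here is a rough back-of-envelope sketch. It takes the "hundreds of gigabytes per person" estimate mentioned earlier and multiplies it across a million genomes. The specific per-genome sizes (100, 300, and 500 GB) are illustrative assumptions, not numbers from DNAnexus or Google, and the function name is made up for this example.

```python
# Back-of-envelope storage estimate for whole-genome data.
# Assumption: decimal units, so 1 petabyte = 1,000,000 gigabytes.
GB_PER_PB = 1_000_000


def total_storage_pb(genomes: int, gb_per_genome: float) -> float:
    """Return the raw storage needed, in petabytes, for a number of genomes."""
    return genomes * gb_per_genome / GB_PER_PB


if __name__ == "__main__":
    # Hypothetical per-genome sizes standing in for "hundreds of gigabytes".
    for gb in (100, 300, 500):
        pb = total_storage_pb(1_000_000, gb)
        print(f"{gb} GB per genome x 1,000,000 genomes = {pb:,.0f} PB")
```

At any of those sizes, a million genomes lands in the hundreds of petabytes, before counting backups or the intermediate files that analysis generates.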
"The DNAnexus SRA website is an example of a 'big data' initiative that benefits from rethinking the interface in a 100% web-enabled world," Eric Morse, head of business development, Google Cloud Storage, said in a statement.
The Google partnership will enable researchers to access genetic information on the Internet.
The same push to move genetic data into the cloud is happening for other organisms.
Cloudant and Monsanto signed a biotech data deal to develop a data integration and visualization platform based on BigCouch, Cloudant's technology.
Monsanto will use BigCouch to help farmers increase yield and stress tolerance in corn, soy, and other crops. According to the Cloudant blog:
Cloudant’s BigCouch will be the core, for both storage and analysis of a new, company-wide platform powering a fundamental aspect of a Fortune 500 business: the analysis & identification of new traits & genomic combinations in agricultural crops. The data & reporting interfaces will be used across Monsanto and should be instrumental in the making of key business decisions.