Using 3rd generation DNA sequencing technology, scientists can sequence an entire genome in about fifteen minutes. Incredible, until you take into account the fact that every fifth or sixth DNA “letter” generated is incorrect.
First developed in 2009, 3rd generation or single molecule sequencing generates “reads” up to 100 times longer than those of past methods, providing a more complete picture of a genome. It was used to analyze the 2010 cholera outbreak in Haiti and the 2011 E. coli outbreak in Germany. But scientists have remained frustrated over its 15% error rate.
Today, collaborators at Cold Spring Harbor Laboratory have announced in Nature Biotechnology a new method that brings that error rate down below 1%. They’ve used their technique to sequence the entire genome of a parrot, a result that one author boasts in a press release is “far superior to that of any previously sequenced bird genome.”
How’d they do it? They combined the best aspects of 3rd generation sequencing (long but error-ridden reads) with those of 2nd generation sequencing (ridiculously short but accurate reads). The technique is math-heavy: they run a sequence through the 3rd generation methods, then hybridize the resulting long reads with the short, accurate 2nd generation reads.
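To give a rough feel for the idea, here is a toy Python sketch of correcting one long, noisy read with short, accurate ones. It is not the team's published algorithm. It assumes the short reads have already been aligned to the long read (each comes with a start offset) and that the only errors are single-letter substitutions; the function name `correct_long_read` and the example sequences are made up for illustration.

```python
from collections import Counter


def correct_long_read(long_read, aligned_short_reads):
    """Return long_read with each base replaced by the majority vote of the
    short reads covering that position.

    aligned_short_reads is a list of (offset, sequence) pairs; the offsets
    are assumed to be known already (a real pipeline would have to compute
    them with an aligner).
    """
    votes = [Counter() for _ in long_read]
    for offset, short in aligned_short_reads:
        for i, base in enumerate(short):
            pos = offset + i
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    corrected = []
    for original, counter in zip(long_read, votes):
        if counter:
            # Most frequent base among covering short reads
            # (ties go to the base seen first).
            corrected.append(counter.most_common(1)[0][0])
        else:
            # No short-read coverage: keep the long read's base.
            corrected.append(original)
    return "".join(corrected)


# Hypothetical example: the true sequence, a long read with two wrong
# letters (positions 3 and 8), and four accurate short reads.
truth = "ACGTACGTAC"
noisy_long = "ACGAACGTTC"
shorts = [(0, "ACGTA"), (2, "GTACG"), (4, "ACGTA"), (5, "CGTAC")]

fixed = correct_long_read(noisy_long, shorts)
print(fixed)
```

In this sketch the long read supplies the overall layout while the short reads outvote its mistakes, which is the division of labor the hybrid approach relies on.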
Right now the 99.9% error-free hybrid reads are only just over twice the length you’d get from 2nd generation sequencing, but that’s expected to increase considerably as single molecule sequencing techniques improve.
Much of the DNA sequencing work underway now has to do with comparative genome analysis, which often compares spontaneously occurring structural changes in genomes between populations. Such research, which has been used to study schizophrenia and autism, relies heavily on the accuracy of sequencing yet is prohibitively expensive when done with 2nd generation technologies. The Cold Spring Harbor team sees their new approach as the answer to these concerns.
Photo: Ross Hawkes/Flickr