Pinus taeda, more commonly known as the loblolly pine, is quite an intriguing organism. It is found in the southeastern United States and is grown commercially in extensive plantations because of its high value in the lumber trade. A new genetic study has revealed that its genome is far bigger than that of humans - indeed, larger than any other genome sequenced so far.
Traditional DNA sequencing relies on a technique developed by Frederick Sanger in 1977 called, not surprisingly, Sanger sequencing. Although it has proved incredibly useful over the years, it can only read short stretches of DNA of around 100 to 1000 base pairs, the building blocks of DNA. To put that into perspective, the human genome has over 3 billion base pairs. Sequencing of multicellular organisms has also shown that the size of a genome does not necessarily dictate the complexity of the organism.
But there is a way around the limits imposed by Sanger sequencing: shotgun sequencing. Here, long strands of DNA are broken up at random into smaller, easier-to-handle fragments. These fragments can then be sequenced using Sanger sequencing, which is also called chain-termination sequencing. After several rounds of this fragmentation and sequencing, a computer identifies overlapping sequences and uses them to stitch together a single continuous sequence. But the loblolly pine genome contains so many repetitive sequences, which assembly programs struggle to order properly, that even this method couldn't quite cut it. So the scientists modified the shotgun approach by first generating clones, which they organised, mapped and then sequenced.
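To make the "stitching" step concrete, here is a minimal sketch in Python of how overlapping fragments can be merged back into one sequence. It is only a toy illustration, not the software used in the study: the greedy strategy, the function names `overlap` and `greedy_assemble`, the `min_len` threshold and the 20-letter example sequence are all assumptions chosen for clarity.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`.

    Returns 0 if the best match is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    # Drop exact duplicates and fragments fully contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; the remaining pieces stay separate
        # Join the two fragments, writing the shared region only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical toy "genome" and four overlapping reads cut from it.
genome = "ATGCGTACGTTAGCCTAGGA"
reads = [genome[0:8], genome[5:13], genome[10:18], genome[14:20]]

contigs = greedy_assemble(reads)
print(contigs)
```

In a sequence with many near-identical repeats, several fragments can overlap equally well in more than one place, so a simple approach like this one can join pieces in the wrong order. That is the kind of ambiguity that made the pine genome so hard to assemble.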
The results were published in the journal Genome Biology. The loblolly pine genome was found to contain 22.18 billion base pairs, more than seven times the number in the human genome. However, 82% of it consists of repetitive sequences, compared with only 25% in the human genome. The researchers also identified numerous genes with important functions, such as disease resistance, which could be valuable to the lumber industry.
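The comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this article; the human genome size is rounded to 3 billion base pairs, which is an assumption based on the "over 3 billion" figure, and the variable names are illustrative.

```python
# Figures quoted in the article.
PINE_BP = 22.18e9             # loblolly pine genome size in base pairs
HUMAN_BP = 3.0e9              # assumption: "over 3 billion", rounded down
PINE_REPEAT_FRACTION = 0.82   # share of the pine genome that is repetitive
HUMAN_REPEAT_FRACTION = 0.25  # share quoted for the human genome

size_ratio = PINE_BP / HUMAN_BP
pine_repeat_bp = PINE_BP * PINE_REPEAT_FRACTION
pine_other_bp = PINE_BP - pine_repeat_bp
human_repeat_bp = HUMAN_BP * HUMAN_REPEAT_FRACTION

print(f"Pine genome is {size_ratio:.1f} times the size of the human genome")
print(f"Repetitive pine sequence: {pine_repeat_bp / 1e9:.2f} billion base pairs")
print(f"Remaining pine sequence: {pine_other_bp / 1e9:.2f} billion base pairs")
print(f"Repetitive human sequence: {human_repeat_bp / 1e9:.2f} billion base pairs")
```

On these rounded figures, the non-repetitive part of the pine genome alone comes to roughly 4 billion base pairs.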
This novel and effective approach can now be applied to sequence other organisms with large, complex genomes, expanding our knowledge base even further.