High-throughput sequencing refers to how many DNA molecules can be read simultaneously. Technology is now capable of sequencing multiple fragments of DNA simultaneously. Scientists can now read millions of DNA fragments simultaneously and generate more data in a shorter time frame and at lower costs.
This article will discuss the evolution of genome sequencing and what High throughput genome sequencing means today in sequencing methods.
The low-throughput era
2001 was the year that the first draft of the human gene sequence was completed by researchers (Lander and al. 2001; Venter and al. 2001). This was a turning point in biology’s history. The coordinated research effort took over ten years and cost approximately half to one million US dollars (Schloss et al., 2020).
The problem was throughput.
Frederick Sanger introduced a technology called Sanger sequencing in 1977. (Sanger Nicklen, Coulson 1977). This technology revolutionized DNA reading. DNA polymerase is a radioactively or fluorescently labeled process that adds nucleotides to DNA. A sequencing machine then reads these nucleotides.
Sanger sequencing is used to determine the sequences of small DNA fragments that are less than 900 base lengths. Researchers used these to build larger DNA fragments and complete chromosomes to create their draft human genome. Because only one length of DNA can be read simultaneously, this technique is slow-throughput.
We are now in high-throughput sequencing.
A human genome can be sequenced for as low as $1000 by researchers in just one day. A team sequenced the human genome in just five hours and twenty-two minutes (Gorzynski, 2022). This feat set the current world record for the fastest DNA sequencing.
The solution was throughput.
These advancements are largely due to the high throughput of today’s sequencing technologies. For example, short-read sequencing commonly uses the ‘sequencing-by-synthesis’ technique. This reduces the DNA into smaller pieces, each of which is approximately 150 bases. This allows for the simultaneous sequencing of millions upon millions of DNA molecules. The short sequences of DNA can then be reconstructed using computational analyses and a reference human genome.
Another technology uses a different approach to processing. Nanopore, a long-read sequencing technology, now produces individual reads of approximately 500,000 bases. This allows for large data sets to be generated in a short amount of time. The downside is that less DNA can simultaneously be sequenced, which limits the possibilities for multiplexing.
When researchers are interested in the expression and function of all genes within the genome, RNA-seq uses a similar approach. RNA-seq can achieve higher throughput than genome sequencing. Researchers can now sequence thousands or hundreds of RNA samples simultaneously thanks to new library preparation techniques.
The ultra-high-throughput genomic era
New RNA-seq methods, such as Bulk RNA barcoding and sequencing (BRB-seq), allow researchers to sequence thousands of samples simultaneously with very little data loss (Alpern et al., 2019). This technique tags the 3′ end of mRNA molecules using a sample barcode to identify samples after sequencing.
This reduces the amount of sequencing needed, as only the 3′ end mRNA is sequenced. However, it still generates highly accurate results. This dramatically increases the throughput of sequencing experiments and paves the way for the High throughput genome sequencing.