3) Next Generation Sequencing (NGS) – Coverage & Sample Quality Control


Next Generation Sequencing (NGS) is a useful tool for determining DNA sequence, information
which is valuable in furthering our understanding of biological processes. Compared to other
tools, NGS is flexible and can be used in different applications, including sequencing the
exome and methylated cytosines in the genome. However, there are parameters to be considered
prior to running an NGS experiment.

For all your NGS experimental needs, Applied Biological Materials, or abm, offers a wide
range of affordable services, including whole genome sequencing, RNA sequencing, exome
sequencing, and even lane rentals.

In our previous video, we discussed how to prepare samples for sequencing. Here, we will
outline the quantitative and qualitative measures needed to ensure a good sample for
sequencing. We invite you to watch our previous video before starting this one.

Although the current NGS platforms available
on the market are very accurate, they are still prone to error. Even at accuracies of
99% and greater, a generated sequence may contain incorrect nucleotides. This means
that if a machine’s accuracy is 99%, on average one base out of every 100 is read incorrectly.
Since NGS platforms generate large amounts of output, these errors can add up quickly.
The way to work around this limitation is to sequence each nucleotide multiple times.
The number of times a nucleotide is sequenced is referred to as “coverage” or “depth.”
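
To make the point about errors and repeated sequencing concrete, here is a minimal sketch in
Python. It is an illustration only: the function names are hypothetical, and it assumes that
errors at a position are independent between reads, which real platforms only approximate.

```python
from math import comb


def expected_errors(bases_sequenced: int, accuracy: float) -> float:
    """Expected number of miscalled bases at a given per-base accuracy."""
    return bases_sequenced * (1.0 - accuracy)


def prob_majority_wrong(depth: int, accuracy: float) -> float:
    """Probability that more than half of `depth` reads miscall one position.

    Assumption (not from the video): each read errs independently with
    probability 1 - accuracy, so the number of wrong reads is binomial.
    """
    error = 1.0 - accuracy
    min_wrong = depth // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(depth, k) * error**k * (1.0 - error) ** (depth - k)
        for k in range(min_wrong, depth + 1)
    )


if __name__ == "__main__":
    # At 99% accuracy, a million sequenced bases carry about 10,000 errors.
    print(f"expected errors: {expected_errors(1_000_000, 0.99):.0f}")
    # The chance that most reads at a position are wrong drops fast with depth.
    for depth in (1, 5, 15, 30):
        print(f"depth {depth:>2}: {prob_majority_wrong(depth, 0.99):.3e}")
```

Under these assumptions, a single read is wrong 1% of the time, while a wrong majority at
higher depth becomes increasingly unlikely, which is the reason for sequencing each
nucleotide more than once.
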
Coverage per genome can be calculated by dividing the total output generated in a sequencing
run by the total size of the sample sequence. For example, running a human genome, which
is approximately 3 billion base pairs, will yield approximately 333X coverage on the Illumina
HiSeq 2500, which has a max output of 1000 giga-base pairs (1000 divided by 3 is roughly 333).
Consult abm’s NGS experts for the appropriate coverage for your experiment. For detailed
information on the appropriate coverage required for different NGS applications, please see
our knowledge base.
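
The coverage calculation described above can be sketched in a few lines of Python. The
function and constant names are hypothetical; the numbers are the ones quoted in the video
(a genome of about 3 billion base pairs and a maximum output of 1000 giga-base pairs).

```python
def coverage(total_output_bp: float, target_size_bp: float) -> float:
    """Average coverage = total sequencing output / size of the target sequence."""
    if target_size_bp <= 0:
        raise ValueError("target_size_bp must be positive")
    return total_output_bp / target_size_bp


def required_output(target_coverage: float, target_size_bp: float) -> float:
    """Output (in bp) needed to reach a desired average coverage."""
    return target_coverage * target_size_bp


HUMAN_GENOME_BP = 3e9        # approximately 3 billion base pairs
MAX_RUN_OUTPUT_BP = 1000e9   # 1000 giga-base pairs, as quoted for the HiSeq 2500

if __name__ == "__main__":
    print(f"{coverage(MAX_RUN_OUTPUT_BP, HUMAN_GENOME_BP):.0f}X")  # 333X
    # Working backwards: output needed for a hypothetical 30X human genome.
    print(f"{required_output(30, HUMAN_GENOME_BP) / 1e9:.0f} Gb")
```

This gives an average across the genome only; it says nothing about how evenly the reads
are distributed.
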

Prior to sequencing, the sample library must be validated quantitatively and qualitatively
to verify that there is a sufficient amount of good-quality DNA. What is considered a
good quantity of DNA depends on the library protocol’s specifications. Having either
too much or too little DNA results in less efficient sequencing runs. This generates
low-quality data, either through read problems from flow cell saturation when there is too
much DNA, or through reduced coverage when there is too little. In terms of quality, a good
library is one that has a diverse set of DNA fragments with minimal duplicate fragments. This
is important because during PCR amplification, duplicates of fragments will be generated. The
consequence of duplicate fragments is that the sequencing reaction will be biased towards
these fragments. Rather than having a wide range of fragments sequenced, the same fragments
are sequenced repeatedly, resulting in overrepresentation in the machine output.
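
As a rough illustration of what “minimal duplicate fragments” means in numbers, the sketch
below computes the fraction of fragments that are repeats of one already seen. The fragment
representation (chromosome, start, end) and the function name are assumptions made for this
example; they are not taken from any particular analysis tool.

```python
from collections import Counter
from typing import Hashable, Iterable


def duplication_rate(fragments: Iterable[Hashable]) -> float:
    """Fraction of fragments that duplicate an earlier, identical fragment."""
    counts = Counter(fragments)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    unique = len(counts)
    return (total - unique) / total


if __name__ == "__main__":
    # Hypothetical library: one fragment was amplified far more than the others.
    library = [("chr1", 100, 350)] * 4 + [("chr1", 500, 760), ("chr2", 40, 300)]
    print(f"duplication rate: {duplication_rate(library):.2f}")  # 0.50
    print("most overrepresented:", Counter(library).most_common(1)[0])
```

A diverse library would give a rate close to zero; a high rate means the output is dominated
by the same few fragments.
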

Library quantification is performed using either qPCR or a fluorometric method like
Qubit. Some libraries may only be quantified using one of the two methods. Sample library
quality is then verified with the Bioanalyzer.

qPCR is a method of quantifying a sample library before sequencing. It is ideal when there
is too little library available for fluorometric quantification, commonly because the library
was not PCR-amplified. It is also a more sensitive way, relative to Qubit, to quantify the
adapter-ligated fragments in a sample. qPCR selectively amplifies such fragments, so it avoids
the inaccuracies of Qubit that result from being unable to distinguish between fragments which
can and cannot be sequenced. The only drawback to this procedure is that it is very
time-consuming.

Qubit is an alternative to qPCR for quantifying a sample library. Relative to qPCR, it provides
results faster; however, it is not suitable for libraries that have not been PCR-enriched, as
it is less sensitive than qPCR and requires more sample. Quantification is performed by mixing
the sample, which may be diluted, with the appropriate dye; the mixture is then illuminated
and its fluorescence is detected by the machine. Note that a standard must be measured with
the appropriate assay prior to sample quantification.

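
To show why the standard has to be measured first, here is a simplified sketch of how a
fluorescence reading could be turned into a concentration. It assumes a linear response
between a blank and a single standard of known concentration, and it corrects for any
dilution of the sample. The instrument's actual curve fitting is not described in this
video, so the function, its parameters, and the example readings are all hypothetical.

```python
def concentration_from_fluorescence(
    sample_signal: float,
    blank_signal: float,
    standard_signal: float,
    standard_conc: float,
    dilution_factor: float = 1.0,
) -> float:
    """Estimate stock concentration from a fluorescence reading.

    Assumes (for illustration) that signal rises linearly from the blank to the
    standard. The result is in the same units as `standard_conc`.
    """
    if standard_signal <= blank_signal:
        raise ValueError("standard must read higher than the blank")
    fraction_of_standard = (sample_signal - blank_signal) / (standard_signal - blank_signal)
    measured_conc = fraction_of_standard * standard_conc
    # Undo the dilution to get back to the concentration of the original library.
    return measured_conc * dilution_factor


if __name__ == "__main__":
    # Made-up readings: blank 50, standard 10050 at 10 ng/uL, sample diluted 1:20.
    stock = concentration_from_fluorescence(2550, 50, 10050, 10.0, dilution_factor=20)
    print(f"{stock:.1f} ng/uL")  # 50.0 ng/uL
```

Without the standard there is no reference point, so a raw fluorescence value cannot be
converted into an amount of DNA.
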
The Bioanalyzer is used to check the size distribution of the library before the sequencing
reaction, including whether the sizes selected during sample library preparation are present.
The Bioanalyzer is a machine that reads gel chips containing samples in the wells. The
chips are similar to agarose gels, except in a smaller format. The protocols for DNA
and RNA are similar to each other. The first step is to introduce the gel into the chip
and pressurize it; this will evenly distribute the gel, minimizing errors in machine analysis
later on. Once complete, markers, ladders, and samples (either diluted or undiluted)
are loaded onto the chip. There may be additional reagents needed depending on the kit requirements.
The chip is then vortexed before it is loaded onto the Bioanalyzer. The machine will monitor
each well for sample. This is visualized as peaks on a graph. The location of the peaks
indicates the markers and the size distribution of the library, while the peak height shows
the amount of fragments at a specific size.

As part of abm’s NGS service, we perform quality control on all sample libraries to ensure
valid sequencing data are delivered to our customers.

Please leave your questions and comments below and we will answer them as soon as possible.
Thank you for watching!

6 comments

I have an assignment (an experiment) to perform next generation sequencing on 10 patients with a healthy liver and 10 patients with a liver that has a genetic disorder, and to write up the difference between their gene expression and what the outcome of next generation sequencing will be. I don't know where to start.

How can we relate the protein fragments in a cell with the DNA sequencing? If I have a protein expressed under some stress in a plant, how can I know the RNA sequence or DNA sequence related to the protein?

Very informative video.
It would be great if you could please explain which factors affect the coverage during sequencing and also how?
