Secondary analysis

Secondary analysis is performed after the primary analysis. Before it begins, the tag and adapter sequences are trimmed from the reads, because these sequences are added during library preparation and have no biological meaning.
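The trimming step can be illustrated with a minimal sketch. It assumes a single known 3' adapter and exact matching only; the function name trim_adapter and the min_overlap parameter are hypothetical, and the adapter string is just an example value. Dedicated trimming tools also tolerate sequencing errors in the adapter, which this sketch does not.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter from a read (exact matches only)."""
    # Case 1: the complete adapter occurs inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway through the adapter, so only a
    # prefix of the adapter is present at the 3' end of the read.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    # No adapter found: return the read unchanged.
    return read


adapter = "AGATCGGAAGAGC"  # example adapter sequence (assumption)
print(trim_adapter("ACGTACGTAGATCGGAAGAGCTTTT", adapter))
print(trim_adapter("ACGTACGTAGATC", adapter))
```

In a real FASTQ record the quality string would have to be cut at the same position so that it stays the same length as the trimmed sequence.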

The main aim of secondary analysis is to assemble the short DNA sequences (also called reads) so that the sequence data can be interpreted. Before this reassembly, the “raw” reads from the machine are usually assessed and filtered for quality to produce the best results: reads with low Phred quality scores are removed, and any remaining adapter sequence is trimmed. When the reassembly is performed from scratch, without a reference genome, it is referred to as de novo assembly. In contrast, when a reference genome is available, the process is much simpler, because the reads can be aligned directly to the reference genome.
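The quality filter described above can be sketched as follows. A Phred score Q encodes the estimated probability P that a base call is wrong as Q = -10 log10(P), so Q20 corresponds to a 1% error probability. The sketch assumes Phred+33 encoded quality strings (the usual FASTQ encoding) and a mean-quality threshold of 20; the threshold, the function names and the (name, sequence, quality) tuple layout are illustrative assumptions, not a prescribed method.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality, offset=33):
    """Mean Phred score of a read; 0.0 for an empty quality string."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores) if scores else 0.0


def filter_reads(reads, min_mean_q=20.0):
    """Keep only reads whose mean Phred score reaches the threshold.

    `reads` is a list of (name, sequence, quality) tuples.
    """
    return [r for r in reads if mean_quality(r[2]) >= min_mean_q]


reads = [
    ("read1", "ACGT", "IIII"),  # 'I' decodes to Q40
    ("read2", "ACGT", "!!!!"),  # '!' decodes to Q0
]
print([name for name, _, _ in filter_reads(reads)])
```

Filtering on the mean score is only one possible criterion; trimming low-quality bases from the read ends is a common alternative.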

Normally, several reads map to the same area of the genome. The number of reads covering a given position is referred to as "read depth": it measures how many times a certain area is covered by different reads. For instance, a read depth of ten means that ten reads overlap one another in the same genomic area.
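Read depth can be computed directly from the alignment coordinates. The sketch below assumes each aligned read is given as a (start, end) pair in 0-based, half-open coordinates relative to the region of interest; the function name per_base_depth and this input layout are hypothetical choices for illustration.

```python
def per_base_depth(alignments, region_length):
    """Count how many aligned reads cover each position of a region."""
    depth = [0] * region_length
    for start, end in alignments:
        # Clip each read to the region, then count it at every
        # position it covers.
        for pos in range(max(start, 0), min(end, region_length)):
            depth[pos] += 1
    return depth


# Three overlapping reads on a 10-base region.
depth = per_base_depth([(0, 5), (2, 7), (4, 10)], 10)
print(depth)  # [1, 1, 2, 2, 3, 2, 2, 1, 1, 1]

# Ten reads stacked on the same 50 bases give a depth of ten everywhere.
stacked = per_base_depth([(0, 50)] * 10, 50)
print(sum(stacked) / len(stacked))  # 10.0
```

Averaging the per-base values over a region gives its mean read depth.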

The next step in NGS data analysis is the tertiary analysis.