Primary analysis
Primary analysis includes all the steps required to "call" (identify) each base. Besides identifying the bases, the sequencing machine also assigns a quality score to each of them.
The outcome is stored as a FASTQ file (see image), which contains the sequence identifiers, the called nucleotide sequences (strings of A, G, T or C, also called "reads"), and the associated Phred quality scores. When the character "N" appears in place of a nucleotide, the machine could not determine the base at that position. The Phred quality score expresses the probability that a base was called incorrectly: the higher the score, the lower that probability. In a FASTQ file, the Phred quality score of each base is stored as a single ASCII character (a letter, a digit or a symbol), whose ASCII value indicates the accuracy of the base call.
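To make the encoding concrete, here is a minimal Python sketch that decodes the quality line of one FASTQ record. It assumes the common Phred+33 encoding (score = ASCII value minus 33); older files may use a different offset, so check the documentation of your sequencing platform. The record itself is made up for illustration. The conversion from score Q to error probability uses the standard Phred definition, P = 10^(-Q/10).

```python
def decode_phred(quality_string, offset=33):
    """Convert ASCII quality characters to Phred scores.

    The offset of 33 (Phred+33) is an assumption; some older files use 64.
    """
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Probability of an incorrect base call for Phred score q."""
    return 10 ** (-q / 10)


# Hypothetical four-line FASTQ record: identifier, read, separator, qualities.
record = [
    "@read_1",
    "GATTNCA",
    "+",
    "IIII!5I",
]

identifier, sequence, _, quality = record
scores = decode_phred(quality)

# Print each base with its score and the implied error probability.
for base, q in zip(sequence, scores):
    print(base, q, error_probability(q))
```

In this example the undetermined base "N" carries the lowest possible score (the character "!" decodes to 0), while the character "I" decodes to 40, which corresponds to an error probability of 1 in 10,000.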
Primary analysis is typically performed automatically by the sequencing machine after each run.
If you want to sequence several samples together in one run (for example from different patients or different experiments), you can assign a specific tag to each of them. The tag (also known as the barcode) is a short DNA sequence that is added to the adapter to differentiate the reads from each sample. The tag is sequenced along with the sample, and by identifying the tag sequence attached to each read, you can separate the reads of the different samples from each other. This approach is called multiplexing, and it has the added advantage of lowering the sequencing cost per sample and allowing more samples to be processed per run.
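The separation step (often called demultiplexing) can be sketched in a few lines of Python. This is a simplified illustration, not how any particular instrument implements it: it assumes the tag occupies the first four bases of every read and must match exactly, and the tag sequences and sample names are hypothetical.

```python
from collections import defaultdict

# Hypothetical tags and sample names, for illustration only.
BARCODES = {"ACGT": "patient_A", "TGCA": "patient_B"}
TAG_LENGTH = 4  # assumed: the tag is the first 4 bases of each read


def demultiplex(reads, barcodes=BARCODES, tag_length=TAG_LENGTH):
    """Group reads by sample using an exact match on the leading tag."""
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        # Reads whose tag is not recognised are kept in a separate group.
        sample = barcodes.get(tag, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


reads = ["ACGTGGATTC", "TGCACCTTAA", "ACGTTTGACA", "NNGTAAAAAA"]
groups = demultiplex(reads)

for sample, inserts in groups.items():
    print(sample, inserts)
```

Real demultiplexing tools usually tolerate one or more mismatches in the tag, because the tag itself can contain sequencing errors; the exact-match rule here is only the simplest case.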
Example of a FASTQ file produced by an NGS run.
The next step in the NGS data analysis is called the secondary analysis.