Base calling is the process of assigning nucleobases to chromatogram peaks, light intensity signals, or electrical current changes resulting from nucleotides passing through a nanopore. One program for this task is Phred, which was widely used by both academic and commercial DNA sequencing laboratories because of its high base calling accuracy.[1]
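
As a purely illustrative sketch of the idea of assigning a base to each signal peak, the following Python snippet picks the channel with the strongest intensity at each peak position. The function name `call_bases`, the A, C, G, T channel order, and the intensity values are hypothetical assumptions; production base callers such as Phred do considerably more than this.

```python
# Toy illustration only: one intensity per base channel at each peak.
# The channel order (A, C, G, T) is an assumption made for this sketch.
BASES = "ACGT"

def call_bases(peak_intensities):
    """Call the base with the strongest signal at each peak; 'N' on ties."""
    called = []
    for intensities in peak_intensities:
        top = max(intensities)
        winners = [b for b, v in zip(BASES, intensities) if v == top]
        # An ambiguous peak (two equally strong channels) is reported as 'N'.
        called.append(winners[0] if len(winners) == 1 else "N")
    return "".join(called)

# Hypothetical intensities for four consecutive peaks.
peaks = [
    (900, 40, 30, 25),   # strongest channel: A
    (20, 35, 870, 50),   # strongest channel: G
    (15, 910, 45, 30),   # strongest channel: C
    (300, 25, 30, 300),  # A and T tie, so the call is ambiguous
]
print(call_bases(peaks))  # AGCN
```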

Currently, base calling is commonly handled by on-instrument software, such as Illumina's proprietary Real-Time Analysis (RTA) pipeline, which is tightly integrated with the sequencing platform and updated with each platform release.[2]

Base callers for nanopore sequencing, such as Guppy and Dorado, use neural networks trained on electrical current signals obtained from accurate sequencing data.[3]

Base calling accuracy

Base calling accuracy can be assessed by two metrics: read accuracy and consensus accuracy. Read accuracy is the sequence identity of an individual base-called read relative to a known reference. Consensus accuracy is the identity, relative to the reference, of a consensus sequence built from multiple overlapping reads from the same genetic locus.[3]
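
The two metrics can be illustrated with a minimal Python sketch. It assumes the reads are already aligned to the reference (equal length, with `-` marking gaps), takes identity as matching columns divided by alignment length, and builds the consensus by a simple column-wise majority vote. The function names, the sequences, and the majority-vote approach are illustrative assumptions; real pipelines use dedicated aligners and assemblers or polishers.

```python
from collections import Counter

def identity(aligned_query: str, aligned_ref: str) -> float:
    """Matching columns divided by alignment length; gaps ('-') never match."""
    if len(aligned_query) != len(aligned_ref) or not aligned_ref:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    matches = sum(q == r and q != "-" for q, r in zip(aligned_query, aligned_ref))
    return matches / len(aligned_ref)

def majority_consensus(aligned_reads: list[str]) -> str:
    """Column-wise majority vote over reads aligned to the same coordinates."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )

# Hypothetical data: three reads from the same locus, one error each.
reference = "ACGTACGTAC"
reads = [
    "ACGAACGTAC",  # error at position 3
    "ACGTACCTAC",  # error at position 6
    "TCGTACGTAC",  # error at position 0
]

# Read accuracy: each individual read compared with the reference.
for read in reads:
    print(identity(read, reference))  # 0.9 for every read

# Consensus accuracy: the consensus of all reads compared with the reference.
consensus = majority_consensus(reads)
print(identity(consensus, reference))  # 1.0
```

In this example every read has an identity of 0.9, yet the consensus matches the reference exactly because the errors fall at different positions and are outvoted. Errors that recur at the same position across reads would survive into the consensus.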

References

  1. ^ Richterich, Peter (1998-03-01). "Estimation of Errors in "Raw" DNA Sequences: A Validation Study". Genome Research. 8 (3). Cold Spring Harbor Laboratory: 251–259. doi:10.1101/gr.8.3.251. ISSN 1088-9051. PMC 310698. PMID 9521928.
  2. ^ "Real Time Analysis (RTA) on NextSeq 1000/2000 Overview". Illumina. Retrieved 2025-08-17.
  3. ^ a b Wick, Ryan R.; Judd, Louise M.; Holt, Kathryn E. (2019-06-24). "Performance of neural network basecalling tools for Oxford Nanopore sequencing". Genome Biology. 20 (1). Springer Science and Business Media LLC: 129. doi:10.1186/s13059-019-1727-y. ISSN 1474-760X. PMC 6591954. PMID 31234903.