Transcriptomics technologies

==== Quantification ====
[[File:Transcriptomes_heatmap_example.svg|thumb|upright=1.5|''[[Heatmap]] identification of gene co-expression patterns across different samples.'' Each column contains the measurements for gene expression change for a single sample. Relative gene expression is indicated by colour: high-expression (red), median-expression (white) and low-expression (blue). Genes and samples with similar expression profiles can be automatically grouped (left and top trees). Samples may be different individuals, tissues, environments or health conditions. In this example, expression of gene set 1 is high and expression of gene set 2 is low in samples 1, 2, and 3.<ref name="Lowe_2017" /><ref>{{cite journal | vauthors = Gehlenborg N, O'Donoghue SI, Baliga NS, Goesmann A, Hibbs MA, Kitano H, Kohlbacher O, Neuweger H, Schneider R, Tenenbaum D, Gavin AC | title = Visualization of omics data for systems biology | language = En | journal = Nature Methods | volume = 7 | issue = 3 Suppl | pages = S56–68 | date = March 2010 | pmid = 20195258 | doi = 10.1038/nmeth.1436 | s2cid = 205419270 }}</ref>]]
Quantification of sequence alignments may be performed at the gene, exon, or transcript level.<ref name="Thind">{{cite journal | vauthors = Thind AS, Monga I, Thakur PK, Kumari P, Dindhoria K, Krzak M, Ranson M, Ashford B | title = Demystifying emerging bulk RNA-Seq applications: the application and utility of bioinformatic methodology | journal = Briefings in Bioinformatics | volume = 22 | issue = 6 | date = Nov 2021 | pmid = 34329375 | doi = 10.1093/bib/bbab259}}</ref><ref name="#24020486" /> Typical outputs include a table of read counts for each feature supplied to the software; for example, for genes in a [[general feature format]] file. Gene and exon read counts can be calculated with tools such as HTSeq.<ref name="#25260700">{{cite journal | vauthors = Anders S, Pyl PT, Huber W | title = HTSeq—a Python framework to work with high-throughput sequencing data | journal = Bioinformatics | volume = 31 | issue = 2 | pages = 166–9 | date = January 2015 | pmid = 25260700 | pmc = 4287950 | doi = 10.1093/bioinformatics/btu638 }}</ref> Quantification at the transcript level is more complicated and requires probabilistic methods to estimate transcript isoform abundance from short read information; for example, using the Cufflinks software.<ref name="#20436464" /> Reads that align equally well to multiple locations must be identified and either removed, assigned to one of the possible locations, or assigned to the most probable location.
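
The probabilistic assignment of ambiguous reads can be illustrated with a minimal expectation-maximisation sketch. It is not the algorithm of Cufflinks or any other named tool: the function name <code>estimate_abundances</code> and the toy read counts are hypothetical, and transcript length, sequencing bias and alignment quality are all ignored. Each read is represented only by the set of transcripts it is compatible with, and reads compatible with several transcripts are shared out in proportion to the current abundance estimates.

```python
from collections import Counter


def estimate_abundances(read_compat, n_transcripts, n_iter=200):
    """Estimate relative transcript abundances by expectation-maximisation.

    read_compat: one tuple per read, holding the indices of the transcripts
    that the read is compatible with (hypothetical input format).
    Simplifying assumption: transcript lengths are ignored.
    """
    # Group reads that share the same set of compatible transcripts.
    classes = Counter(tuple(sorted(set(c))) for c in read_compat if c)
    total = sum(classes.values())
    if total == 0:
        raise ValueError("no reads with at least one compatible transcript")

    # Start from a uniform abundance estimate.
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(n_iter):
        expected = [0.0] * n_transcripts
        # E-step: split each group of reads among its transcripts in
        # proportion to the current abundance estimates.
        for members, count in classes.items():
            denom = sum(theta[t] for t in members)
            for t in members:
                expected[t] += count * theta[t] / denom
        # M-step: renormalise the expected read counts to proportions.
        theta = [e / total for e in expected]
    return theta


# Toy data: 60 reads unique to transcript 0, 20 unique to transcript 1,
# and 20 reads that fit both transcripts equally well.
reads = [(0,)] * 60 + [(1,)] * 20 + [(0, 1)] * 20
theta = estimate_abundances(reads, n_transcripts=2)
print([round(x, 3) for x in theta])
```

In this toy case the 20 ambiguous reads end up divided three to one in favour of transcript 0, because the uniquely assigned reads already indicate that it is the more abundant isoform.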
 
Some quantification methods avoid the need for an exact alignment of each read to a reference sequence altogether. The kallisto method combines pseudoalignment and quantification into a single step that runs two orders of magnitude faster than contemporary methods such as TopHat/Cufflinks, with a lower computational burden.<ref name="#27043002">{{cite journal | vauthors = Bray NL, Pimentel H, Melsted P, Pachter L | title = Near-optimal probabilistic RNA-seq quantification | journal = Nature Biotechnology | volume = 34 | issue = 5 | pages = 525–7 | date = May 2016 | pmid = 27043002 | doi = 10.1038/nbt.3519 | s2cid = 205282743 }}</ref>
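
The idea behind pseudoalignment can be sketched as follows. This is a simplified illustration, not kallisto's implementation: the helper names <code>build_kmer_index</code> and <code>pseudoalign</code>, the toy sequences and the ''k''-mer length are hypothetical, and a read containing a ''k''-mer that is absent from the index is treated here as unmatched, which is an assumption of the sketch. Rather than computing where a read aligns, the sketch only determines which transcripts contain every ''k''-mer of the read.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map every k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the set of transcripts compatible with all k-mers of a read.

    Assumption of this sketch: a k-mer missing from the index makes the
    read unmatched (an empty set is returned).
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if not hits:
            return set()
        # Intersect the transcript sets of successive k-mers.
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()
    # Reads shorter than k yield no k-mers and therefore no match.
    return compatible if compatible is not None else set()


# Two toy transcripts sharing a common prefix but differing at the 3' end.
transcripts = {"tA": "AAACCCGGGTTT", "tB": "AAACCCGGGACG"}
index = build_kmer_index(transcripts, k=4)

print(sorted(pseudoalign("ACCCGG", index, 4)))  # read from the shared region
print(sorted(pseudoalign("GGGTTT", index, 4)))  # read from the end of tA
```

The resulting compatibility sets are the kind of input that a probabilistic abundance estimate can work from, which is why no base-by-base alignment is required.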