Content deleted Content added
Tuankiet65 (talk | contribs) mNo edit summary |
Citation bot (talk | contribs) Add: pmid, s2cid. | You can use this bot yourself. Report bugs here. | Suggested by Abductive | Category:Gene expression | via #UCB_Category |
Line 33:
|-
| [[Comparative genomic hybridization]]
| Assessing genome content in different cells or closely related organisms, as originally described by [[Patrick O. Brown|Patrick Brown]], Jonathan Pollack, [[Ash Alizadeh]] and colleagues at [[Stanford University|Stanford]].<ref name="Pollack et al.">{{cite journal|author=Pollack JR|author2=Perou CM|author3=Alizadeh AA|author4=Eisen MB|author5=Pergamenschikov A|author6=Williams CF|author7=Jeffrey SS|author8=Botstein D|author9=Brown PO|date= 1999|title=Genome-wide analysis of DNA copy-number changes using cDNA microarrays|journal=Nat Genet|volume=23|pages=41–46|pmid=10471496|doi=10.1038/12640|issue=1|s2cid=997032}}</ref><ref name="Moran et al.">{{cite journal|author=Moran G|author2=Stokes C|author3=Thewes S|author4=Hube B|author5=Coleman DC|author6=Sullivan D|date= 2004|title=Comparative genomics using Candida albicans DNA microarrays reveals absence and divergence of virulence-associated genes in Candida dubliniensis|journal=Microbiology|volume=150|pages=3363–3382|pmid=15470115|doi=10.1099/mic.0.27221-0|issue=Pt 10|doi-access=free}}</ref>
|-
| GeneID
Line 45:
|-
| [[SNP array|SNP detection]]
| Identifying [[single nucleotide polymorphism]]s among [[alleles]] within or between populations.<ref name="Hacia et al.">{{cite journal |author=Hacia JG|author2=Fan JB|author3=Ryder O|author4= Jin L|author5=Edgemon K|author6=Ghandour G|author7=Mayer RA|author8= Sun B|author9=Hsie L|author10=Robbins CM|author11=Brody LC|author12=Wang D|author13=Lander ES|author14=Lipshutz R|author15=Fodor SP|author16=Collins FS|date= 1999|title=Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays|journal=Nat Genet|volume=22|pages=164–167|pmid=10369258 | doi = 10.1038/9674|issue=2|s2cid=41718227}}</ref> Several applications of microarrays make use of SNP detection, including [[genotyping]], [[forensic]] analysis, measuring [[Genetic predisposition|predisposition]] to disease, identifying drug-candidates, evaluating [[germline]] mutations in individuals or [[Somatic (biology)|somatic]] mutations in cancers, assessing [[loss of heterozygosity]], or [[genetic linkage]] analysis.
|-
| [[Alternative splicing]] detection
Line 57:
|-
|Double-stranded B-DNA microarrays
|Right-handed double-stranded B-DNA microarrays can be used to characterize novel drugs and biologicals that can be employed to bind specific regions of immobilized, intact, double-stranded DNA. This approach can be used to inhibit gene expression.<ref name="Gagna 895–914">{{Cite journal|title = Novel multistranded, alternative, plasmid and helical transitional DNA and RNA microarrays: implications for therapeutics|journal = Pharmacogenomics|date = 2009-05-01|issn = 1744-8042|pmid = 19450135|pages = 895–914|volume = 10|issue = 5|doi = 10.2217/pgs.09.27|first1 = Claude E.|last1 = Gagna|first2 = W. Clark|last2 = Lambert}}</ref><ref name="Gagna 381–401">{{Cite journal|title = Cell biology, chemogenomics and chemoproteomics - application to drug discovery|journal = Expert Opinion on Drug Discovery|date = 2007-03-01|issn = 1746-0441|pmid = 23484648|pages = 381–401|volume = 2|issue = 3|doi = 10.1517/17460441.2.3.381|first1 = Claude E.|last1 = Gagna|first2 = W.|last2 = Clark Lambert|s2cid = 41959328}}</ref> They also allow for characterization of their structure under different environmental conditions.
|-
|Double-stranded Z-DNA microarrays
Line 81:
In ''spotted microarrays'', the probes are [[oligonucleotide synthesis|oligonucleotide]]s, [[cDNA]] or small fragments of [[PCR]] products that correspond to [[mRNA]]s. The probes are [[oligonucleotide synthesis|synthesized]] prior to deposition on the array surface and are then "spotted" onto glass. A common approach uses an array of fine pins or needles, controlled by a robotic arm, that is dipped into wells containing DNA probes and then deposits each probe at designated locations on the array surface. The resulting "grid" of probes represents the nucleic acid profiles of the prepared probes and is ready to receive complementary cDNA or cRNA "targets" derived from experimental or clinical samples.
This technique is used by research scientists around the world to produce "in-house" printed microarrays from their own labs. These arrays may be easily customized for each experiment, because researchers can choose the probes and printing locations on the arrays, synthesize the probes in their own lab (or collaborating facility), and spot the arrays. They can then generate their own labeled samples for hybridization, hybridize the samples to the array, and finally scan the arrays with their own equipment. This provides a relatively low-cost microarray that may be customized for each study, and avoids the costs of purchasing often more expensive commercial arrays that may represent vast numbers of genes that are not of interest to the investigator.
Publications exist which indicate in-house spotted microarrays may not provide the same level of sensitivity compared to commercial oligonucleotide arrays,<ref name="TRC Standardization">{{cite journal |date=2005 |title=Standardizing global gene expression analysis between laboratories and across platforms |journal=Nat Methods |volume=2 |pages=351–356 |pmid=15846362 |doi=10.1038/nmeth754 |last12=Deng |first12=S |last13=Dressman |first13=HK |last14=Fannin |first14=RD |last15=Farin |first15=FM |last16=Freedman |first16=JH |last17=Fry |first17=RC |last18=Harper |first18=A |last19=Humble |first19=MC |last20=Hurban |first20=P |last21=Kavanagh |first21=TJ |last22=Kaufmann |first22=WK |first23=KF |first24=L |first25=JA |first26=MR |last27=Li |first27=J |first28=YJ |last29=Lobenhofer |first29=EK |last30=Lu |last31=Malek |first31=RL |last32=Milton |first32=S |last33=Nagalla |first33=SR |last34=O'malley |first34=JP |last35=Palmer |first35=VS |last36=Pattee |first36=P |last7=Paules |first7=RS |last38=Perou |first38=CM |last9=Phillips |first39=K |last40=Qin |last41=Qiu |first41=Y |last42=Quigley |first42=SD |last43=Rodland |first43=M |last44=Rusyn |first44=I |last45=Samson |first45= LD|last46= Schwartz|last47=Shi |first47=Y |last48=Shin |last49=Sieber |last50=Slifer |last51=Speer |first51=MC |last52=Spencer |first52=PS |last53=Sproles |first53=DI |last54=Swenberg |first54=JA |last55=Suk|first55= WA |last56=Sullivan |first56=RC |last57=Tian |first57=R |last58=Tennant |first58=RW |last59= Todd |first59=SA |last60=Tucker |first60=CJ |last61=Van Houten |first61=B |last62=Weis |first62=BK |last63=Xuan |first63=S |last64=Zarbl |first64=H |last65=Members Of The Toxicogenomics Research |first65=Consortium |issue=5 |author1=Bammler T, Beyer RP |author2=Consortium, Members of the Toxicogenomics Research |last3=Kerr |last4=Jing |last5=Lapidus |last6=Lasarev |last8=Li |first3=X |first4=LX |first6=DA |first8=JL |first9=SO |first5=S |s2cid=195368323 }}</ref> possibly owing to the small 
batch sizes and reduced printing efficiencies compared with the industrial manufacture of oligo arrays.
In ''oligonucleotide microarrays'', the probes are short sequences designed to match parts of the sequence of known or predicted [[open reading frame]]s. Although oligonucleotide probes are often used in "spotted" microarrays, the term "oligonucleotide array" most often refers to a specific technique of manufacturing. Oligonucleotide arrays are produced by printing short oligonucleotide sequences designed to represent a single gene or family of gene splice-variants by [[oligonucleotide synthesis|synthesizing]] this sequence directly onto the array surface instead of depositing intact sequences. Sequences may be longer (60-mer probes such as the [[Agilent]] design) or shorter (25-mer probes produced by [[Affymetrix]]) depending on the desired purpose: longer probes are more specific to individual target genes, whereas shorter probes may be spotted in higher density across the array and are cheaper to manufacture.
Line 157:
Due to the biological complexity of gene expression, the considerations of experimental design that are discussed in the [[expression profiling]] article are of critical importance if statistically and biologically valid conclusions are to be drawn from the data.
There are three main elements to consider when designing a microarray experiment. First, replication of the biological samples is essential for drawing conclusions from the experiment. Second, technical replicates (two RNA samples obtained from each experimental unit) help to ensure precision and allow for testing differences within treatment groups. The biological replicates include independent RNA extractions and technical replicates may be two [[wikt:Special:Search/aliquot|aliquots]] of the same extraction. Third, spots of each cDNA clone or oligonucleotide are present as replicates (at least duplicates) on the microarray slide, to provide a measure of technical precision in each hybridization. It is critical that information about the sample preparation and handling is discussed, in order to help identify the independent units in the experiment and to avoid inflated estimates of [[statistical significance]].<ref>{{cite journal |title=Fundamentals of experimental design for cDNA microarrays | journal=Nature Genetics |series=supplement |volume=32 |date=2002 | doi=10.1038/ng1031 |url=http://www.vmrf.org/research-websites/gcf/Forms/Churchill.pdf |pages=490–5 |format=– <sup>[https://scholar.google.co.uk/scholar?hl=en&lr=&q=intitle%3AFundamentals+of+experimental+design+for+cDNA+microarrays&as_publication=Nature+genetics+supplement&as_ylo=2002&as_yhi=2002&btnG=Search Scholar search]</sup> |pmid=12454643 |last1=Churchill |first1=GA | s2cid=15412245 |url-status=dead |archiveurl=https://web.archive.org/web/20050508225647/http://www.vmrf.org/research-websites/gcf/Forms/Churchill.pdf |archivedate=2005-05-08 |accessdate=12 December 2013}}</ref>
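The distinction between biological and technical replicates can be illustrated with a minimal Python sketch. The sample names and log2 expression values are hypothetical, and averaging technical replicates is only one simple way of respecting the independent units; it is not taken from the cited source. Each biological sample contributes a single value, so the number of independent observations is not inflated by repeated measurements of the same RNA extraction.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log2 expression values for one gene.
# Each record is (biological sample, value from one technical replicate).
measurements = [
    ("mouse1", 7.9), ("mouse1", 8.1),
    ("mouse2", 8.6), ("mouse2", 8.4),
    ("mouse3", 9.0), ("mouse3", 9.2),
]

def collapse_technical_replicates(records):
    """Average technical replicates so each biological sample yields one value."""
    by_unit = defaultdict(list)
    for unit, value in records:
        by_unit[unit].append(value)
    return {unit: mean(values) for unit, values in by_unit.items()}

per_unit = collapse_technical_replicates(measurements)

# Six measurements were taken, but only three units are independent.
n_independent = len(per_unit)
print(per_unit, n_independent)
```

Treating all six measurements as independent would double the apparent sample size and could lead to the inflated estimates of significance described above.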
=== Standardization ===
Line 176:
|date= 2013|title=Classification Analysis of DNA Microarrays|publisher=John Wiley and Sons|isbn=978-0-470-17081-6|url=http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470170816.html}}</ref> This type of approach is not hypothesis-driven, but rather is based on iterative pattern recognition or statistical learning methods to find an "optimal" number of clusters in the data. Examples of unsupervised analysis methods include self-organizing maps, neural gas, k-means cluster analysis,<ref>De Souto M et al. (2008) Clustering cancer gene expression data: a comparative study, BMC Bioinformatics, 9(497).</ref> hierarchical cluster analysis, Genomic Signal Processing based clustering<ref>Istepanian R, Sungoor A, Nebel J-C (2011) Comparative Analysis of Genomic Signal Processing for Microarray data Clustering, IEEE Transactions on NanoBioscience, 10(4): 225-238.</ref> and model-based cluster analysis. For some of these methods the user also has to define a distance measure between pairs of objects. Although the Pearson correlation coefficient is usually employed, several other measures have been proposed and evaluated in the literature.<ref>{{cite journal|last1=Jaskowiak|first1=Pablo A|last2=Campello|first2=Ricardo JGB|last3=Costa|first3=Ivan G|title=On the selection of appropriate distances for gene expression data clustering|journal=BMC Bioinformatics|volume=15|issue=Suppl 2|pages=S2|doi=10.1186/1471-2105-15-S2-S2|pmid=24564555|pmc=4072854|year=2014}}</ref> The input data used in class discovery analyses are commonly based on lists of genes having high informativeness (low noise) based on low values of the coefficient of variation or high values of Shannon entropy, etc. The determination of the most likely or optimal number of clusters obtained from an unsupervised analysis is called cluster validity. 
Some commonly used metrics for cluster validity are the silhouette index, Davies-Bouldin index,<ref>Bolshakova N, Azuaje F (2003) Cluster validation techniques for genome expression data, Signal Processing, Vol. 83, pp. 825–833.</ref> Dunn's index, or Hubert's <math>\Gamma</math> statistic.
* Class prediction analysis: This approach, called supervised classification, establishes the basis for developing a predictive model into which future unknown test objects can be input in order to predict the most likely class membership of the test objects. Supervised analysis<ref name="Peterson"/> for class prediction involves use of techniques such as linear regression, k-nearest neighbor, learning vector quantization, decision tree analysis, random forests, naive Bayes, logistic regression, kernel regression, artificial neural networks, support vector machines, [[mixture of experts]], and supervised neural gas. In addition, various metaheuristic methods are employed, such as [[genetic algorithm]]s, covariance matrix self-adaptation, [[particle swarm optimization]], and [[ant colony optimization]]. Input data for class prediction are usually based on filtered lists of genes which are predictive of class, determined using classical hypothesis tests (next section), Gini diversity index, or information gain (entropy).
* Hypothesis-driven statistical analysis: Statistically significant changes in gene expression are commonly identified using the [[t-test]], [[ANOVA]], [[Bayesian method]],<ref name="Ben-GalShani2005">{{cite journal|last1=Ben Gal|first1=I.|last2=Shani|first2=A.|last3=Gohr|first3=A.|last4=Grau|first4=J.|last5=Arviv|first5=S.|last6=Shmilovici|first6=A.|last7=Posch|first7=S.|last8=Grosse|first8=I.|title=Identification of transcription factor binding sites with variable-order Bayesian networks|journal=Bioinformatics|volume=21|issue=11|year=2005|pages=2657–2666|issn=1367-4803|doi=10.1093/bioinformatics/bti410|pmid=15797905|doi-access=free}}</ref> and [[Mann–Whitney test]] methods tailored to microarray data sets, which take into account [[multiple comparisons]]<ref>Yuk Fai Leung and Duccio Cavalieri, Fundamentals of cDNA microarray data analysis. Trends in Genetics Vol.19 No.11 November 2003.</ref> or [[cluster analysis]].<ref name="Priness2007">{{cite journal|author=Priness I.|author2=Maimon O.|author3=Ben-Gal I.|date=2007|title=Evaluation of gene-expression clustering via mutual information distance measure|journal=BMC Bioinformatics|volume=8|issue=1|page=111|doi=10.1186/1471-2105-8-111|pmid=17397530|pmc=1858704}}</ref> These methods assess statistical power based on the variation present in the data and the number of experimental replicates, and can help minimize [[Type I and type II errors]] in the analyses.<ref name="Wei">{{cite journal|author=Wei C |author2=Li J |author3=Bumgarner RE|date= 2004|title=Sample size for detecting differentially expressed genes in microarray experiments|journal=BMC Genomics|volume=5|pages=87|pmid=15533245|doi=10.1186/1471-2164-5-87|pmc=533874}}</ref>
<!-- {{Citation needed|date=July 2008}}as in many other cases where authorities disagree, a sound conservative approach is to directly compare different normalization methods to determine the effects of these different methods on the results obtained. This can be done, for example, by investigating the performance of various methods on data from "spike-in" experiments. {{Citation needed|date=July 2008}} -->
* Dimensional reduction: Analysts often reduce the number of dimensions (genes) prior to data analysis.<ref name="Peterson"/> This may involve linear approaches such as principal components analysis (PCA), or non-linear manifold learning (distance metric learning) using kernel PCA, diffusion maps, Laplacian eigenmaps, locally linear embedding, locality preserving projections, and Sammon's mapping.
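As an illustration of the distance measure mentioned under class discovery, the following Python sketch computes a Pearson correlation distance between two expression profiles. It is a minimal example under stated assumptions: the function name <code>pearson_distance</code>, the convention of defining the distance as 1 − ''r'', and the gene profiles are all hypothetical and are not drawn from the cited sources.

```python
import math

def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of two profiles.

    The result is 0 for perfectly correlated profiles and 2 for perfectly
    anti-correlated ones. Defining the distance as 1 - r is an assumption
    of this sketch; other conventions exist.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / math.sqrt(var_x * var_y)

# Hypothetical log-ratio profiles of three genes across five arrays.
gene_a = [0.1, 0.8, 1.5, 2.1, 2.9]
gene_b = [1.1, 1.8, 2.5, 3.1, 3.9]       # gene_a shifted by +1: same shape
gene_c = [-0.1, -0.8, -1.5, -2.1, -2.9]  # gene_a negated: opposite shape

d_ab = pearson_distance(gene_a, gene_b)
d_ac = pearson_distance(gene_a, gene_c)
print(d_ab, d_ac)
```

Because the measure depends only on the shape of a profile and not on its absolute level, <code>gene_a</code> and <code>gene_b</code> are treated as nearly identical even though their values differ by a constant offset. A clustering method that uses this distance would therefore group genes by co-regulation pattern and ignore differences in expression magnitude.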
Line 219:
| date = July 2008
| pmid=18516045
| s2cid = 205418589
}}</ref><ref name="wang2009">{{Cite journal
| doi = 10.1038/nrg2484
|