Microarray analysis techniques: Difference between revisions

Tag (talk | contribs)
Kf811812 (talk | contribs)
Began merge of Significance analysis of microarrays page, and some headings
Line 7:
Microarray data analysis involves several distinct steps, as outlined below. Changing any one of the steps will change the outcome of the analysis, so the MAQC Project<ref>{{cite web | url = http://www.fda.gov/nctr/science/centers/toxicoinformatics/maqc/ | title = MicroArray Quality Control (MAQC) Project | accessdate = 2007-12-26 | author = Dr. Leming Shi, National Center for Toxicological Research | publisher = U.S. Food and Drug Administration }}</ref> was created to identify a set of standard strategies. Some companies use the MAQC protocols to perform a complete analysis.<ref>{{cite web |url=http://www.genusbiosystems.com/services-data.shtml |title=GenUs BioSystems - Services - Data Analysis |accessdate=2008-01-02 |work=}}</ref>
 
==Techniques==
==Creating raw data==
Most microarray manufacturers, such as [[Affymetrix]] and [[Agilent]],<ref>{{cite web|url=http://www.chem.agilent.com/Scripts/PCol.asp?lPage=494 |title=Agilent &#124; DNA Microarrays |accessdate=2008-01-02 |format= |work= |deadurl=yes |archiveurl=https://web.archive.org/web/20071222130157/http://www.chem.agilent.com/Scripts/PCol.asp?lPage=494 |archivedate=December 22, 2007 }}</ref> provide commercial data analysis software with microarray equipment such as plate readers.
===Significance analysis of microarrays (SAM)===
Significance analysis of microarrays (SAM) is a [[statistics|statistical technique]] for determining whether changes in [[gene expression]] are statistically significant. It was established in 2001 by Virginia Tusher, [[Robert Tibshirani]] and [[Gilbert Chu]], and is distributed in an [[R (programming language)|R-package]] by [[Stanford University]].
 
SAM identifies statistically significant genes by carrying out gene-specific [[Student's t-test|t-tests]] and computing a statistic ''d<sub>j</sub>'' for each gene ''j'', which measures the strength of the relationship between gene expression and a response variable.<ref name="R1"/><ref name="R7"/><ref name="R8"/> This analysis uses [[non-parametric statistics]], since the data may not follow a [[normal distribution]]. The response variable describes and groups the data based on experimental conditions. In this method, repeated [[permutations]] of the data are used to determine whether the expression of any gene is significantly related to the response. The use of permutation-based analysis accounts for correlations among genes and avoids [[wikt:Special:Search/parametric|parametric]] assumptions about the distribution of individual genes. This is an advantage over other techniques (e.g., [[ANOVA]] and [[Bonferroni correction]]), which assume equal variance and/or independence of genes.<ref name="R6"/>
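
The permutation idea described above can be illustrated with a simplified sketch for a two-class comparison. It is not the Stanford R package: the function names, the NumPy-based layout (genes in rows, samples in columns), the small constant <code>s0</code> added to the denominator, and the per-gene permutation p-values are all assumptions made for illustration.

```python
import numpy as np

def d_statistic(expr, labels, s0=0.1):
    """Per-gene t-like statistic d_j for a two-class comparison.

    expr   : (genes, samples) array of expression values
    labels : boolean array, True marks samples in the second group
    s0     : small positive constant in the denominator (illustrative value)
    """
    a, b = expr[:, ~labels], expr[:, labels]
    n1, n2 = a.shape[1], b.shape[1]
    diff = b.mean(axis=1) - a.mean(axis=1)
    # Pooled within-group variance, as in a two-sample t-test
    ss_a = ((a - a.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    ss_b = ((b - b.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    pooled = (ss_a + ss_b) / (n1 + n2 - 2)
    se = np.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return diff / (se + s0)

def permutation_pvalues(expr, labels, n_perm=200, s0=0.1, seed=0):
    """Estimate per-gene p-values by shuffling the group labels."""
    rng = np.random.default_rng(seed)
    observed = np.abs(d_statistic(expr, labels, s0))
    exceed = np.zeros(expr.shape[0])
    for _ in range(n_perm):
        permuted = rng.permutation(labels)  # same labels, random order
        exceed += np.abs(d_statistic(expr, permuted, s0)) >= observed
    # Add-one correction keeps the estimate strictly positive
    return (exceed + 1) / (n_perm + 1)
```

Because whole sample labels are shuffled together, every gene is permuted in the same way, which preserves the correlation between genes within each permuted data set.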
==Background correction==
Depending on the type of array, signal related to nonspecific binding of the fluorophore can be subtracted to achieve better results. One approach involves subtracting the average
signal intensity of the area between spots. A variety of tools for background correction and further analysis are available from TIGR,<ref>{{cite web |url=http://www.tigr.org/software/microarray.shtml |title=J. Craig Venter Institute -- Software |accessdate=2008-01-01 |work=}}</ref> Agilent ([[GeneSpring]]),<ref>{{cite web |url=http://www.chem.agilent.com/scripts/pds.asp?lpage=27881 |title=Agilent &#124; GeneSpring GX |accessdate=2008-01-02 |format= |work=}}</ref> and [[Ocimum Bio Solutions]] (Genowiz).<ref>{{cite web |url=http://www3.ocimumbio.com/data-analysis-insights/analytical-tools/genowiz/ |title=Ocimum Biosolutions &#124; Genowiz |accessdate=2009-04-02 |format= |work= |deadurl=yes |archiveurl=https://web.archive.org/web/20091124165434/http://www3.ocimumbio.com/data-analysis-insights/analytical-tools/genowiz/ |archivedate=2009-11-24 |df= }}</ref>
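
A minimal sketch of the between-spot approach follows. It assumes the scanned array is available as a NumPy image together with a boolean mask marking spot pixels; the function name and the clipping of negative values to zero are illustrative choices, not part of any of the tools named above.

```python
import numpy as np

def background_correct(image, spot_mask):
    """Subtract the mean intensity of the area between spots.

    image     : 2-D array of pixel intensities
    spot_mask : boolean array of the same shape, True inside spots
    """
    background = image[~spot_mask].mean()
    corrected = image.astype(float) - background
    # Assumption: negative values after subtraction are set to zero
    return np.clip(corrected, 0.0, None)
```

For example, a spot reading 110 on a background averaging 10 would be corrected to 100.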
 
==Error correction and quality control==
===Quality control===
Entire arrays may have obvious flaws detectable by visual inspection, pairwise comparisons to arrays in the same experimental group, or by analysis of RNA degradation.<ref>{{cite journal |vauthors=Wilson CL, Miller CJ |title=Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis |journal=Bioinformatics |volume=21 |issue=18 |pages=3683–5 |year=2005 |pmid=16076888 |doi=10.1093/bioinformatics/bti605}}</ref> Results may improve by removing these arrays from the analysis entirely.
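
One way to sketch the pairwise comparison is to correlate each array with the others in its experimental group and flag arrays that agree poorly. The function name, the use of Pearson correlation, and the default cutoff are hypothetical; an appropriate threshold would depend on the platform and experiment.

```python
import numpy as np

def flag_outlier_arrays(expr, cutoff=0.8):
    """Flag arrays that correlate poorly with the rest of their group.

    expr   : (genes, arrays) matrix for one experimental group
    cutoff : illustrative threshold on the mean pairwise correlation
    """
    corr = np.corrcoef(expr, rowvar=False)  # arrays x arrays
    n = corr.shape[0]
    # Mean correlation with the other arrays (exclude the diagonal of 1s)
    mean_corr = (corr.sum(axis=1) - 1.0) / (n - 1)
    return mean_corr < cutoff
```

Arrays flagged in this way are candidates for closer inspection rather than automatic removal.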
 
===Background correction===
==Spot filtering==
Depending on the type of array, signal related to nonspecific binding of the fluorophore can be subtracted to achieve better results. One approach involves subtracting the average
signal intensity of the area between spots. A variety of tools for background correction and further analysis are available from TIGR,<ref>{{cite web |url=http://www.tigr.org/software/microarray.shtml |title=J. Craig Venter Institute -- Software |accessdate=2008-01-01 |work=}}</ref> Agilent ([[GeneSpring]]),<ref>{{cite web |url=http://www.chem.agilent.com/scripts/pds.asp?lpage=27881 |title=Agilent &#124; GeneSpring GX |accessdate=2008-01-02 |format= |work=}}</ref> and [[Ocimum Bio Solutions]] (Genowiz).<ref>{{cite web |url=http://www3.ocimumbio.com/data-analysis-insights/analytical-tools/genowiz/ |title=Ocimum Biosolutions &#124; Genowiz |accessdate=2009-04-02 |format= |work= |deadurl=yes |archiveurl=https://web.archive.org/web/20091124165434/http://www3.ocimumbio.com/data-analysis-insights/analytical-tools/genowiz/ |archivedate=2009-11-24 |df= }}</ref>
 
===Spot filtering===
Visual identification of local artifacts, such as printing or washing defects, may likewise suggest the removal of individual spots. This can take a substantial amount of time depending on the quality of array manufacture. In addition, some procedures call for the elimination of all spots with an expression value below a certain intensity threshold.
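
Threshold-based spot filtering can be sketched as follows, assuming spot intensities are held in a NumPy array. The function name, the optional <code>flagged</code> argument for spots rejected on visual inspection, and the use of NaN to mark removed spots are illustrative assumptions.

```python
import numpy as np

def filter_spots(intensities, threshold, flagged=None):
    """Mask spots below an intensity threshold or manually flagged.

    intensities : per-spot expression values
    threshold   : minimum intensity a spot must reach to be kept
    flagged     : optional booleans, True for spots rejected by inspection
    """
    intensities = np.asarray(intensities, dtype=float)
    keep = intensities >= threshold
    if flagged is not None:
        keep &= ~np.asarray(flagged, dtype=bool)
    # Removed spots become NaN so array positions are preserved
    return np.where(keep, intensities, np.nan)
```

Marking removed spots as missing, rather than deleting them, keeps each remaining value aligned with its position on the array.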