Began merge of Significance analysis of microarrays page, and some headings |
Microarray data analysis involves several distinct steps, as outlined below. Changing any one of the steps will change the outcome of the analysis, so the MAQC Project<ref>{{cite web | url = http://www.fda.gov/nctr/science/centers/toxicoinformatics/maqc/ | title = MicroArray Quality Control (MAQC) Project | accessdate = 2007-12-26 | author = Dr. Leming Shi, National Center for Toxicological Research | publisher = U.S. Food and Drug Administration }}</ref> was created to identify a set of standard strategies. Companies exist that use the MAQC protocols to perform a complete analysis.<ref>{{cite web |url=http://www.genusbiosystems.com/services-data.shtml |title=GenUs BioSystems - Services - Data Analysis |accessdate=2008-01-02 |work=}}</ref>
==Techniques==
Most microarray manufacturers, such as [[Affymetrix]] and [[Agilent]],<ref>{{cite web|url=http://www.chem.agilent.com/Scripts/PCol.asp?lPage=494 |title=Agilent | DNA Microarrays |accessdate=2008-01-02 |format= |work= |deadurl=yes |archiveurl=https://web.archive.org/web/20071222130157/http://www.chem.agilent.com/Scripts/PCol.asp?lPage=494 |archivedate=December 22, 2007 }}</ref> provide commercial data analysis software with microarray equipment such as plate readers.
===Significance analysis of microarrays (SAM)===
Significance analysis of microarrays (SAM) is a [[statistics|statistical technique]] for determining whether changes in [[gene expression]] are statistically significant. It was established in 2001 by Virginia Tusher, [[Robert Tibshirani]] and [[Gilbert Chu]], and is distributed by [[Stanford University]] as an [[R (programming language)|R package]].
SAM identifies statistically significant genes by carrying out gene-specific [[Student's t-test|t-tests]], computing for each gene ''j'' a statistic ''d<sub>j</sub>'' that measures the strength of the relationship between gene expression and a response variable.<ref name="R1"/><ref name="R7"/><ref name="R8"/> The response variable describes and groups the data based on experimental conditions. Because the data may not follow a [[normal distribution]], the analysis uses [[non-parametric statistics]]: repeated [[permutations]] of the data are used to determine whether the expression of any gene is significantly related to the response. The permutation-based analysis accounts for correlations between genes and avoids [[wikt:Special:Search/parametric|parametric]] assumptions about the distribution of individual genes. This is an advantage over other techniques (e.g., [[ANOVA]] and [[Bonferroni correction]]), which assume equal variance and/or independence of genes.<ref name="R6"/>
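The permutation procedure can be illustrated with a minimal Python sketch for a two-class, unpaired design. It is not the Stanford R package: the function names, the fixed fudge constant <code>s0</code> (which SAM normally estimates from the data), and the per-gene use of the threshold <code>delta</code> are simplifying assumptions made for illustration only.

```python
import math
import random
from statistics import mean


def d_statistic(values, labels, s0=0.1):
    """Relative difference d_j for one gene; labels are 1 or 2 (two-class design)."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g2 = [v for v, lab in zip(values, labels) if lab == 2]
    n1, n2 = len(g1), len(g2)
    m1, m2 = mean(g1), mean(g2)
    # Pooled sum of squares, as in a two-sample t-test.
    ss = sum((v - m1) ** 2 for v in g1) + sum((v - m2) ** 2 for v in g2)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    # s0 (assumed constant here) keeps low-variance genes from dominating.
    return (m2 - m1) / (s + s0)


def sam_sketch(data, labels, n_perm=200, delta=1.0, s0=0.1, seed=0):
    """Return indices of genes whose observed d differs from its
    permutation-based expected value by more than delta (simplified rule)."""
    rng = random.Random(seed)
    observed = [d_statistic(row, labels, s0) for row in data]
    order = sorted(range(len(data)), key=lambda j: observed[j])
    expected = [0.0] * len(data)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)  # permute the response variable
        d_perm = sorted(d_statistic(row, perm, s0) for row in data)
        for rank, d in enumerate(d_perm):
            expected[rank] += d / n_perm  # average the order statistics
    return [j for rank, j in enumerate(order)
            if abs(observed[j] - expected[rank]) > delta]


# Hypothetical data: gene 0 differs strongly between the two groups.
labels = [1, 1, 1, 1, 2, 2, 2, 2]
data = [
    [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0],
    [2.0, 2.3, 1.9, 2.2, 2.1, 2.0, 2.2, 1.8],
    [3.1, 2.9, 3.0, 3.2, 3.0, 3.1, 2.8, 3.3],
]
significant = sam_sketch(data, labels, n_perm=100)
```

Permuting the labels rather than each gene separately keeps the correlation structure between genes intact, which is the property the text attributes to the permutation-based analysis.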
==
===Quality control===
Entire arrays may have obvious flaws detectable by visual inspection, by pairwise comparison to arrays in the same experimental group, or by analysis of RNA degradation.<ref>{{cite journal |vauthors=Wilson CL, Miller CJ |title=Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis |journal=Bioinformatics |volume=21 |issue=18 |pages=3683–5 |year=2005 |pmid=16076888 |doi=10.1093/bioinformatics/bti605}}</ref> Removing these arrays from the analysis entirely may improve the results.
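One way to make the pairwise comparison concrete is sketched below in Python. The choice of Pearson correlation, the use of the median, the threshold of 0.8 and the function names are all assumptions for illustration; they are not taken from Simpleaffy or any other package.

```python
import math
from statistics import median


def pearson(a, b):
    """Pearson correlation between two equally long intensity vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def flag_outlier_arrays(arrays, min_median_corr=0.8):
    """Flag arrays whose median correlation with the other arrays in the
    same experimental group falls below an (assumed) threshold."""
    flagged = []
    for name, values in arrays.items():
        corrs = [pearson(values, other)
                 for other_name, other in arrays.items() if other_name != name]
        if median(corrs) < min_median_corr:
            flagged.append(name)
    return flagged


# Hypothetical group of four replicate arrays; "d" disagrees with the rest.
group = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.1, 2.0, 3.2, 3.9, 5.1],
    "c": [1.0, 2.1, 2.9, 4.2, 4.8],
    "d": [5.0, 1.0, 4.0, 2.0, 3.0],
}
outliers = flag_outlier_arrays(group)
```

The median is used instead of the mean so that a single bad array does not drag down the score of the good arrays it is compared against.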
===Background correction===
Depending on the type of array, signal related to nonspecific binding of the fluorophore can be subtracted to achieve better results. One approach involves subtracting the average signal intensity of the area between spots. A variety of tools for background correction and further analysis are available from TIGR,<ref>{{cite web |url=http://www.tigr.org/software/microarray.shtml |title=J. Craig Venter Institute -- Software |accessdate=2008-01-01 |work=}}</ref> Agilent ([[GeneSpring]]),<ref>{{cite web |url=http://www.chem.agilent.com/scripts/pds.asp?lpage=27881 |title=Agilent | GeneSpring GX |accessdate=2008-01-02 |format= |work=}}</ref> and [[Ocimum Bio Solutions]] (Genowiz).<ref>{{cite web |url=http://www3.ocimumbio.com/data-analysis-insights/analytical-tools/genowiz/ |title=Ocimum Biosolutions | Genowiz |accessdate=2009-04-02 |format= |work= |deadurl=yes |archiveurl=https://web.archive.org/web/20091124165434/http://www3.ocimumbio.com/data-analysis-insights/analytical-tools/genowiz/ |archivedate=2009-11-24 |df= }}</ref>
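The subtraction approach can be sketched in a few lines of Python. The function name, the input layout (one list of spot intensities and one list of between-spot intensities) and the floor at zero are assumptions for illustration, not the behaviour of any of the tools named above.

```python
def background_correct(spot_signals, between_spot_signals, floor=0.0):
    """Subtract the average between-spot intensity from every spot signal.

    Corrected values are clipped at `floor` (an assumed convention) so that
    spots dimmer than the background do not become negative.
    """
    background = sum(between_spot_signals) / len(between_spot_signals)
    return [max(signal - background, floor) for signal in spot_signals]


# Hypothetical intensities in arbitrary fluorescence units.
corrected = background_correct([120.0, 95.0, 300.0], [100.0, 90.0, 110.0])
```

Clipping at zero is only one option; some workflows instead keep negative values or replace them with a small positive constant before log transformation.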
===Spot filtering===
Visual identification of local artifacts, such as printing or washing defects, may likewise suggest the removal of individual spots. This can take a substantial amount of time depending on the quality of array manufacture. In addition, some procedures call for the elimination of all spots with an expression value below a certain intensity threshold.
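Both kinds of filtering can be combined in a short Python sketch. The function name, the dictionary layout and the threshold value are hypothetical; the set of manually flagged spots stands in for the result of a visual inspection.

```python
def filter_spots(spots, min_intensity, flagged=()):
    """Keep spots that meet the intensity threshold and were not flagged
    as artifacts (e.g. printing or washing defects) during inspection."""
    flagged = set(flagged)
    return {
        spot_id: intensity
        for spot_id, intensity in spots.items()
        if intensity >= min_intensity and spot_id not in flagged
    }


# Hypothetical spot intensities keyed by spot identifier.
spots = {"s1": 850.0, "s2": 40.0, "s3": 1200.0, "s4": 300.0}
kept = filter_spots(spots, min_intensity=100.0, flagged=["s3"])
```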