Randomized experiment: Difference between revisions

 
(32 intermediate revisions by 21 users not shown)
Line 1:
{{short description|Experiment using randomness in some aspect, usually to aid in removal of bias}}
[[Image:Flowchart of Phases of Parallel Randomized Trial - Modified from CONSORT 2010.png|thumb|250px|right|Flowchart of four phases (enrollment, intervention allocation, follow-up, and data analysis) of a parallel randomized trial of two groups, modified from the [[Consolidated Standards of Reporting Trials|CONSORT 2010 Statement]]<ref name="Schulz-2010">{{Cite journal | author = Schulz KF, Altman DG, Moher D; for the CONSORT Group | title = CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials | journal = BMJ | volume = 340 | pages = c332 | year = 2010 | doi = 10.1136/bmj.c332 | url = http://www.bmj.com/cgi/content/full/340/mar23_1/c332 | pmid = 20332509 | pmc = 2844940 }}</ref>]] In [[scientific method|science]], '''randomized experiments''' are the [[experiment]]s that allow the greatest reliability and validity of statistical estimates of treatment effects. Randomization-based inference is especially important in [[experimental design]] and in [[survey sampling]].
 
== Overview ==
Line 9 ⟶ 10:
 
==Online randomized controlled experiments==
Websites can run randomized controlled experiments<ref>{{cite book
| last = Kohavi
| first = Ron
Line 23:
| publisher = Springer
| year = 2015
| chapter-url = http://www.exp-platform.com/Documents/2015%20Online%20Controlled%20Experiments_EncyclopediaOfMLDM.pdf
}}</ref> to create a feedback loop.<ref name="surveyarticle">{{cite journal
| author1= Kohavi, Ron |author2=Longbotham, Roger |author3=Sommerfield, Dan |author4=Henne, Randal M.
| title = Controlled experiments on the web: survey and practical guide
| journal = Data Mining and Knowledge Discovery
Line 31:
| issue = 1
| pages = 140–181
| publisher = Springer
| location = Berlin
| year = 2009
| url = http://www.springerlink.com/content/r28m75k77u145115/
| issn = 1384-5810
| doi = 10.1007/s10618-008-0114-1
| doi-access = free
}}</ref> Key differences between offline experimentation and online experiments include:<ref name="surveyarticle"/><ref name="puzzlingResults">{{cite conference
| url= http://www.exp-platform.com/Pages/PuzzingOutcomesExplained.aspx
| title=Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained
Line 43 ⟶ 41:
| first = Ron
|author2=Deng, Alex |author3=Frasca, Brian |author4=Longbotham, Roger |author5=Walker, Toby |author6= Xu Ya
| book-title = Proceedings of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
| year = 2012}}</ref>
 
* Logging: user interactions can be logged reliably.
* Number of users: large sites, such as Amazon, Bing/Microsoft, and Google, run experiments, each with over a million users.
* Number of concurrent experiments: large sites run tens of overlapping, or concurrent, experiments.<ref name="ExPScale">{{cite book
| last = Kohavi
| first = Ron
|author2=Deng Alex |author3=Frasca Brian |author4=Walker Toby |author5=Xu Ya |author6= Nils Pohlmann
| title = Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
| chapter = Online controlled experiments at large scale
| date = 2013
| volume = 19
| pages = 1168–1176
| publisher = ACM
| location = Chicago, Illinois, USA
| url = http://dx.doi.org/10.1145/2487575.2488217
| doi = 10.1145/2487575.2488217
| isbn = 9781450321747
| s2cid = 13224883
}}</ref>
* Robots, whether [[web crawlers]] from valid sources or malicious [[internet bots]].{{clarify|reason=what about them? do they affect the reliability of the results?|date=May 2019}}
* Ability to ramp up experiments from low percentages to higher percentages.
* Speed / performance has a significant impact on key metrics.<ref name="surveyarticle" /><ref name="ExPRulesOfThumb">
{{cite book
| last = Kohavi
| first = Ron
| author2=Deng Alex |author3=Longbotham Roger |author4=Xu Ya
| title = Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
| chapter = Seven rules of thumb for web site experimenters
| date = 2014
| chapter-url = http://www.exp-platform.com/Pages/SevenRulesofThumbforWebSiteExperimenters.aspx
| volume = 20
| pages = 1857–1866
| publisher = ACM
| location = New York, New York, USA
| doi = 10.1145/2623330.2623341
| isbn = 9781450329569
| s2cid = 207214362
}}</ref>
 
* Ability to use the pre-experiment period as an A/A test to reduce variance.<ref name="cuped">{{cite conference
Line 85 ⟶ 86:
| last = Deng
| first = Alex |author2=Xu, Ya |author3=Kohavi, Ron |author4=Walker, Toby
| book-title = WSDM 2013: Sixth ACM International Conference on Web Search and Data Mining
| year = 2013}}</ref>
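
The last point, using pre-experiment data to reduce variance, can be illustrated with a covariate adjustment in the style of the cited CUPED reference. The following is only a minimal sketch under stated assumptions: the data are simulated, and the names <code>cuped_adjust</code>, <code>pre</code>, and <code>post</code> are hypothetical, not taken from any experimentation platform. Each user's in-experiment metric is shifted by a multiple of that user's centred pre-experiment metric, which leaves the mean unchanged but can shrink the variance.

```python
import random
import statistics


def cuped_adjust(y, x):
    """Adjust metric y using pre-experiment covariate x (illustrative sketch).

    theta is the regression slope cov(x, y) / var(x); subtracting
    theta * (x - mean(x)) keeps the mean of y but removes the part of
    its variance that is explained by x.
    """
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / statistics.variance(x)
    return [yi - theta * (xi - mean_x) for xi, yi in zip(x, y)]


# Simulated data (an assumption): the in-experiment metric is strongly
# correlated with the same user's pre-experiment metric.
rng = random.Random(42)
pre = [rng.gauss(5.0, 2.0) for _ in range(5000)]
post = [xi + rng.gauss(0.0, 1.0) for xi in pre]

adjusted = cuped_adjust(post, pre)
raw_var = statistics.variance(post)
adj_var = statistics.variance(adjusted)
print(f"variance before: {raw_var:.2f}, after adjustment: {adj_var:.2f}")
```

The size of the reduction depends on how strongly the pre-experiment and in-experiment metrics are correlated; with uncorrelated data the adjustment would have little effect.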
 
Line 92 ⟶ 93:
{{main|History of experiments}}
 
A controlled experiment appears to have been suggested in the Old Testament's [[Book of Daniel]]. King Nebuchadnezzar proposed that some Israelites eat "a daily amount of food and wine from the king's table." Daniel preferred a [[Vegetarian cuisine|vegetarian]] diet, but the official was concerned that the king would "see you looking worse than the other young men your age? The king would then have my head because of you." Daniel then proposed the following controlled experiment: "Test your servants for ten days. Give us nothing but vegetables to eat and water to drink. Then compare our appearance with that of the young men who eat the royal food, and treat your servants in accordance with what you see" ([[Daniel 1]]:12–13).<ref>{{cite journal
| last = Neuhauser
| first = D
|author2=Diaz, M
| title = Daniel: using the Bible to teach quality improvement methods
| journal = Quality and Safety in Health Care
| volume = 13
| issue = 2
| pages = 153–155
| year = 2004
| url = http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1743807/pdf/v013p00153.pdf
| doi = 10.1136/qshc.2003.009480
| pmid=15069225
Line 116:
</ref>
 
Randomized experiments were institutionalized in psychology and education in the late 1800s, following the invention of randomized experiments by [[Charles Sanders Peirce|C. S. Peirce]].<ref>{{cite journal| author=[[Charles Sanders Peirce]] and [[Joseph Jastrow]]| year=1885|title=On Small Differences in Sensation| journal=Memoirs of the National Academy of Sciences|volume=3|pages=73–83|url=http://psychclassics.yorku.ca/Peirce/small-diffs.htm}}</ref><ref>{{cite journal| doi=10.1086/354775| first=Ian |last=Hacking| author-link=Ian Hacking | title=Telepathy: Origins of Randomization in Experimental Design|journal=[[Isis (journal)| Isis]]| issue=3| volume=79| date=September 1988 |pages=427–451| mr = 1013489| jstor=234674| s2cid=52201011 }}</ref><ref>{{cite journal| doi=10.1086/444032|author=[[Stephen M. Stigler]]|title=A Historical View of Statistical Concepts in Psychology and Educational Research| journal=American Journal of Education| volume=101| issue=1| date=November 1992|pages=60–70|s2cid=143685203|author-link=Stephen M. Stigler}}</ref><ref>{{cite journal|doi=10.1086/383850|author=Trudy Dehue|title=Deception, Efficiency, and Random Groups: Psychology and the Gradual Origination of the Random Group Design|journal=[[Isis (journal)|Isis]]| volume=88| issue=4| date=December 1997| pages=653–673|pmid=9519574|s2cid=23526321|url=https://pure.rug.nl/ws/files/71855616/237831.pdf}}</ref>
Outside of psychology and education, randomized experiments were popularized by [[R.A. Fisher]] in his book ''[[Statistical Methods for Research Workers]]'', which also introduced additional principles of experimental design.
 
==Statistical interpretation==
{{Expand section|date=September 2012}}
 
The [[Rubin Causal Model]] provides a common way to describe a randomized experiment. The model assumes that there are two potential outcomes for each unit in the study: the outcome if the unit receives the treatment and the outcome if the unit does not receive the treatment. The difference between these two potential outcomes is known as the treatment effect, which is the causal effect of the treatment on the outcome. The model also accounts for potential confounding factors, which are factors that could affect both the treatment and the outcome. By controlling for these confounding factors, the model helps to ensure that any observed treatment effect is truly causal and not simply the result of other factors that are correlated with both the treatment and the outcome. While the Rubin Causal Model provides a framework for defining the causal parameters (i.e., the effects of a randomized treatment on an outcome), the analysis of experiments can take a number of forms. Most commonly, randomized experiments are analyzed using [[ANOVA]], [[Student's t-test]], [[regression analysis]], or a similar [[Statistical hypothesis testing|statistical test]].
 
The Rubin Causal Model is a useful framework for understanding how to estimate the causal effect of the treatment, even when there are confounding variables that may affect the outcome. This model specifies that the causal effect of the treatment is the difference in the outcomes that would have been observed for each individual if they had received the treatment and if they had not received the treatment. In practice, it is not possible to observe both potential outcomes for the same individual, so statistical methods are used to estimate the causal effect using data from the experiment.
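
The potential-outcomes description can be made concrete with a small simulation. The sketch below is illustrative only and rests on assumptions that do not come from any cited source: outcomes are drawn from a normal distribution, the treatment effect is the same constant for every unit, and the function name <code>simulate_experiment</code> is hypothetical. Each unit has two potential outcomes, random assignment reveals exactly one of them, and the difference in group means estimates the average treatment effect.

```python
import random
import statistics


def simulate_experiment(n=10_000, true_effect=2.0, seed=0):
    """Simulate a randomized experiment under a constant treatment effect."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        y0 = rng.gauss(10.0, 3.0)   # potential outcome without treatment
        y1 = y0 + true_effect       # potential outcome with treatment
        if rng.random() < 0.5:      # random assignment to treatment
            treated.append(y1)      # only y1 is observed for this unit
        else:
            control.append(y0)      # only y0 is observed for this unit
    # Difference in means between the two randomized groups.
    return statistics.mean(treated) - statistics.mean(control)


estimate = simulate_experiment()
print(f"estimated average treatment effect: {estimate:.2f}")
```

Because assignment is random, the two groups have the same distribution of untreated outcomes on average, so the estimate should fall close to the assumed effect of 2.0 even though no unit's individual effect is ever observed.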
 
==Empirical evidence that randomization makes a difference==
Empirically, differences between randomized and non-randomized studies,<ref>{{cite journal| doi=10.1002/14651858.MR000034.pub2|vauthors=Anglemyer A, Horvath HT, Bero L | title=Healthcare outcomes assessed with observational study designs compared with those assessed in randomized trials| journal=Cochrane Database Syst Rev|date=April 2014| pmid=24782322| volume=2014|issue=4 | pages=MR000034| pmc=8191367}}</ref>{{Update inline|reason=Updated version https://www.ncbi.nlm.nih.gov/pubmed/38174786|date = February 2024}} and between adequately and inadequately randomized trials, have been difficult to detect.<ref>{{cite journal| doi=10.1002/14651858.MR000012.pub3| vauthors=Odgaard-Jensen J, Vist G, etal |title=Randomisation to protect against selection bias in healthcare trials.| journal=Cochrane Database Syst Rev| date=April 2011| volume=2015 |pmid=21491415|pages=MR000012| issue=4| pmc=7150228}}</ref><ref>{{cite journal| doi=10.1186/1745-6215-15-480|vauthors=Howick J, Mebius A |title=In search of justification for the unpredictability paradox| journal=Trials| year=2014| volume=15| pmid=25490908| article-number=480| pmc=4295227 |doi-access=free }}</ref>
 
== Directed acyclic graph (DAG) explanation of randomization ==
Randomization is the cornerstone of many scientific claims. Randomizing the treatment means that confounding factors can be eliminated. Suppose we study the effect of '''A''' on '''B'''. There may be many unobservables '''U''' that affect '''B''' and confound the estimate of that effect. To explain these kinds of issues, statisticians and econometricians now use [[directed acyclic graph]]s.{{Needs update|date=July 2024}}
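
The role of '''U''' can be sketched with a simulation that compares an observational setting, in which '''U''' influences both '''A''' and '''B''', with a randomized one, in which '''A''' is assigned by a coin flip and the arrow from '''U''' to '''A''' is removed. The linear model, the coefficients, and the function name <code>diff_in_means</code> are assumptions chosen purely for illustration.

```python
import random
import statistics


def diff_in_means(randomize, n=20_000, effect=1.0, seed=1):
    """Estimate the effect of A on B as a difference in group means."""
    rng = random.Random(seed)
    b_treated, b_control = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                  # unobserved factor U
        if randomize:
            a = rng.random() < 0.5               # A assigned by coin flip
        else:
            a = u + rng.gauss(0.0, 1.0) > 0.0    # U influences A (U -> A)
        # B depends on A (the effect of interest) and on U (U -> B).
        b = effect * a + 2.0 * u + rng.gauss(0.0, 1.0)
        (b_treated if a else b_control).append(b)
    return statistics.mean(b_treated) - statistics.mean(b_control)


observational = diff_in_means(randomize=False)
randomized = diff_in_means(randomize=True)
print(f"observational: {observational:.2f}, randomized: {randomized:.2f}")
```

In this sketch the observational estimate overstates the assumed effect of 1.0 because units with high '''U''' are both more likely to be treated and more likely to have high '''B''', while the randomized estimate stays close to 1.0.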
 
==See also==
*[[A/B testing]]
*[[Allocation concealment]]
*[[Random assignment]]
*[[Randomized block design]]
Line 135 ⟶ 141:
==References==
{{Reflist}}
* {{cite book |author1 = Caliński, Tadeusz
|author2 = Kageyama, Sanpei
|name-list-style = amp
|title = Block designs: A Randomization approach, Volume '''I''': Analysis
|series = Lecture Notes in Statistics
|volume = 150
|publisher = Springer-Verlag
|location = New York
|year = 2000
|isbn = 978-0-387-98578-7
|url-access = registration
|url = https://archive.org/details/blockdesignsrand0002cali
}}
* {{cite book |author1 = Caliński, Tadeusz
|author2 = Kageyama, Sanpei
|name-list-style = amp
|title = Block designs: A Randomization approach, Volume '''II''': Design
|series = Lecture Notes in Statistics
|volume = 170
|publisher = Springer-Verlag
|location = New York
|year = 2003
|isbn = 978-0-387-95470-7
|url-access = registration
|url = https://archive.org/details/blockdesignsrand0002cali
}}
* {{cite journal|doi=10.1086/354775|first=Ian |last=Hacking| author-link=Ian Hacking | title=Telepathy: Origins of Randomization in Experimental Design|journal=[[Isis (journal)|Isis]]|issue=3|volume=79|date=September 1988 |pages=427–451| mr = 1013489| jstor=234674|s2cid=52201011 }}
* {{cite book| last1=Hinkelmann| first1=Klaus| last2=Kempthorne| first2=Oscar| year=2008| title=Design and Analysis of Experiments, Volume I: Introduction to Experimental Design| url=https://books.google.com/books?id=T3wWj2kVYZgC&printsec=frontcover| edition=Second| publisher= Wiley | isbn=978-0-471-72756-9 |mr=2363107 |author-link2=Oscar Kempthorne}}
* {{cite book| last=Kempthorne|first=Oscar |chapter=Intervention experiments, randomization and inference|title=Current Issues in Statistical Inference&mdash;Essays in Honor of D. Basu | editor=Malay Ghosh and Pramod K. Pathak | pages=13&ndash;31 | publisher=Institute for Mathematical Statistics |location=Hayward, CA | chapter-url=http://projecteuclid.org/euclid.lnms/1215458836 | doi=10.1214/lnms/1215458836 | mr=1194407|author-link=Oscar Kempthorne|series=Institute of Mathematical Statistics Lecture Notes - Monograph Series |year=1992 |isbn=978-0-940600-24-9 }}
 
{{Experimental design}}