Randomized experiment: Difference between revisions


Revision as of 06:07, 4 December 2021 by Citation bot (Alter: volume, url. URLs might have been anonymized. Add: volume, s2cid. Suggested by AManWithNoPlan | #UCB_webform 534/2199)
Latest revision as of 20:48, 19 August 2025 by Cosmia Nebula (m →History)
(11 intermediate revisions by 6 users not shown)
Line 10:
==Online randomized controlled experiments==
Web sites can run randomized controlled experiments<ref>{{cite book
~~<ref>{{cite book~~
| last = Kohavi
| first = Ron
Line 47 ⟶ 46:
* Logging: user interactions can be logged reliably.
* Number of users: large sites, such as Amazon, Bing/Microsoft, and Google, run experiments, each with over a million users.
* Number of concurrent experiments: large sites run tens of overlapping, or concurrent, experiments.<ref name="ExPScale">{{cite ~~conference~~book
| last = Kohavi | first = Ron |author2=Deng Alex |author3=Frasca Brian |author4=Walker Toby |author5=Xu Ya |author6= Nils Pohlmann
| ~~journal~~title = Proceedings of the 19th ACM SIGKDD ~~Conference~~international conference on Knowledge ~~Discovery~~discovery and ~~Data~~data ~~Mining~~mining
~~| title = Online Controlled Experiments at Large Scale~~
| chapter = Online controlled experiments at large scale
| ~~year~~date = 2013
| volume = 19
| pages = 1168–1176
| publisher = ACM
| location = Chicago, Illinois, USA
| doi = 10.1145/2487575.2488217
| isbn = 9781450321747
| s2cid = 13224883
}}</ref>
* Robots, whether [[web crawlers]] from valid sources or malicious [[internet bots]].{{clarify|reason=what about them? do they affect the reliability of the results?|date=May 2019}}
* Ability to ramp up experiments from low percentages to higher percentages.
* Speed / performance has a significant impact on key metrics.<ref name="surveyarticle" /><ref name="ExPRulesOfThumb"> {{cite ~~conference~~book
| last = Kohavi | first = Ron | author2=Deng Alex |author3=Longbotham Roger |author4=Xu Ya
| ~~journal~~title = Proceedings of the 20th ACM SIGKDD ~~Conference~~international conference on Knowledge ~~Discovery~~discovery and ~~Data~~data ~~Mining~~mining
| ~~title~~chapter = Seven ~~Rules~~rules of ~~Thumb~~thumb for ~~Web~~web ~~Site~~site ~~Experimenters~~experimenters
| ~~year~~date = 2014
~~| url = http://www.exp-platform.com/Pages/SevenRulesofThumbforWebSiteExperimenters.aspx~~
| chapter-url = http://www.exp-platform.com/Pages/SevenRulesofThumbforWebSiteExperimenters.aspx
| volume = 20
| pages = 1857–1866
| publisher = ACM
| location = New York, New York, USA
| doi = 10.1145/2623330.2623341
| isbn = 9781450329569
| s2cid = 207214362
}}</ref>
* Ability to use the pre-experiment period as an A/A test to reduce variance.<ref name="cuped">{{cite conference
Line 90 ⟶ 93:
{{main|History of experiments}}
A controlled experiment appears to have been suggested in the Old Testament's [[Book of Daniel]]. King Nebuchadnezzar proposed that some Israelites eat "a daily amount of food and wine from the king's table." Daniel preferred a [[Vegetarian cuisine|vegetarian]] diet, but the official was concerned that the king would "see you looking worse than the other young men your age? The king would then have my head because of you." Daniel then proposed the following controlled experiment: "Test your servants for ten days. Give us nothing but vegetables to eat and water to drink. Then compare our appearance with that of the young men who eat the royal food, and treat your servants in accordance with what you see".
([[Daniel 1~~, 12– 13~~]]:12–13).<ref>{{cite journal
| last = Neuhauser
| first = D
Line 119 ⟶ 122:
{{Expand section|date=September 2012}}
The [[Rubin Causal Model]] provides a common way to describe a randomized experiment. While the Rubin Causal Model provides a framework for defining the causal parameters (i.e., the effects of a randomized treatment on an outcome), the analysis of experiments can take a number of forms. The model assumes that there are two potential outcomes for each unit in the study: the outcome if the unit receives the treatment and the outcome if the unit does not receive the treatment. The difference between these two potential outcomes is known as the treatment effect, which is the causal effect of the treatment on the outcome. Most commonly, randomized experiments are analyzed using [[ANOVA]], [[Student's t-test]], [[regression analysis]], or a similar [[Statistical hypothesis testing|statistical test]].

The model also accounts for potential confounding factors, which are factors that could affect both the treatment and the outcome. By controlling for these confounding factors, the model helps to ensure that any observed treatment effect is truly causal and not simply the result of other factors that are correlated with both the treatment and the outcome.

The Rubin Causal Model is a useful framework for understanding how to estimate the causal effect of the treatment, even when there are confounding variables that may affect the outcome. The model specifies that the causal effect of the treatment is the difference between the outcomes that would have been observed for each individual with the treatment and without it. In practice, it is not possible to observe both potential outcomes for the same individual, so statistical methods are used to estimate the causal effect using data from the experiment.
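
The potential-outcomes description above can be illustrated with a small simulation. The following is a minimal sketch in Python and is not taken from the cited sources: the sample size, the normal outcome distribution, the constant unit-level effect of 2.0, and the function name `simulate_experiment` are all assumptions chosen for illustration. Each simulated unit has two potential outcomes, only one of which is "observed" after a coin-flip assignment, and the difference in group means is used as the estimate of the average treatment effect.

```python
import random
import statistics


def simulate_experiment(n_units=10_000, true_effect=2.0, seed=0):
    """Simulate potential outcomes under a coin-flip treatment assignment.

    All numbers are illustrative assumptions, not data from any study.
    Returns (true average effect, difference-in-means estimate).
    """
    rng = random.Random(seed)
    treated_obs, control_obs, unit_effects = [], [], []
    for _ in range(n_units):
        y0 = rng.gauss(10.0, 3.0)      # potential outcome without treatment
        y1 = y0 + true_effect          # potential outcome with treatment
        unit_effects.append(y1 - y0)   # unit-level effect, unobservable in practice
        if rng.random() < 0.5:         # randomized assignment
            treated_obs.append(y1)     # only y1 is observed for treated units
        else:
            control_obs.append(y0)     # only y0 is observed for control units
    true_ate = statistics.fmean(unit_effects)
    estimate = statistics.fmean(treated_obs) - statistics.fmean(control_obs)
    return true_ate, estimate


true_ate, estimate = simulate_experiment()
print(f"true average effect: {true_ate:.3f}")
print(f"difference in means: {estimate:.3f}")
```

Because assignment is independent of the potential outcomes, the difference in means should land close to the true average effect in this sketch. A real analysis would also report uncertainty, for example with one of the statistical tests mentioned above.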
==Empirical evidence that randomization makes a difference==
Empirically, differences between randomized and non-randomized studies,<ref>{{cite journal| doi=10.1002/14651858.MR000034.pub2|vauthors=Anglemyer A, Horvath HT, Bero L | title=Healthcare outcomes assessed with observational study designs compared with those assessed in randomized trials| journal=Cochrane Database Syst Rev|date=April 2014| pmid=24782322| volume=2014|issue=4 | pages=MR000034| pmc=8191367}}</ref>{{Update inline|reason=Updated version https://www.ncbi.nlm.nih.gov/pubmed/38174786|date = February 2024}} and between adequately and inadequately randomized trials, have been difficult to detect.<ref>{{cite journal| doi=10.1002/14651858.MR000012.pub3| vauthors=Odgaard-Jensen J, Vist G, etal |title=Randomisation to protect against selection bias in healthcare trials.| journal=Cochrane Database Syst Rev| date=April 2011| volume=2015 |pmid=21491415|pages=MR000012| issue=4| pmc=7150228}}</ref><ref>{{cite journal| doi=10.1186/1745-6215-15-480|vauthors=Howick J, Mebius A |title=In search of justification for the unpredictability paradox| journal=Trials| year=2014| volume=15| pmid=25490908| ~~pages~~article-number=480| pmc=4295227 |doi-access=free }}</ref>

== Directed acyclic graph (DAG) explanation of randomization ==
Randomization is the cornerstone of many scientific claims. Randomizing the treatment means that confounding factors can be eliminated. Suppose we study the effect of '''A''' on '''B'''. Many unobservables '''U''' may also affect '''B''' and confound our estimate of that effect. To explain these kinds of issues, statisticians and econometricians nowadays use [[directed acyclic graph]]s.{{Needs update|date=July 2024}}

==See also==