Revision as of 09:05, 22 September 2021 by Shiznick: Added a hatnote to enable readers to access "concentration matrix" if they don't already know what it means -- especially useful if they happen to search for "concentration" and "statistics" and ended up here (like I just did)
Revision as of 21:29, 26 August 2023 by HeyElliott: →Dirichlet distribution: MOS:NOTE (Tag: 2017 wikitext editor)
Line 9:
In the case of multivariate Dirichlet distributions, there is some confusion over how to define the concentration parameter. In the topic modelling literature, it is often defined as the sum of the individual Dirichlet parameters.<ref>{{Cite conference|last=Wallach|first=Hanna M.|author-link=Hanna Wallach|author2=Iain Murray|author3=Ruslan Salakhutdinov|author4=David Mimno|date=2009|title=Evaluation methods for topic models|series=ICML '09|location=New York, NY, USA|publisher=ACM|pages=1105–1112|doi=10.1145/1553374.1553515|isbn=978-1-60558-516-1|book-title=Proceedings of the 26th Annual International Conference on Machine Learning}}</ref> When discussing symmetric Dirichlet distributions (where the parameters are the same for all dimensions), it is often defined instead as the value of the single Dirichlet parameter used in all dimensions.{{Citation needed|date=November 2011}} This second definition is smaller than the first by a factor of ''k'', the dimension of the distribution. A concentration parameter of 1 (or ''k'', by the definition used in the topic modelling literature) results in all sets of probabilities being equally likely, i.e., in this case the Dirichlet distribution of dimension ''k'' is equivalent to a uniform distribution over a [[Standard simplex|(''k''−1)-dimensional simplex]]. This is ''not'' the same as what happens when the concentration parameter tends towards infinity. In the former case, all resulting distributions are equally likely (the distribution over distributions is uniform). In the latter case, only near-uniform distributions are likely (the distribution over distributions is highly peaked around the uniform distribution).
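The behaviour of a symmetric Dirichlet distribution under different concentration parameters can be checked numerically. The following is a minimal sketch, not part of the cited sources, assuming NumPy is available; the helper names <code>symmetric_dirichlet_samples</code> and <code>mean_max_component</code>, the dimension ''k'' = 5, and the three example values of the per-component parameter are illustrative choices only. It prints both definitions of the concentration parameter (the per-component value and the sum over all ''k'' components) together with the average size of the largest component of each sampled probability vector.

```python
import numpy as np


def symmetric_dirichlet_samples(alpha, k, n, seed=0):
    """Draw n probability vectors from a symmetric Dirichlet distribution.

    alpha is the single per-component parameter; the sum-based
    concentration parameter is therefore alpha * k.
    """
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.full(k, alpha), size=n)


def mean_max_component(alpha, k=5, n=2000, seed=0):
    """Average, over n samples, of the largest component of each vector."""
    samples = symmetric_dirichlet_samples(alpha, k, n, seed)
    return float(samples.max(axis=1).mean())


if __name__ == "__main__":
    k = 5
    # Illustrative values: small, 1 (uniform over the simplex), and large.
    for alpha in (0.05, 1.0, 100.0):
        print(
            f"per-component alpha={alpha:g}  "
            f"sum-based concentration={alpha * k:g}  "
            f"mean largest component={mean_max_component(alpha, k):.3f}"
        )
```

Under these assumptions, a small parameter should give a mean largest component close to 1 (mass near a corner of the simplex), a large parameter should give a value close to 1/''k'' (near-uniform vectors), and a parameter of 1 should fall in between, since every point of the simplex is then equally likely.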
Meanwhile, in the limit as the concentration parameter tends towards zero, only distributions with nearly all mass concentrated on one of their components are likely (the distribution over distributions is highly peaked around the ''k'' possible [[Dirac delta distribution]]s, each centered on one of the components, or in terms of the (''k''−1)-dimensional simplex, is highly peaked at the corners of the simplex).

==Sparse prior==

Concentration parameter: Difference between revisions