Concentration parameter: Difference between revisions

Content deleted Content added
sections
order
Line 11:
==Sparse prior==
As an example of where a sparse prior (concentration parameter much less than 1) is called for, consider a [[topic model]], which is used to learn the topics discussed in a set of documents, where each "topic" is described by a [[categorical distribution]] over a vocabulary of words. A typical vocabulary might have 100,000 words, leading to a 100,000-dimensional categorical distribution. The [[prior distribution]] for the parameters of the categorical distribution would likely be a [[symmetric Dirichlet distribution]]. However, a coherent topic might have only a few hundred words with any significant probability mass. Accordingly, a reasonable setting for the concentration parameter might be 0.01 or 0.001. With a larger vocabulary of around 1,000,000 words, an even smaller value, e.g. 0.0001, might be appropriate.
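
The effect of a small concentration parameter can be illustrated with a short simulation. The following is a minimal sketch, not a topic-model implementation. It assumes the concentration parameter is the single value shared by every component of the symmetric Dirichlet, uses a 10,000-word vocabulary (smaller than the one described above, purely to keep the run fast), and treats "significant probability mass" as the number of top words needed to cover 95% of the total, which is an arbitrary threshold chosen for illustration. The function names are hypothetical.

```python
import random


def sample_symmetric_dirichlet(k, alpha, rng):
    """Draw one point from a k-dimensional symmetric Dirichlet(alpha)."""
    # Normalizing independent Gamma(alpha, 1) draws yields a Dirichlet sample.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def words_covering(probs, mass=0.95):
    """Smallest number of highest-probability entries whose sum reaches `mass`."""
    running = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        running += p
        if running >= mass:
            return count
    return len(probs)


rng = random.Random(0)   # fixed seed so the run is repeatable
vocab_size = 10_000      # assumption: reduced vocabulary for a quick demo

# Concentration 1.0 is the uniform prior over the simplex; 0.01 is a sparse prior.
dense_topic = sample_symmetric_dirichlet(vocab_size, 1.0, rng)
sparse_topic = sample_symmetric_dirichlet(vocab_size, 0.01, rng)

dense_count = words_covering(dense_topic)
sparse_count = words_covering(sparse_topic)

print("words covering 95% of mass, concentration 1.0: ", dense_count)
print("words covering 95% of mass, concentration 0.01:", sparse_count)
```

Under these assumptions, the draw with concentration 0.01 should place nearly all of its mass on a small fraction of the vocabulary, while the draw with concentration 1.0 spreads its mass over most of the words.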
 
== References ==
{{reflist}}
 
==See also==
Line 21 ⟶ 18:
* [[Location parameter]]
* [[Scale parameter]]
 
== References ==
{{reflist}}
 
 
[[Category:Statistical parameters]]