==Sparse prior==
As an example of a case where a sparse prior (concentration parameter much less than 1) is called for, consider a [[topic model]], which is used to learn the topics discussed in a set of documents, where each "topic" is described by a [[categorical distribution]] over a vocabulary of words. A typical vocabulary might have 100,000 words, leading to a 100,000-dimensional categorical distribution. The [[prior distribution]] for the parameters of the categorical distribution would likely be a [[symmetric Dirichlet distribution]]. However, a coherent topic might assign significant probability mass to only a few hundred words, so a reasonable setting for the concentration parameter might be 0.01 or 0.001. With a larger vocabulary of around 1,000,000 words, an even smaller value, e.g. 0.0001, might be appropriate.
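The sparsity induced by a small concentration parameter can be illustrated with a short simulation (a sketch using NumPy; the vocabulary size, seed, and 99% threshold are illustrative choices, not part of any particular topic model):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 1000  # toy vocabulary; real topic models may use ~100,000 words

# Draw one topic from a symmetric Dirichlet at several concentration values
for alpha in (1.0, 0.1, 0.01):
    topic = rng.dirichlet(np.full(vocab_size, alpha))
    # Count how many words are needed to cover 99% of the probability mass
    sorted_p = np.sort(topic)[::-1]
    n_top = int(np.searchsorted(np.cumsum(sorted_p), 0.99)) + 1
    print(f"alpha={alpha}: {n_top} of {vocab_size} words hold 99% of the mass")
```

As the concentration parameter shrinks below 1, the sampled distribution concentrates its mass on ever fewer components, which is the behavior wanted for a prior over sparse topics.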
==See also==
* [[Location parameter]]
* [[Scale parameter]]
== References ==
{{reflist}}
[[Category:Statistical parameters]]