Extend the previous stub. I started to detail the functioning of Felsenstein's pruning algorithm. It still needs examples, illustrations and, of course, the algorithm itself. |
m fix spacing around math (via WP:JWB) |
(5 intermediate revisions by 5 users not shown) | |||
Line 4:
The algorithm is often used as a subroutine in a search for a [[maximum likelihood]] estimate for an evolutionary tree. Further, it can be used in a hypothesis test for whether evolutionary rates are constant (by using [[likelihood ratio test]]s). It can also be used to provide error estimates for the parameters describing an evolutionary tree.
[[File:Tree_exemple.png|thumb|A simple phylogenetic tree example made from arbitrary data D]]
The '''likelihood''' of a tree <math>T</math> is, by definition, the probability of observing certain data <math>D</math> (<math>D</math> being a nucleotide sequence
Here is an example of an evolutionary tree on arbitrary sequence data <math>D</math>:
This is a key quantity and is often quite complicated to compute. To simplify the computation, Felsenstein and his colleagues made several assumptions that are still widely used today. The '''main assumption''' is that '''mutations between DNA sites are
[[File:Tree_partial_exemple.png|thumb|The same tree built from D1, which consists of the first DNA sites of D]]
<math>
P(D|T) = \prod_{s=1}^{n} {P(D_s|T)}
</math>
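As a minimal sketch of this factorization, assume the per-site likelihoods <math>P(D_s|T)</math> have already been computed (the numbers below are made up for illustration, and the function names are hypothetical). The product can be evaluated directly or, as is common in practice to avoid floating-point underflow on long alignments, as a sum of logarithms:

```python
import math

def tree_likelihood(site_likelihoods):
    """Product of per-site likelihoods P(D_s | T), assuming independent sites."""
    return math.prod(site_likelihoods)

def tree_log_likelihood(site_likelihoods):
    """The same quantity on the log scale: a sum of per-site log-likelihoods."""
    return sum(math.log(p) for p in site_likelihoods)

# Hypothetical per-site likelihoods for a 3-site alignment (illustrative values only).
site_likelihoods = [0.02, 0.05, 0.01]
likelihood = tree_likelihood(site_likelihoods)
log_likelihood = tree_log_likelihood(site_likelihoods)
```

Both functions describe the same likelihood; the log form is simply more robust numerically when the number of sites <math>n</math> is large.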
If I reuse the
The '''second assumption''' concerns the [[Substitution model|models of
Felsenstein proposed decomposing the computation even further by using "partial likelihoods" to compute <math>
P( D_s | T)
</math>. Here is how it works.
Assume we are at a node <math> k
</math> of the tree <math>
Line 39 ⟶ 42:
<math> w_k (X) = ( \sum_Y p_{X \rightarrow Y} \centerdot w_i (Y)) \centerdot ( \sum_Z p_{X \rightarrow Z} \centerdot w_j (Z)) </math>
where <math> Y </math> and <math> Z </math> are also DNA bases. <math> p_{ X\rightarrow Y} </math> is the transition probability from nucleotide <math>X</math> to nucleotide <math> Y </math> (and likewise for <math> p_{X \rightarrow Z} </math>). <math> w_i(Y) </math> is the partial likelihood of the daughter node <math>
i
</math>, evaluated at nucleotide <math> Y </math> (and likewise for <math>
Line 49 ⟶ 52:
<math> P(D_s|T) = \sum_X p_X \centerdot w_r (X) </math>
After doing so for every site <math>s</math>, one obtains the likelihood of the whole evolutionary tree by multiplying together the per-site "sublikelihoods" <math>P(D_s|T)</math>.
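The recursion above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the tree encoding (a leaf is a name, an internal node is a pair of (child, branch length) tuples), the [[Jukes–Cantor]] substitution model used for <math> p_{X \rightarrow Y} </math>, the reading of <math> p_X </math> as a base frequency at the root, and all function names are choices made for this example. Leaves are initialised in the usual way, with partial likelihood 1 for the observed base and 0 for the others.

```python
import math

BASES = "ACGT"

def jc69_matrix(t):
    """Substitution probabilities p_{X->Y} under Jukes-Cantor (an assumed model)
    for a branch of length t, in expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return {x: {y: (same if x == y else diff) for y in BASES} for x in BASES}

def partial_likelihoods(node, observed):
    """Return w_k(X) for every base X; `observed` maps leaf name -> base at this site."""
    if isinstance(node, str):
        # Leaf: 1 for the observed base, 0 for every other base.
        return {x: 1.0 if observed[node] == x else 0.0 for x in BASES}
    (child_i, t_i), (child_j, t_j) = node
    w_i = partial_likelihoods(child_i, observed)
    w_j = partial_likelihoods(child_j, observed)
    p_i, p_j = jc69_matrix(t_i), jc69_matrix(t_j)
    # w_k(X) = (sum_Y p_{X->Y} w_i(Y)) * (sum_Z p_{X->Z} w_j(Z))
    return {
        x: sum(p_i[x][y] * w_i[y] for y in BASES)
        * sum(p_j[x][z] * w_j[z] for z in BASES)
        for x in BASES
    }

def site_likelihood(root, observed, root_freqs):
    """P(D_s | T) = sum_X p_X * w_r(X), with p_X taken as root base frequencies."""
    w_r = partial_likelihoods(root, observed)
    return sum(root_freqs[x] * w_r[x] for x in BASES)

def tree_likelihood(root, alignment, root_freqs):
    """Multiply the per-site likelihoods over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return math.prod(
        site_likelihood(root, {name: seq[s] for name, seq in alignment.items()}, root_freqs)
        for s in range(n_sites)
    )

# Hypothetical three-taxon example; names, sequences and branch lengths are made up.
inner = (("human", 0.1), ("chimp", 0.1))
tree = ((inner, 0.05), ("gorilla", 0.15))
alignment = {"human": "ACG", "chimp": "ACT", "gorilla": "ATG"}
uniform = {x: 0.25 for x in BASES}
likelihood = tree_likelihood(tree, alignment, uniform)
```

Each node is visited once per site, and each visit costs a fixed number of operations over the four bases, so the work grows linearly with the number of nodes and sites rather than with the number of possible ancestral state assignments.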
== Algorithm ==
==References==