Sequence data can reveal migration and transmission patterns (i.e. who infected whom).
Genetic samples yield trees: information about events ancestral to samples.
Can use a chemical reaction notation to describe rates and effects of possible events:
The parameters $\lambda$ and $\mu$ are rates (probabilities per unit time) at which any given individual experiences a birth or a death, respectively.
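As a sketch of the reaction notation, assuming $X$ stands for a single individual (a label introduced here, not taken from the original), the birth and death events would presumably be written as:
\begin{equation*}
X \xrightarrow{\lambda} 2X, \qquad X \xrightarrow{\mu} \emptyset
\end{equation*}
The label on each arrow is the per-individual rate of the event, and the right-hand side is what the individual is replaced by.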
Additionally, the model allows each surviving lineage at the end of the process (present day) to be sampled with probability $\rho$.
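The process described above can be illustrated with a minimal forward simulation. This is only a sketch under stated assumptions: it uses the standard Gillespie algorithm, starts from a single individual, tracks population size only (not the tree), and assumes $\lambda+\mu>0$. The function name `simulate_birth_death` and the `max_pop` safety cap are hypothetical.

```python
import random

def simulate_birth_death(lam, mu, rho, t_end, rng=None, max_pop=100_000):
    """Gillespie sketch of a linear birth-death process with present-day sampling.

    Starts from one individual at time 0 and runs until t_end (or extinction,
    or the hypothetical max_pop cap). Returns (survivors, sampled).
    Assumes lam + mu > 0.
    """
    rng = rng or random.Random()
    t, n = 0.0, 1
    while 0 < n < max_pop:
        # Each of the n individuals gives birth at rate lam and dies at rate mu.
        t += rng.expovariate(n * (lam + mu))
        if t >= t_end:
            break
        if rng.random() < lam / (lam + mu):
            n += 1  # birth: X -> 2X
        else:
            n -= 1  # death: X -> nothing
    # Each surviving lineage is sampled independently with probability rho.
    sampled = sum(rng.random() < rho for _ in range(n))
    return n, sampled

if __name__ == "__main__":
    print(simulate_birth_death(lam=1.0, mu=0.5, rho=0.3, t_end=5.0,
                               rng=random.Random(1)))
```

Repeating the simulation over many seeds gives an impression of how often the process goes extinct before the present and how many lineages end up in the sample.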
Gives rise to differential equations which can be solved to obtain the following tree probability:
\begin{equation*}
P(T|\lambda,\mu,\psi,\rho,t_0) = g(t_0) = \lambda^{n+m-1}\psi^{k+m}(4\rho)^n\prod_{i=0}^{n+m-1}\frac{1}{q(x_i)}\prod_{i=1}^{m}p_0(y_i)q(y_i)
\end{equation*}
where $q(t)=4\rho/g(t)$ and $\psi$ is the per-lineage rate of sampling through time. [Stadler, J. Theor. Biol., 2010]
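The tree probability can be evaluated numerically once $q(t)$ and $p_0(t)$ are available in closed form. The sketch below uses the expressions for $c_1$, $c_2$, $q(t)$ and $p_0(t)$ as I recall them from Stadler (2010); they are not given in these notes and should be checked against the paper before use. It also assumes that times are measured backwards from the present, that $x_0$ is the origin time $t_0$, that $n$ counts sampled extant tips, $m$ sampled extinct tips and $k$ sampled ancestors, and that $c_1>0$. All function names are hypothetical.

```python
import math

def _constants(lam, mu, psi, rho):
    # Assumes c1 > 0 (excludes the degenerate case lam == mu with psi == 0).
    c1 = math.sqrt((lam - mu - psi) ** 2 + 4.0 * lam * psi)
    c2 = -(lam - mu - 2.0 * lam * rho - psi) / c1
    return c1, c2

def q(t, lam, mu, psi, rho):
    c1, c2 = _constants(lam, mu, psi, rho)
    return (2.0 * (1.0 - c2 ** 2)
            + math.exp(-c1 * t) * (1.0 - c2) ** 2
            + math.exp(c1 * t) * (1.0 + c2) ** 2)

def p0(t, lam, mu, psi, rho):
    """Probability that a lineage alive at time t leaves no sampled descendants."""
    c1, c2 = _constants(lam, mu, psi, rho)
    a = math.exp(-c1 * t) * (1.0 - c2)
    b = 1.0 + c2
    return (lam + mu + psi + c1 * (a - b) / (a + b)) / (2.0 * lam)

def _xlog(count, value):
    # count * log(value), treating a zero exponent as contributing nothing.
    return count * math.log(value) if count else 0.0

def log_tree_density(x, y, k, n, lam, mu, psi, rho):
    """Log of the tree probability above.

    x: the n + m times x_0, ..., x_{n+m-1} (x_0 is the origin time t_0)
    y: the m sampling times of extinct tips
    """
    m = len(y)
    if len(x) != n + m:
        raise ValueError("expected n + m entries in x")
    pars = (lam, mu, psi, rho)
    logp = _xlog(n + m - 1, lam) + _xlog(k + m, psi) + _xlog(n, 4.0 * rho)
    logp -= sum(math.log(q(xi, *pars)) for xi in x)
    logp += sum(math.log(p0(yi, *pars) * q(yi, *pars)) for yi in y)
    return logp
```

Two sanity checks follow from these expressions: $q(0)=4$ and $p_0(0)=1-\rho$. In the pure-birth case ($\mu=\psi=0$, $\rho=1$) a tree consisting of a single sampled lineage of age $t_0$ has probability $e^{-\lambda t_0}$, the chance of no birth over that interval.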
There are several distinct parameterizations besides the basic $\lambda,\mu,\psi$ parameterization, including:
Probability of coalescence in generation $i-m$: $P(m)=(1-p_{\textrm{coal}})^{m-1}p_{\textrm{coal}}$
Continuous time limit (large $N$, small $g$): $P(t)=e^{-\frac{1}{Ng}t}\frac{1}{Ng}$
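The relationship between the discrete and continuous forms can be checked numerically. The sketch below assumes two lineages with $p_{\textrm{coal}}=1/N$ and that $g$ is the generation time, so that generation $m$ corresponds to time $t=mg$; the function names are hypothetical.

```python
import math

def p_coal_generation(m, p_coal):
    """Geometric probability that coalescence happens exactly m generations back."""
    return (1.0 - p_coal) ** (m - 1) * p_coal

def coal_density(t, N, g):
    """Exponential density of the coalescence time for two lineages."""
    rate = 1.0 / (N * g)
    return rate * math.exp(-rate * t)

if __name__ == "__main__":
    N, g, m = 10_000, 1.0, 5_000
    discrete = p_coal_generation(m, 1.0 / N) / g  # probability per unit time
    continuous = coal_density(m * g, N, g)
    print(discrete, continuous)
```

For large $N$ the two values agree closely, because $(1-1/N)^{m-1}\approx e^{-m/N}$.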
Question: How can this be generalized to $k$ samples?
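Answer: there are $\binom{k}{2}$ pairs of lineages, each coalescing with probability $\frac{1}{N}$ per generation, so for $k\ll N$: $p_{\text{coal}}\approx\frac{k(k-1)}{2}\frac{1}{N}=\binom{k}{2}\frac{1}{N}$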
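The $k$-sample result can be turned into a small simulation of coalescence times. This is a sketch in the continuous-time limit, assuming a constant population size $N$, generation time $g$, and that while $j$ lineages remain the waiting time to the next coalescence is exponential with rate $\binom{j}{2}/(Ng)$. The function names are hypothetical.

```python
import random
from math import comb

def simulate_coalescent_times(k, N, g=1.0, rng=None):
    """Return the k - 1 coalescence times (measured back from the present)."""
    rng = rng or random.Random()
    times, t = [], 0.0
    for lineages in range(k, 1, -1):
        # Any of the comb(lineages, 2) pairs may coalesce next.
        rate = comb(lineages, 2) / (N * g)
        t += rng.expovariate(rate)
        times.append(t)
    return times

def expected_tmrca(k, N, g=1.0):
    """Expected time to the most recent common ancestor of k samples."""
    return sum(N * g / comb(j, 2) for j in range(2, k + 1))

if __name__ == "__main__":
    print(simulate_coalescent_times(5, N=1000, rng=random.Random(0)))
    print(expected_tmrca(5, N=1000))
```

The expected waiting times sum to $2Ng(1-1/k)$, so the final coalescence between the last two lineages accounts for at least half of the expected tree height.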
Birth-death model can infer effective reproductive number dynamics:
Coalescent model can infer effective population size dynamics: