Investigating seasonal NZ influenza dynamics using
genomic and mobile phone data
Tim Vaughan, André Lichtsteiner,
David Welch, Alexei Drummond
Centre for Computational Evolution
The University of Auckland
21st Annual New Zealand Phylogenomics Meeting
12-17 February, 2017
Waiheke Island
Influenza
- RNA virus with a 14 kb genome
consisting of 8 distinct segments.
- Evolves $\sim 10^6$ times faster than the human
genome: a measurably evolving pathogen.
- Seasonal epidemics occur
repeatedly around the world in a predictable
order.
Perfect system for the application of phylodynamic methods.
Genomic data
- ESR have sequenced 141 full H3N2 influenza genomes.
- Sequences from isolates sampled during 2012, 2013 and 2014 seasons.
- Samples distributed across 18 of the 20 New Zealand District Health Boards (DHBs).
Host movement data
- QRIOUS (qrious.co.nz)
have generously provided anonymized and
aggregated information on the movement of IMSIs (mobile subscriber identities)
within NZ.
- Data provided as a matrix specifying the fraction of time individuals identified as having
a home in a particular area spend in other areas.
- Locations (home and away) are resolved only to the 30 "regional tourism organizations" (RTOs)
spread across the country.
- Although the data comes only from Spark subscribers, a correction has been applied to account
for non-uniform market share.
Host movement data
Combining host movement matrix with stochastic SIR model yields simulated
epidemics:
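The slides do not state how the movement matrix enters the SIR model, so the following is only a minimal sketch under one plausible assumption: residents of region $a$ are exposed to the infection prevalence of each region $b$ in proportion to the fraction of time `p[a][b]` they spend there. The function name `simulate_sir`, the parameters `beta` and `gamma`, and the example values are all hypothetical.

```python
import random

def simulate_sir(p, N, I0, beta, gamma, t_end, seed=0):
    """Gillespie simulation of an SIR epidemic over regions coupled by movement.

    p[a][b] -- fraction of time residents of region a spend in region b
               (each row sums to 1)
    N, I0   -- resident population and initial infected count per region
    beta    -- transmission rate (assumed identical everywhere)
    gamma   -- recovery rate (assumed > 0)
    Returns a list of (time, S, I, R) snapshots, one per event.
    """
    rng = random.Random(seed)
    n = len(N)
    S = [N[a] - I0[a] for a in range(n)]
    I = list(I0)
    R = [0] * n
    t = 0.0
    history = [(t, tuple(S), tuple(I), tuple(R))]
    # Expected number of people present in each region (constant in time).
    present_N = [sum(p[c][b] * N[c] for c in range(n)) for b in range(n)]

    while sum(I) > 0:
        # Prevalence among the people currently present in each region.
        prevalence = [
            sum(p[c][b] * I[c] for c in range(n)) / present_N[b]
            if present_N[b] > 0 else 0.0
            for b in range(n)
        ]
        # Residents of a are exposed according to where they spend their time.
        infection = [
            beta * S[a] * sum(p[a][b] * prevalence[b] for b in range(n))
            for a in range(n)
        ]
        recovery = [gamma * I[a] for a in range(n)]
        rates = infection + recovery
        t += rng.expovariate(sum(rates))  # waiting time to the next event
        if t > t_end:
            break
        event = rng.choices(range(2 * n), weights=rates)[0]
        if event < n:            # infection of a resident of region `event`
            S[event] -= 1
            I[event] += 1
        else:                    # recovery of a resident of region `event - n`
            I[event - n] -= 1
            R[event - n] += 1
        history.append((t, tuple(S), tuple(I), tuple(R)))
    return history

# Hypothetical two-region example: one initial case in region 0.
p = [[0.9, 0.1], [0.2, 0.8]]
history = simulate_sir(p, N=[200, 100], I0=[1, 0],
                       beta=0.5, gamma=0.2, t_end=50.0, seed=1)
print("events:", len(history) - 1, "final state:", history[-1])
```

Each stochastic run either dies out early or spreads from the seeded region to the other through the off-diagonal entries of `p`.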
Questions
Is there evidence for multiple
introductions during a single season?
Does population
structure influence the pathogen evolution?
If so, is this structure correlated with
what mobile phones tell us of host movements?
Phylogeography of seasonal epidemics
Potential for phylogeographic analyses
- Each genome tagged with location down to DHB-level.
- Have evaluated two distinct phylogeographic inference methods:
  - Discrete trait phylogeography ("mugration") model. [Lemey, Rambaut and Drummond, 2009]
  - Structured coalescent (SC) model. [Hudson, 1990; Notohara, 1990]
- Under the SC model, the genomic data alone contain evidence for
North Island/South Island structure.
Neither of these models deals explicitly
with the epidemiological dynamics: to do!
SC results (North/South, 2012)
SC results (North/South, 2013)
SC results (North/South, 2014)
Detecting multiple introductions
Qualitative evidence
Summary tree lineage count at the season start
is a rough estimate for the
introduction count.
Semi-quantitative evidence
Use the distribution of lineage counts at season start
across sampled trees as a proxy for the introduction
count posterior.
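As a rough illustration of the lineage-count proxy, the sketch below counts the branches of each sampled tree that span the season start and tabulates the resulting distribution. The tree representation (a list of `(parent_time, child_time)` branches in calendar time), the function names, and the example trees are all hypothetical; real posterior trees would first need to be converted to this form.

```python
from collections import Counter

def lineages_at(branches, t):
    """Number of branches that span calendar time t.

    branches -- list of (parent_time, child_time) pairs, parent earlier than child
    """
    return sum(1 for parent_time, child_time in branches
               if parent_time < t <= child_time)

def lineage_count_distribution(trees, t):
    """Relative frequency of each lineage count at time t across sampled trees."""
    counts = Counter(lineages_at(branches, t) for branches in trees)
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}

# Two made-up three-tip trees standing in for posterior samples.
tree_1 = [(2011.8, 2012.2), (2011.8, 2012.7), (2012.2, 2012.5), (2012.2, 2012.6)]
tree_2 = [(2012.1, 2012.4), (2012.1, 2012.7), (2012.4, 2012.5), (2012.4, 2012.6)]

season_start = 2012.3  # hypothetical season start date
print(lineage_count_distribution([tree_1, tree_2], season_start))
```

A lineage count above one at season start suggests that the sampled viruses descend from more than one ancestor already circulating at that time, which is why it serves only as a proxy for the number of introductions.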
North/South/World model
Use structured coalescent model to
explicitly model introductions using an additional
World deme.
Posterior for 2012 introductions
Incorporating host movement data
How does host movement data fit in?
- Structured coalescent analyses with uninformative
priors are usually limited to approximately 10 demes.
- Can use the movement data to fix the rate matrix (up to a multiplicative constant).
- Drastically reduces the computational burden of sampling the combined parameter/multi-type tree space.
Movement matrix transformation
We transform the matrix using a simple reweighting:
$$ r_{ab} = \sum_{ij} f_{ia} f_{jb} r_{ij} $$
where $f_{ia}$ is the fraction of RTO $i$ in DHB $a$.
This basis transformation introduces additional noise.
The forward-time rate matrix is then transformed into
the following backward-time migration matrix for ancestral lineages
in the SC model:
$$ m_{ab} = r_{ba}N_{b}/N_{a} $$
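The two transformations above can be sketched in a few lines of plain Python. The function names and the small example matrices are hypothetical, and setting the diagonal of the backward-time matrix to zero is an assumption of this sketch (the slides only give the off-diagonal relation).

```python
def rto_to_dhb_rates(r_rto, f):
    """Reweight an RTO-level rate matrix into the DHB basis.

    r_rto[i][j] -- forward-time movement rate from RTO i to RTO j
    f[i][a]     -- fraction of RTO i lying in DHB a
    Returns r[a][b] = sum_ij f[i][a] * f[j][b] * r_rto[i][j].
    """
    n_rto, n_dhb = len(f), len(f[0])
    return [[sum(f[i][a] * f[j][b] * r_rto[i][j]
                 for i in range(n_rto) for j in range(n_rto))
             for b in range(n_dhb)]
            for a in range(n_dhb)]

def backward_migration(r, N):
    """Backward-time migration matrix m[a][b] = r[b][a] * N[b] / N[a].

    Diagonal entries are set to zero (an assumption of this sketch).
    """
    n = len(r)
    return [[0.0 if a == b else r[b][a] * N[b] / N[a] for b in range(n)]
            for a in range(n)]

# Made-up example: 3 RTOs, 2 DHBs, with the middle RTO split evenly.
f = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
r_rto = [[0.0, 2.0, 4.0], [2.0, 0.0, 6.0], [4.0, 6.0, 0.0]]
N = [100.0, 50.0]  # hypothetical DHB population sizes

r_dhb = rto_to_dhb_rates(r_rto, f)
m = backward_migration(r_dhb, N)
print(r_dhb)
print(m)
```

Because the rate matrix is fixed only up to a multiplicative constant, a single overall scale factor applied to `m` would remain a free parameter in the analysis.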
Ancestral locations inferred using phone data
How well do host movements fit the genetic data?
The log Bayes factor (BF) can be expressed as the following definite integral:
$$\text{logBF} = \int_0^1 E_{\beta}[U]d\beta$$
where $U = \log P(D,\theta|M_1) - \log P(D,\theta|M_2)$
and $E_{\beta}[U]$ is
the expected value of $U$ under the product distribution
$$P(D,\theta|M_1)^{\beta}P(D,\theta|M_2)^{1-\beta}$$
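To make the integral concrete, here is a toy sketch in which $\theta$ takes only three values, so $E_{\beta}[U]$ can be computed exactly and the result checked against the directly computed ratio of marginal likelihoods. In a real analysis the expectations would instead be estimated by MCMC at each $\beta$. The joint probabilities `q1` and `q2`, the function names, and the trapezoidal grid are all assumptions made for illustration.

```python
import math

def expected_U(q1, q2, beta):
    """E_beta[U] under the path distribution proportional to q1^beta * q2^(1-beta).

    q1[k], q2[k] -- joint probabilities P(D, theta_k | M1) and P(D, theta_k | M2)
    """
    weights = [a ** beta * b ** (1.0 - beta) for a, b in zip(q1, q2)]
    z = sum(weights)
    return sum(w * (math.log(a) - math.log(b))
               for w, a, b in zip(weights, q1, q2)) / z

def log_bayes_factor(q1, q2, n_steps=100):
    """Trapezoidal approximation of the integral of E_beta[U] over [0, 1]."""
    values = [expected_U(q1, q2, k / n_steps) for k in range(n_steps + 1)]
    h = 1.0 / n_steps
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

# Made-up joint probabilities over three parameter values.
q1 = [0.02, 0.05, 0.01]
q2 = [0.01, 0.01, 0.03]

estimate = log_bayes_factor(q1, q2)
exact = math.log(sum(q1)) - math.log(sum(q2))  # log P(D|M1) - log P(D|M2)
print(estimate, exact)
```

The two numbers agree closely because the derivative of the log normalising constant of the path distribution with respect to $\beta$ is exactly $E_{\beta}[U]$.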
Summary
- Population structure, particularly the North/South
Island split, shapes the seasonal evolutionary dynamics of H3N2.
- There does seem to be strong evidence for multiple
introductions within a single season.
- Mobile phone-derived host movement data is well
supported as a model for national H3N2 structure.
Outlook
- Compare mobile phone model with simple distance matrix-based model.
- Include a "World" population in analysis to allow
more defensible ancestral location inferences.
- Acquire more sequences. ESR has collected
thousands of isolates as part of the CDC SHIVERS
project.
- Acquire finer-resolution host movement information,
expressed in an appropriate basis.
Acknowledgements
- Genomic data was provided by Richard Hall and Sue Huang at ESR.
- Generous donation of aggregated phone movement data: QRIOUS (qrious.co.nz).
- Summer internship students:
  - André Lichtsteiner
  - John Mlyahilu