<beast version='2.0' namespace='...'>
  <run spec="MCMC" id="mcmc" chainLength="1000000000">
    <state>
      <stateNode spec='RealParameter' id="hky.kappa">1.0</stateNode>
      <stateNode spec='RealParameter' id="popSize">1.0</stateNode>
      <stateNode spec='ClusterTree' id='tree' clusterType='upgma'>
        <taxa idref='alignment'/>
      </stateNode>
    </state>
    <distribution spec="CompoundDistribution" id="posterior">
      <distribution id="coalescent" spec="Coalescent">
        <treeIntervals spec='TreeIntervals' id='TreeIntervals'>
          <tree idref="tree"/>
        </treeIntervals>
        <populationModel spec="ConstantPopulation" id='ConstantPopulation'>
          <popSize idref="popSize"/>
        </populationModel>
      </distribution>
      <distribution spec='TreeLikelihood' id="treeLikelihood">
        <data id="alignment" dataType="nucleotide">
          <sequence taxon="human" value="AGAAAT..."/>
          <sequence taxon="chimp" value="AGAAAT..."/>
          <sequence taxon="bonobo" value="AGAAAT..."/>
        </data>
        <tree idref="tree"/>
        <siteModel spec='SiteModel' id="siteModel">
          <input name='substModel' idref='hky'/>
          <substModel spec='HKY' id="hky">
            <kappa idref='hky.kappa'/>
            <frequencies id='freqs' spec='Frequencies'>
              <data idref='alignment'/>
            </frequencies>
          </substModel>
        </siteModel>
      </distribution>
    </distribution>
    <operator id='kappaScaler' spec='ScaleOperator' scaleFactor="0.5" weight="1">
      <parameter idref="hky.kappa"/>
    </operator>
    <operator id='popSizeScaler' spec='ScaleOperator' scaleFactor="0.5" weight="1">
      <parameter idref="popSize"/>
    </operator>
    <operator spec='SubtreeSlide' weight="5" gaussian="true" size="1.0">
      <tree idref="tree"/>
    </operator>
    <logger logEvery="10000" fileName="$(filebase).log">
      <log idref="hky.kappa"/>
    </logger>
    <logger logEvery="20000" fileName="$(filebase).trees">
      <log idref="tree"/>
    </logger>
    <logger logEvery="10000">
      ...
    </logger>
  </run>
</beast>
XML file components:
<tag attributeOne="Attribute value"
     attributeTwo="Another attribute value">
  <childTag childAttribute="10"> </childTag>
  <childTag childAttribute="20"/>
  <!-- This is a "comment" -->
</tag>
There is a lot that one can say about XML, but this is all we need!
<parentInput spec="BEASTObject">
  <input1 ...> </input1>
  <input2 ...> </input2>
  ...
</parentInput>

<mcmc spec="MCMC"
      chainLength="10000000"
      storeEvery="10000"
      sampleFromPrior="true">
  ...
</mcmc>

<parentInput spec="Normal">
  <mean spec="RealParameter" value="1.0" lower="0.0" upper="5.0"/>
  <sigma spec="RealParameter" value="0.5" lower="0.0" upper="5.0"/>
</parentInput>

<state>
  <stateNode spec="RealParameter" value="1.0" id="clockRate"/>
</state>
...
<logger logEvery="1000" fileName="logfile.log">
  <log idref="clockRate"/>
</logger>

<state>
  <stateNode spec="RealParameter" value="1.0" id="clockRate"/>
</state>
...
<operator spec="ScaleOperator" parameter="@clockRate" weight="1"/>
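
<!-- Long form: the element name is the input name. -->
<parentInput spec="RealParameter" value="1.0"/>
<!-- Equivalent shorthand: reserved element name, with the input given by "name". -->
<parameter name="parentInput" value="1.0"/>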
This tutorial covers a short series of small XML-hacking exercises.
You will be given approximately 15 minutes for each exercise, after which I will present the solution.
Basic parameters of the MCMC chain and the loggers are easy to modify directly in the XML.
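As a minimal sketch of this kind of edit, the fragment below shortens the chain and logs more often. The element and attribute names (`chainLength`, `logEvery`, `fileName`) and the ids are those of the example file; the specific values are illustrative assumptions, not recommendations.

```xml
<!-- Illustrative values only: a shorter chain than in the example file. -->
<run spec="MCMC" id="mcmc" chainLength="5000000">
  ...
  <!-- Sample the parameters every 1000 steps instead of every 10000. -->
  <logger logEvery="1000" fileName="$(filebase).log">
    <log idref="hky.kappa"/>
    <log idref="popSize"/>
  </logger>
  ...
</run>
```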
By default BEAST initializes the tree randomly in a way that is consistent with any topological constraints, but occasionally we need to provide a better starting tree by hand.
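One possible way to supply a starting tree, sketched here under assumptions, is to replace the tree state node with a `TreeParser` that reads a Newick string. The Newick string and its branch lengths are made up for illustration, and the fully qualified class name depends on the BEAST 2 version (`beast.util.TreeParser` in 2.6 and earlier; later releases moved it to a different package), so check it against your installation.

```xml
<!-- Takes the place of the ClusterTree state node in the example file. -->
<!-- The Newick string is a hypothetical starting tree, not a result. -->
<stateNode spec='beast.util.TreeParser' id='tree'
           IsLabelledNewick='true'
           newick='((chimp:0.1,bonobo:0.1):0.1,human:0.2);'>
  <taxa idref='alignment'/>
</stateNode>
```

Because the state node keeps the id `tree`, every existing `idref="tree"` in the file continues to resolve without further changes.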
Occasionally we know (or think we know!) the tree topology perfectly. In that case we can easily prevent the analysis from sampling different tree topologies.
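A sketch of one way to do this: remove every operator that can change the topology (in the example file, `SubtreeSlide`) and keep only operators that adjust node heights. The operator ids, weights and scale factor below are illustrative assumptions, and a real file may contain other topology-changing operators that also need to be removed.

```xml
<!-- Removed: <operator spec='SubtreeSlide' ...>, which can change the topology. -->

<!-- Scales all node heights together; the topology is untouched. -->
<operator id='treeScaler' spec='ScaleOperator' scaleFactor="0.5" weight="3">
  <tree idref="tree"/>
</operator>

<!-- Moves one internal node height at a time, between its parent and its children. -->
<operator id='treeUniform' spec='Uniform' weight="30">
  <tree idref="tree"/>
</operator>
```

For the fixed topology to be the one you intend, the starting tree must also be set by hand rather than generated by BEAST.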
(Results of previous exercise at http://tgvaughan.github.io/TTB_Lectures/XML/downloads/primates_ex2.xml )
Allowing different subsets of the data to share a model is often useful and sometimes necessary. Here we experiment with linking clock models, but the same approach carries over to other models (e.g. migration models).
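Linking relies on the `id`/`idref` mechanism introduced earlier. As a hedged sketch, two tree likelihoods can share one clock model by defining it once and referencing it from the second likelihood. The ids `clockRate`, `sharedClock`, `treeLikelihood1` and `treeLikelihood2` are hypothetical, a strict clock is assumed purely for illustration, and the remaining inputs of each likelihood are elided.

```xml
<distribution spec='TreeLikelihood' id="treeLikelihood1">
  ...
  <!-- The clock model is defined once and given an id. -->
  <branchRateModel spec='StrictClockModel' id="sharedClock">
    <clock.rate idref="clockRate"/>
  </branchRateModel>
</distribution>

<distribution spec='TreeLikelihood' id="treeLikelihood2">
  ...
  <!-- The second data subset reuses the same clock model object. -->
  <branchRateModel idref="sharedClock"/>
</distribution>
```

With this arrangement there is a single `clockRate` parameter in the state, so only one operator and one log entry are needed for it.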