29 Jul 2010, dhinkle9

MrBayes

MrBayes (Ronquist and Huelsenbeck 2003) is a program for Bayesian phylogenetic analysis. The program uses Markov chain Monte Carlo (MCMC) techniques to sample from the posterior probability distribution. By default, MrBayes uses Metropolis coupling to accelerate convergence; specifically, three "heated" chains are run in parallel with each regular "cold" chain. The heated chains sample from distributions obtained by raising the posterior probability to a power smaller than 1, which flattens ("melts") the peaks in the landscape defined by the posterior distribution. At specified intervals, MrBayes attempts to swap the parameter values (the locations in the landscape) between cold and heated chains, which makes it possible for the cold chain to escape local peaks. This comes at the cost of increased computation (switch Metropolis coupling off if you feel lucky and want fast analyses) but can really help convergence for difficult problems.
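The heating and swapping idea can be sketched in a few lines of Python. This is only an illustration, not MrBayes code: the function names are made up for this page, the incremental heating scheme 1 / (1 + temp * i) is assumed here as the way the powers are spaced, and temp=0.2 is just an example value.

```python
import math

def heating_coefficients(nchains, temp):
    """Heating power for each chain: 1 / (1 + temp * i), with i = 0 for the cold chain.

    Assumption: incremental heating; the cold chain always gets power 1.
    """
    return [1.0 / (1.0 + temp * i) for i in range(nchains)]

def swap_acceptance(beta_a, beta_b, logp_a, logp_b):
    """Probability of accepting a state swap between chains a and b.

    logp_a and logp_b are the unheated log posterior values of the
    states currently held by chain a and chain b.
    """
    log_ratio = (beta_a - beta_b) * (logp_b - logp_a)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

# temp=0.2 is an illustrative value, not necessarily the program default.
betas = heating_coefficients(nchains=4, temp=0.2)
print(betas)

# A cold chain (power 1.0) stuck at log posterior -10 is offered the state of
# a heated chain (power 0.5) sitting at log posterior -5.
print(swap_acceptance(1.0, 0.5, logp_a=-10.0, logp_b=-5.0))
```

A swap that hands the cold chain a better state is always accepted in this sketch, which is how a heated chain that has wandered onto a higher peak can pull the cold chain off a local one.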

By default, MrBayes runs two independent analyses in parallel and calculates convergence diagnostics on the fly. This helps the user determine when to stop the analysis.
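The diagnostic reported for these parallel runs is the average standard deviation of split frequencies. The Python sketch below shows roughly what that number measures. It is not the program's own implementation: the function name, the dictionary representation of splits, the min_freq cutoff, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev

def average_sd_of_split_frequencies(run_split_freqs, min_freq=0.1):
    """Average, over splits, of the standard deviation of each split's frequency across runs.

    run_split_freqs: one dict per run, mapping a split label to the fraction
    of sampled trees in that run that contain the split.
    min_freq: hypothetical cutoff; splits rarer than this in every run are ignored.
    """
    all_splits = set().union(*run_split_freqs)
    sds = []
    for split in all_splits:
        # A split never sampled in a run has frequency 0 in that run.
        freqs = [run.get(split, 0.0) for run in run_split_freqs]
        if max(freqs) >= min_freq:
            sds.append(stdev(freqs))
    return mean(sds) if sds else 0.0

# Made-up split frequencies from two runs of a four-taxon analysis.
run1 = {"AB|CD": 0.90, "AC|BD": 0.08}
run2 = {"AB|CD": 0.86, "AC|BD": 0.11}
print(average_sd_of_split_frequencies([run1, run2]))
```

When the runs sample the same distribution, the per-split frequencies agree and the average approaches zero; large values indicate that the runs still disagree about which groups are supported.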

The last official release of the program is version 3.1.2, which dates from the fall of 2005. The most recent development version of the program (version 3.2) includes a number of significant new features, such as check-pointing, strict and relaxed clock models, and functionality for dating. Version 3.2 can be compiled from source code freely available on SourceForge (see instructions below). Participants of the Workshop on Molecular Evolution have access to a prerelease version of 3.2.

MrBayes version 3.1.2 is available for Windows, OSX, and Linux. A precompiled Windows executable of version 3.2 is provided below. All versions use a command-line interface and the program looks virtually the same on all platforms. However, the UNIX version tends to be slightly faster, and is recommended if you have a choice.

To get started with MrBayes, specifically version 3.2, we suggest you follow this tutorial. It includes a one-page quick-start tutorial, in addition to more detailed descriptions of example analyses. For these example analyses, you will also need the files primates.nex and cynmix.nex (see below). The instructions below give you an alternative overview of the program and its features.

Documentation: 

For detailed instructions see the following documentation:

For additional information on MrBayes and Bayesian phylogenetic inference, refer to these resources on MrBayes.

References: 

Example Data Files

Example Command File

The co1.nex file contains a sample MrBayes block at the end of the file that defines site-specific rates for first, second, and third codon positions.

The mb_part_block file above provides an example of a MrBayes block for analyzing multiple data partitions at the same time. In this file, Fib5 is non-coding nDNA, while tRNA, ATPase, ND2, and ND3 are mtDNA. To use this block in MrBayes, first execute your data file, then execute the block file. NOTE: you must make sure that you have predefined your character sets and partitions according to your own data set.

Input Format: 
Instructions for All: 

To start MrBayes in the UNIX prompt window, type:

mb

MrBayes input files are in NEXUS format. At the MrBayes prompt type:

execute filename.nex

to enter a data file into the program. To set the desired nucleotide substitution model parameters type:

lset <parameter>=<option> <parameter>=<option>

replacing the words in < > with actual parameters and options. For example:

lset nst=6 rates=invgamma

sets the model to a general time-reversible (GTR) substitution model, which has six substitution types (nst=6). A proportion of the sites are invariable, while the rates for the remaining sites are drawn from a scaled gamma distribution. More information on parameters and options is available by typing:

help

To set the MCMC parameters type:

mcmcp <parameter>=<value> <parameter>=<value>

replacing the words in < > with actual parameters and values. For example:

mcmcp ngen=10000 nruns=2 nchains=4 printfreq=100 samplefreq=50

sets up two independent, parallel Metropolis-coupled MCMC (MCMCMC) analyses, each with four chains (three heated and one cold), to run for 10000 generations. Information about the likelihood of the chains will be reported to screen every 100 generations, and parameter and tree samples will be saved to file every 50 generations.

Start the analysis by typing:

mcmc

During the MCMC analysis, MrBayes prints the sampled parameter values to one text file for each run (filename.nex.run1.p, filename.nex.run2.p, etc), and the sampled trees to another set of text files (filename.nex.run1.t etc). If you switch off warnings by issuing "set autoclose=yes nowarn=yes", MrBayes will silently overwrite existing files with those names, so make sure you do not lose samples from previous runs.

Run the analysis until the convergence diagnostic, the average standard deviation of split frequencies, reaches a value below 0.01 (or 0.05 if you are less concerned with accurate estimates of the probability of poorly supported groups). After the analysis is completed, the sump command is used to summarize the information in the parameter file(s) to screen.

sump relburnin=yes burninfrac=0.25

will discard the first 25% of samples as burn-in, which is the default for the "mcmc" command. The command:

sumt relburnin=yes burninfrac=0.25

summarizes the tree samples in a similar way. Among other things, the "sumt" command produces a ".con" file containing the consensus tree, which can be opened in a tree drawing program, such as FigTree.
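The relative burn-in used by "sump" and "sumt" above can be sketched in Python. This is an illustration only: the function names are invented for this page, the assumed layout of a .p file (an optional bracketed comment line, a header row, then one whitespace-separated row per sample) and the rounding of the burn-in count are assumptions, and the numbers in the example are made up.

```python
def read_parameter_samples(text):
    """Parse the text of a parameter file into a list of {column: value} dicts.

    Assumed layout: lines starting with "[" are comments, the first remaining
    line is the header, and every later line is one numeric sample.
    """
    header, rows = None, []
    for line in text.splitlines():
        fields = line.split()
        if not fields or line.lstrip().startswith("["):
            continue
        if header is None:
            header = fields
        else:
            rows.append(dict(zip(header, (float(f) for f in fields))))
    return rows

def discard_burnin(samples, burninfrac=0.25):
    """Drop the first burninfrac of the samples (count rounded down; an assumption)."""
    n_burn = int(len(samples) * burninfrac)
    return samples[n_burn:]

# Made-up contents standing in for a file such as filename.nex.run1.p.
example = """[ID: 0000000000]
Gen\tLnL
1\t-6500.0
50\t-6100.0
100\t-5900.0
150\t-5800.0
200\t-5750.0
250\t-5740.0
300\t-5745.0
350\t-5738.0
"""

samples = read_parameter_samples(example)
kept = discard_burnin(samples, burninfrac=0.25)
print(len(samples), len(kept), kept[0]["Gen"])
```

Because the burn-in is a fraction rather than a fixed count, the number of discarded samples grows automatically if the analysis is extended.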

Instead of typing all commands at the command line, you can write a MrBayes block in your data file; see the examples under References.
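As a rough illustration of what such a block looks like, the Python sketch below assembles one from the commands used on this page. The helper name mrbayes_block is hypothetical; the only MrBayes syntax it relies on is that a block opens with "begin mrbayes;", closes with "end;", and terminates each command with a semicolon. Adjust the commands to your own data before using the output.

```python
def mrbayes_block(commands):
    """Wrap a list of MrBayes commands in a 'begin mrbayes; ... end;' block."""
    lines = ["begin mrbayes;"]
    # Each command inside a block must end with a semicolon.
    lines += ["    " + command.rstrip(";") + ";" for command in commands]
    lines.append("end;")
    return "\n".join(lines) + "\n"

# The same commands that were typed interactively in the walkthrough above.
block = mrbayes_block([
    "lset nst=6 rates=invgamma",
    "mcmcp ngen=10000 nruns=2 nchains=4 printfreq=100 samplefreq=50",
    "mcmc",
    "sump relburnin=yes burninfrac=0.25",
    "sumt relburnin=yes burninfrac=0.25",
])
print(block)
```

The printed text can be pasted at the end of a NEXUS data file, after the data block, so that executing the file also runs the analysis.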