EPSIE

EPSIE is a parallelized Markov chain Monte Carlo (MCMC) sampler for Bayesian inference. It is meant for problems with complicated likelihood topology, including multimodal distributions. It has support for both parallel tempering and nested transdimensional problems. It was originally developed for gravitational-wave parameter estimation, but can be used for any Bayesian inference problem requring a stochastic sampler.

EPSIE is in many ways similar to emcee and other bring-your-own-likelihood Python-based samplers. The primary difference from emcee is EPSIE is not an ensemble sampler; i.e., the Markov chains used by EPSIE do not attempt to share information between each other. Instead, to speed convergence, multiple jump proposal classes are offered that can be customized to the problem at hand. These include adaptive proposals that attempt to learn the shape of the distribution during a burn-in period. The user can also easily create their own jump proposals.

Installation

The easiest way to install EPSIE is via pip:

pip install epsie

If you wish to install from the latest source code instead, you can clone a copy of the repository from GitHub, at https://github.com/cdcapano/epsie.

Prerequisites

EPSIE requires Python 3.6 or later. The number of required dependencies for EPSIE is purposely kept to a minimum, to make for a light weight, quick to install package. You can find the required packages in the requirements.txt file. If you are installing from source, these can be installed via pip with:

pip install -r requirements.txt

Optional dependencies can be found in companion.txt. These are needed for running the tutorials and building documentation, but are not necessary for regular use.

Quick Start

Two samplers are provided: the MetropolisHastingsSampler and the ParallelTemperedSampler. Parallel tempering is useful for multi-modal distributions and for estimating Bayesian evidence, but is more computationally expensive.

To use either sampler, you provide a function that evaluates the likelihood and prior at given points. This function must take keyword arguments as input that map parameter names to values and return a tuple of the log likelihood and log prior at that point. The function may also return “blob” data, which is a dictionary of additional statistics that you would like to keep track of, but this is optional.

For example, the following sets up the Metropolis-Hastings sampler with 3 chains to sample a 2D normal distribution, with a prior uniform between [-5, 5) on each parameter:

# create the function to evaluate
from scipy import stats
def model(x, y):
    """Evaluate model at given point."""
    logp = stats.uniform.logpdf([x, y], loc=-5, scale=10).sum()
    logl = stats.norm.logpdf([x, y]).sum()
    return logl, logp

# set up the sampler
from epsie.samplers import MetropolisHastingsSampler
params = ['x', 'y']
nchains = 3
sampler = MetropolisHastingsSampler(params, model, nchains)

# set the start positions: we'll just draw from the prior
sampler.start_position = {
    'x': stats.uniform.rvs(size=nchains, loc=-5, scale=10),
    'y': stats.uniform.rvs(size=nchains, loc=-5, scale=10)
    }

# run for a few iterations
sampler.run(4)

# retrieve positions
positions = sampler.positions
# positions is numpy structured array with shape nchains x niterations
print(positions['x'])
print(positions['y'])

In this simple example we have used the default Normal proposal class for producing jumps, and have used only a single core.

To learn how to use more advanced features and how to interact with sampler data, see the notebooks provided in the tutorials.

Indices and tables