Parallelization

EPSIE samplers can be parallelized over multiple cores and even, via MPI, over multiple CPUs. To do so, create a pool object and pass it to the sampler on initialization. The pool object must have a map method with the same API as the standard Python map() function.
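To illustrate the pool contract, here is a minimal sketch of a trivial serial "pool": any object exposing a map() method with the same signature as the built-in map() satisfies the interface. The SerialPool class name is ours, for illustration only; it is not part of EPSIE.

```python
class SerialPool:
    """A trivial pool that runs the map serially in the parent process.

    Any object with a map(func, iterable) method matching the built-in
    map() API can be passed to a sampler's pool argument.
    """

    def map(self, func, iterable):
        # Evaluate func on each item; a real pool would distribute
        # these calls over child processes instead.
        return list(map(func, iterable))


pool = SerialPool()
print(pool.map(lambda x: x * 2, [1, 2, 3]))
```

Swapping this for a multiprocessing or MPI pool (shown below) changes where the work runs, not how the sampler calls it.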

Overview

Parallelization occurs over multiple chains. After a sampler has been set up with some number of Markov chains, you evolve the chains for a given number of iterations by calling the sampler's run() method. The run method uses the pool that was provided to the sampler to split the chains over child processes. Each child process receives a subset of the chains and is told by the parent process to iterate them for the desired number of iterations. When the child processes have finished iterating their chains, they send the results (positions, proposal states, etc.) back to the parent process. How many chains a child process gets is determined by the pool being used (not by EPSIE), but is roughly the number of chains divided by the number of processes.
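The splitting described above can be sketched with a toy example (this is not EPSIE's internal code): each worker receives some chain states from the pool's map, iterates them independently, and returns the results to the parent.

```python
import multiprocessing


def evolve_chain(state):
    """Toy stand-in for iterating one Markov chain.

    Here we just advance a counter by the requested number of
    iterations; a real chain would take a stochastic step each time.
    """
    position, niterations = state
    for _ in range(niterations):
        position += 1
    return position


if __name__ == "__main__":
    nchains = 8
    # Each "chain" is just a (position, niterations) pair in this sketch.
    chains = [(0, 100) for _ in range(nchains)]
    with multiprocessing.Pool(4) as pool:
        # The pool decides how chains are chunked over workers;
        # roughly nchains / nprocesses chains go to each worker.
        results = pool.map(evolve_chain, chains)
    print(results)
```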

No other communication occurs between the parent process and the child processes while chains are being iterated. This differs from ensemble samplers, which typically pass information between the child and parent processes on every iteration of the sampler.

Child processes evolve each chain by the number of iterations passed to the run() method before moving on to the next chain. For example, if a child process is told to evolve 4 chains for 100 iterations, it will step() the first chain 100 times, then move to the second chain, and so on. For this reason, samplers should not be interrupted while the run() method is executing. Instead, it is best to run samplers for successive short periods of time, checkpointing results in between.
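The run-in-short-bursts pattern can be sketched as follows. DummySampler and save_checkpoint are illustrative stand-ins, not EPSIE's API; a real checkpoint would write positions and proposal states to disk.

```python
class DummySampler:
    """Stand-in for a sampler; run() just tallies iterations."""

    def __init__(self):
        self.niterations = 0

    def run(self, niterations):
        # In EPSIE, this is where child processes would iterate
        # their chains; here we only count.
        self.niterations += niterations


def save_checkpoint(sampler):
    """Illustrative checkpoint hook; returns what it 'saved'."""
    return sampler.niterations


sampler = DummySampler()
# Ten short runs of 100 iterations each, checkpointing after each,
# instead of one long uninterruptible run of 1000 iterations.
for _ in range(10):
    sampler.run(100)
    last_saved = save_checkpoint(sampler)

print(last_saved)
```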

For the ParallelTemperedSampler, all temperatures of a given chain are grouped together via the ParallelTemperedChain. Temperatures are not split up over processes; instead, all temperatures of a given chain are evolved together on each iteration. For example, if a child process receives N chains, each with K temperatures, it will step() each temperature of the first chain, perform any temperature swaps, then repeat on that same chain for the requested number of iterations before moving on to the next chain.
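The iteration order above can be sketched with a toy loop (this is illustrative, not EPSIE code): within one chain, every temperature steps, then swaps would be proposed, before advancing to the next iteration, and only after all iterations does the worker move to the next chain.

```python
nchains, ntemps, niterations = 2, 3, 5

# Record the order in which (chain, iteration, temperature) steps occur
# on one hypothetical child process that received both chains.
step_log = []
for chain in range(nchains):
    for it in range(niterations):
        for temp in range(ntemps):
            step_log.append((chain, it, temp))  # step() this temperature
        # ...temperature swaps for this chain would be proposed here

print(len(step_log))  # nchains * niterations * ntemps steps in total
```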

For a detailed example see the Example of creating and running a Metropolis-Hastings Sampler and the Example of creating and running a Parallel Tempered Sampler tutorials.

Using Python’s multiprocessing

If you are using shared memory, the easiest way to parallelize is to use Python’s multiprocessing module. For example, if you wish to use N cores:

import multiprocessing
pool = multiprocessing.Pool(N)

You then pass the pool object to the sampler’s pool keyword argument when initializing it:

sampler = MetropolisHastingsSampler(params, model, nchains, pool=pool)

Using MPI

To parallelize over multiple CPUs (without shared memory) you will need an implementation of MPI, such as Open MPI. To use it within Python, we recommend a combination of mpi4py and schwimmbad, both of which can be installed via pip. For example, to use N processes, you would create the pool by doing:

import schwimmbad
pool = schwimmbad.choose_pool(mpi=True, processes=N)

This pool object can be passed to the sampler. You would then run your Python script using your installation of MPI, e.g., mpirun.

For more information, see the documentation for these packages. A more feature-rich example of setting up an MPI pool can be found in the PyCBC suite’s pool module.