Parallelization
EPSIE samplers can be parallelized over multiple cores and even (via MPI)
multiple CPUs. To do so, you create a pool object and pass it to the sampler
on initialization. The pool object must have a map method with the same API
as the standard Python map() function.
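Concretely, any object exposing a compatible map method will do. Below is a minimal serial sketch of such a pool; it is illustrative only and not part of EPSIE:

```python
class SerialPool:
    """A minimal pool: all that is required is a ``map`` method with
    the same call signature as the built-in ``map()``."""

    def map(self, func, iterable):
        # A real pool would distribute these calls over worker
        # processes; here they simply run serially.
        return list(map(func, iterable))


pool = SerialPool()
results = pool.map(abs, [-1, -2, 3])
print(results)  # [1, 2, 3]
```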
Overview
Parallelization occurs over multiple chains. After a sampler has been set up
with some number of Markov chains, you evolve the chains for a given number of
iterations by calling the sampler's run() method. The run method uses the
pool that was provided to the sampler to split the chains up over child
processes. Each child process gets a subset of chains and is told by the
parent process to iterate its chains for the desired number of iterations.
When the child processes have finished iterating their chains, they send the
results (positions, proposal states, etc.) back to the parent process. How
many chains a child process gets is determined by the pool being used (not by
EPSIE), but it is roughly the number of chains divided by the number of
processes.
No other communication occurs between the parent process and the child processes while chains are being iterated. This differs from ensemble samplers, which typically pass information between children and parent processes on each iteration of the sampler.
Child processes evolve each chain by the number of iterations passed to the
run command before moving on to the next chain. For example, if a child
process is told to evolve 4 chains for 100 iterations, it will step() the
first chain 100 times, then move on to the second chain, etc. For this
reason, samplers should not be interrupted while the run() method is
executing. Instead, it is best to run samplers for successive short periods
of time, with results checkpointed in between.
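That run-in-short-segments pattern can be sketched as follows. ToySampler and the pickle-based checkpoint file are illustrative stand-ins, not EPSIE API; with a real sampler you would call its run() method in each segment and save its state however your pipeline prefers:

```python
import os
import pickle
import tempfile


class ToySampler:
    """Hypothetical stand-in for a sampler: run() just counts iterations."""

    def __init__(self):
        self.niterations = 0

    def run(self, niterations):
        self.niterations += niterations


sampler = ToySampler()
fd, checkpoint = tempfile.mkstemp(suffix=".pkl")
os.close(fd)

# Run in short segments, saving state after each, so an interruption
# loses at most one segment of work.
for _ in range(10):
    sampler.run(100)
    with open(checkpoint, "wb") as fp:
        pickle.dump(sampler, fp)

# Restoring from the checkpoint recovers the last completed segment.
with open(checkpoint, "rb") as fp:
    restored = pickle.load(fp)
os.remove(checkpoint)
print(restored.niterations)  # 1000
```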
For the ParallelTemperedSampler, all temperatures of a given chain are
grouped together via the ParallelTemperedChain. Temperatures are not split up
over processes. Instead, all temperatures for a given chain are evolved
together on each iteration. For example, if a child process receives N
chains, each with K temperatures, it will step() each temperature of the
first chain, perform any temperature swaps, then repeat for the same chain
for the requested number of iterations. It will then move on to the next
chain.
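The per-chain ordering described above can be illustrated with a small stand-alone sketch (plain Python, no EPSIE calls; the "step" and "swap" entries mimic the order of operations):

```python
def iteration_order(nchains, ntemps, niterations):
    """Record the order in which a child process would evolve its
    chains: one chain at a time, all temperatures stepped together on
    each iteration, followed by temperature swaps."""
    log = []
    for chain in range(nchains):           # finish one chain entirely...
        for _ in range(niterations):       # ...before moving to the next
            for temp in range(ntemps):     # all temperatures step together
                log.append(("step", chain, temp))
            log.append(("swap", chain))    # then any temperature swaps
    return log


order = iteration_order(nchains=2, ntemps=3, niterations=2)
print(order[:4])
# [('step', 0, 0), ('step', 0, 1), ('step', 0, 2), ('swap', 0)]
```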
For detailed examples, see the Example of creating and running a Metropolis-Hastings Sampler and the Example of creating and running a Parallel Tempered Sampler tutorials.
Using Python’s multiprocessing
If you are using shared memory, the easiest way to parallelize is to use
Python's multiprocessing module. For example, if you wish to use N cores:
import multiprocessing
pool = multiprocessing.Pool(N)
You then pass the pool object to the sampler's pool keyword argument when
initializing it:
sampler = MetropolisHastingsSampler(params, model, nchains, pool=pool)
Using MPI
To parallelize over multiple CPUs (i.e., not shared memory) you will need to
use an implementation of MPI, such as Open MPI. Within Python, we recommend
using a combination of mpi4py and schwimmbad, both of which can be installed
via pip. For example, to use N processes, you would create the pool by doing:
import schwimmbad
pool = schwimmbad.choose_pool(mpi=True, processes=N)
This pool object can be passed to the sampler. You would then run your Python
script using your installation of MPI, e.g., with mpirun.
For more information, see the documentation for these packages. A more feature-rich example of setting up an MPI pool can be found in the PyCBC suite’s pool module.