Checkpointing
During high-dimensional Bayesian analysis, which is typically computationally heavy, one must often run sampling and/or evidence estimation for long periods of time. A common issue is that the computing facilities go offline or time out during this period, discarding any values computed already. To avoid this issue harmonic supports checkpointing, which allows the user to periodically save the progress of the computation and then resume from the saved point (the checkpoint).
This interactive tutorial illustrates how one may use harmonic with checkpointing.
[ ]:
%%capture
# Install packages
%pip install harmonic emcee corner getdist
[1]:
%%capture
# Import modules
import numpy as np
import harmonic as hm
import emcee
Basic problem setup
Define a Bayesian posterior function (here we’ll use a simple Gaussian example).
[2]:
def ln_posterior(x, inv_cov):
    """Compute log_e of an n-dimensional multivariate Gaussian.

    Args:
        x: Position at which to evaluate the posterior.
        inv_cov: Inverse covariance matrix of the Gaussian.

    Returns:
        double: Value of the log-posterior at x.
    """
    return -np.dot(x, np.dot(inv_cov, x)) / 2.0
def init_cov(ndim):
    """Initialise diagonal covariance matrix.

    Args:
        ndim: Dimension of the Gaussian.

    Returns:
        cov: Covariance matrix of shape (ndim, ndim).
    """
    cov = np.zeros((ndim, ndim))
    diag_cov = np.ones(ndim)
    np.fill_diagonal(cov, diag_cov)
    return cov
Note that the final function init_cov is used to assign a diagonal covariance matrix, here simply the identity matrix.
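As a quick sanity check (an illustrative addition, not part of the original tutorial), the unnormalised log-posterior vanishes at the origin and the covariance returned is indeed the identity:

# Sanity checks: the unnormalised Gaussian log-posterior is 0 at x = 0,
# and init_cov returns the identity matrix.
assert ln_posterior(np.zeros(3), np.eye(3)) == 0.0
assert np.allclose(init_cov(5), np.eye(5))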
Then define parameters for `emcee <https://emcee.readthedocs.io/en/stable/>`__ and the covariance of our Gaussian example.
[3]:
# Define parameters for emcee sampling
ndim = 10 # number of dimensions
nchains = 200 # total number of chains to compute
samples_per_chain = 5000 # number of samples per chain
nburn = 2000 # number of samples to discard as burn in
# Initialise random seed
np.random.seed(1)
# Create covariance matrix
cov = init_cov(ndim)
inv_cov = np.linalg.inv(cov)
Let’s use the harmonic rational quadratic spline model with default parameters.
[4]:
# Instantiate model
model = hm.model.RQSplineModel(ndim)
# Number of epochs the model will be trained for
epochs_num = 4
Checkpointing
Now we need to run the sampler to collect samples, but we wish to checkpoint periodically to protect against system crashes or timeouts. One simple way to do this is to execute the following loop.
[5]:
# Set initial random position and state
pos = np.random.rand(ndim * nchains).reshape((nchains, ndim)) * 0.1
rstate = np.random.get_state()

# Define how often to checkpoint the evidence class
chain_iterations = 5

# Discard burn in
sampler = emcee.EnsembleSampler(nchains, ndim, ln_posterior, args=[inv_cov])
(pos, prob, rstate) = sampler.run_mcmc(pos, nburn, rstate0=rstate)

for chain_i in range(chain_iterations):
    # Run the emcee sampler from the previous endpoint
    sampler = emcee.EnsembleSampler(nchains, ndim, ln_posterior, args=[inv_cov])
    (pos, prob, rstate) = sampler.run_mcmc(
        pos, int(samples_per_chain / chain_iterations), rstate0=rstate
    )

    # Collect and format samples
    samples = sampler.chain
    lnprob = sampler.lnprobability
    chains = hm.Chains(ndim)
    chains.add_chains_3d(samples, lnprob)

    if chain_i == 0:
        # Fit model on first set of samples
        model.fit(samples.reshape((-1, ndim)), epochs=epochs_num, verbose=True)

        # Instantiate the evidence class
        cal_ev = hm.Evidence(nchains, model)

        # Serialize the Evidence class
        cal_ev.serialize(".temp.gaussian_example_{}.dat".format(ndim))

    # After the model is trained, start calculating the evidence
    else:
        # 1) Deserialize the Evidence class
        cal_ev = hm.Evidence.deserialize(".temp.gaussian_example_{}.dat".format(ndim))

        # 2) Add these new chains to the Evidence class
        cal_ev.add_chains(chains)

        # 3) Serialize the Evidence class
        cal_ev.serialize(".temp.gaussian_example_{}.dat".format(ndim))

    # Clear memory
    del chains, samples, lnprob, sampler, prob
Training NF: 100%|██████████| 4/4 [01:45<00:00, 26.27s/it]
Of course, it is not typically necessary to deserialize the Evidence class after each checkpoint, but only when resuming after execution halts.
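For illustration, here is a minimal sketch of what resuming after a halt might look like, assuming the checkpoint file written by the loop above. Note that the sampler position pos and random state rstate are not checkpointed in this example, so the sampling itself would restart from a fresh burn-in.

import os

# Hypothetical resume pattern (sketch): restore the accumulated evidence
# if a checkpoint exists, otherwise start from scratch.
checkpoint_file = ".temp.gaussian_example_{}.dat".format(ndim)

if os.path.exists(checkpoint_file):
    # Resume: restore the evidence accumulated before the halt
    cal_ev = hm.Evidence.deserialize(checkpoint_file)
else:
    # No checkpoint found: fit the model and instantiate the
    # Evidence class from scratch, as in the first loop iteration
    cal_ev = hm.Evidence(nchains, model)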
Compute the Bayesian evidence
Finally we simply compute the learnt harmonic mean estimator, which estimates the inverse evidence; the log-evidence follows by negation.
[6]:
ln_evidence = - cal_ev.ln_evidence_inv
Results
For the simple Gaussian example considered, it’s also straightforward to compute the evidence analytically.
[7]:
def ln_analytic_evidence(ndim, cov):
    """Compute analytic ln_e evidence.

    Args:
        ndim: Dimensionality of the multivariate Gaussian posterior.
        cov: Covariance matrix of shape (ndim, ndim).

    Returns:
        double: Analytic value of the log_e evidence.
    """
    ln_norm_lik = -0.5*ndim*np.log(2*np.pi) - 0.5*np.log(np.linalg.det(cov))
    return -ln_norm_lik
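Since the covariance used here is the identity, ln|cov| = 0 and the analytic log-evidence reduces to 0.5 * ndim * ln(2π). As a quick check (an illustrative addition, not part of the original tutorial):

# For cov = I, ln|cov| = 0, so ln z = 0.5 * ndim * ln(2*pi),
# approximately 9.1894 for ndim = 10.
assert np.isclose(ln_analytic_evidence(ndim, cov), 0.5 * ndim * np.log(2 * np.pi))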
Let’s also compute the evidence analytically so that we can compare it to the value computed by harmonic.
[8]:
ln_evidence_analytic = ln_analytic_evidence(ndim, cov)
Let’s compare the value computed by harmonic with the analytic result, first computing the error bars on the harmonic estimate.
[9]:
ln_evidence_err_neg, ln_evidence_err_pos = cal_ev.compute_ln_inv_evidence_errors()
ln_evidence_err_avg = (ln_evidence_err_pos - ln_evidence_err_neg) / 2.0
print(f"ln_evidence_err_neg = {ln_evidence_err_neg}")
print(f"ln_evidence_err_pos = {ln_evidence_err_pos}")
print(f"ln_evidence_err_avg = {ln_evidence_err_avg}")
ln_evidence_err_neg = -0.0031277022888513755
ln_evidence_err_pos = 0.0031179502607592245
ln_evidence_err_avg = 0.0031228262748053
[10]:
print('ln_evidence (harmonic) = {} (+ {} / {})'.format(ln_evidence, ln_evidence_err_pos, ln_evidence_err_neg))
print('ln_evidence (analytic) = {}'.format(ln_evidence_analytic))
print('nsigma = {}'.format(np.abs(ln_evidence - ln_evidence_analytic) / ln_evidence_err_avg))
ln_evidence (harmonic) = 9.186614990234375 (+ 0.0031179502607592245 / -0.0031277022888513755)
ln_evidence (analytic) = 9.189385332046726
nsigma = 0.887126458074755
As expected, the evidence computed by harmonic is close to that computed analytically.