harmonic.utils.cross_validation(chains, domains: ~typing.List, hyper_parameters: ~typing.List, nfold=2, modelClass=<class 'harmonic.model_legacy.KernelDensityEstimate'>, seed: int = -1) List

Perform n-fold cross-validation for a given model, splitting the chains into training and validation data.

First, the data are split into nfold chunks. Second, the model is fitted with each of the given hyper-parameter lists using all but one of the chunks (the held-out chunk is the validation chunk). This procedure is repeated with each chunk in turn as the validation chunk, and the mean log-space variance over all the chunks is computed and returned. The result can be used to decide which hyper-parameter list performs better.

Parameters:
  • chains (Chains) – Chains containing samples (to be split into training and validation data herein).

  • domains (List) – Domains of the model’s parameters.

  • hyper_parameters (List) – List of hyper_parameters where each entry is a hyper_parameter list to be considered.

  • nfold (int) – Number of fold validation sets to be made (default = 2).

  • modelClass (Model) – Model that is being cross validated (default = KernelDensityEstimate).

  • seed (int) – Seed for random number generator when drawing the chains (if this is negative the seed is not set).

Returns:

Mean log validation variance (averaged over the nfold folds) for each hyper-parameter list.

Return type:

(List)

Raises:

ValueError – Raised if model is not one of the possible models.
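
The procedure described above can be sketched without the library. This is a minimal, standard-library-only illustration and not harmonic's implementation: kfold_mean_scores, score_fn and toy_score are hypothetical names, the handling of leftover chains in the last fold is an assumption, and the toy score merely stands in for fitting a model and computing its validation variance.

```python
import random
from statistics import mean


def kfold_mean_scores(nchains, hyper_parameters, score_fn, nfold=2, seed=-1):
    """Return the mean validation score for each hyper-parameter list.

    score_fn(hyper_parameter, indexes_fit, indexes_val) is a hypothetical
    callback that fits a model on the training chains and returns its
    validation score (e.g. a log-space variance).
    """
    # A negative seed leaves the random number generator unseeded.
    rng = random.Random(seed) if seed >= 0 else random.Random()
    indexes = list(range(nchains))
    rng.shuffle(indexes)

    nchains_in_val_set = nchains // nfold
    mean_scores = []
    for hyper_parameter in hyper_parameters:
        fold_scores = []
        for i_fold in range(nfold):
            start = i_fold * nchains_in_val_set
            # Assumption: the last fold absorbs any leftover chains.
            stop = nchains if i_fold == nfold - 1 else start + nchains_in_val_set
            indexes_val = indexes[start:stop]
            indexes_fit = indexes[:start] + indexes[stop:]
            fold_scores.append(score_fn(hyper_parameter, indexes_fit, indexes_val))
        mean_scores.append(mean(fold_scores))
    return mean_scores


def toy_score(hyper_parameter, indexes_fit, indexes_val):
    # Stand-in for "fit the model, then compute the validation variance".
    return hyper_parameter[0] * len(indexes_val)


scores = kfold_mean_scores(6, [[1.0], [3.0]], toy_score, nfold=3, seed=0)
# Assumption: a lower mean validation variance indicates the better entry.
best = min(range(len(scores)), key=scores.__getitem__)
```

The same shuffled fold assignment is reused for every hyper-parameter list, so the returned scores are directly comparable.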

harmonic.utils.eval_func_on_grid(func, xmin, xmax, ymin, ymax, nx, ny)

Evaluate a 2D function on a grid.

Parameters:
  • func – Function to evaluate.

  • xmin – Minimum x value to consider in grid domain.

  • xmax – Maximum x value to consider in grid domain.

  • ymin – Minimum y value to consider in grid domain.

  • ymax – Maximum y value to consider in grid domain.

  • nx – Number of samples to include in grid in x direction.

  • ny – Number of samples to include in grid in y direction.

Returns:

  • func_eval_grid: Function values evaluated on the 2D grid.

  • x_grid: x values over the 2D grid.

  • y_grid: y values over the 2D grid.
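
To illustrate the idea of evaluating a function over a 2D grid, here is a dependency-free sketch. It is not the library's implementation: eval_on_grid_sketch is a hypothetical name, the function is assumed to take x and y as two separate arguments (the real function may expect a single array), the grid end points are assumed to be inclusive, and the grids are returned as nested lists rather than arrays.

```python
def eval_on_grid_sketch(func, xmin, xmax, ymin, ymax, nx, ny):
    """Evaluate func(x, y) on an nx-by-ny grid and return values and grids."""

    def linspace(lo, hi, n):
        # Evenly spaced values including both end points (assumption).
        if n == 1:
            return [lo]
        step = (hi - lo) / (n - 1)
        return [lo + k * step for k in range(n)]

    xs = linspace(xmin, xmax, nx)
    ys = linspace(ymin, ymax, ny)

    # Rows index y and columns index x.
    x_grid = [[x for x in xs] for _ in ys]
    y_grid = [[y for _ in xs] for y in ys]
    func_eval_grid = [[func(x, y) for x in xs] for y in ys]
    return func_eval_grid, x_grid, y_grid


values, x_grid, y_grid = eval_on_grid_sketch(
    lambda x, y: x + 10 * y, 0.0, 1.0, 0.0, 2.0, nx=3, ny=2
)
```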

harmonic.utils.plot_getdist(samples, labels=None)

Plot a triangle plot of marginalised distributions using the getdist package.

Parameters:
  • samples – 2D array of shape (ndim, nsamples) containing samples.

  • labels – Array of strings containing axis labels.

Returns:

  • None

harmonic.utils.plot_getdist_compare(samples1, samples2, labels=None, fontsize=17, legend_fontsize=15)

Plot a triangle plot comparing the marginalised distributions of two sets of samples using the getdist package.

Parameters:
  • samples1 – 2D array of shape (ndim, nsamples) containing samples from the posterior.

  • samples2 – 2D array of shape (ndim, nsamples) containing samples from the concentrated flow.

  • labels – Array of strings containing axis labels for both sets of samples.

  • fontsize – Plot fontsize.

  • legend_fontsize – Plot legend fontsize.

Returns:

None

harmonic.utils.split_data(chains, training_proportion: float = 0.5) Tuple

Split the data in a chains instance into two (e.g. training and test sets).

The new chains instances can be used for training and for calculating the evidence on the “test” set.

Chains are split so that the first chains in the original chains object go into the training set and the following go into the test set.

Parameters:
  • chains (Chains) – Instance of a chains class containing the data to be split.

  • training_proportion (float) – Proportion of data to be used in training (default=0.5)

Returns:

A tuple containing the following two Chains.

chains_train (Chains): Instance of a chains class containing chains to be used to fit the model (e.g. training).

chains_test (Chains): Instance of a chains class containing chains to be used to calculate the evidence (e.g. testing).

Return type:

(Chains, Chains)

Raises:
  • ValueError – Raised if training_proportion is not strictly between 0 and 1.

  • ValueError – Raised if resulting nchains in training set is less than 1.

  • ValueError – Raised if resulting nchains in test set is less than 1.
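
The splitting rule and the three error conditions above can be sketched on chain indexes alone. This is an illustration under stated assumptions, not the library's code: split_chain_indexes is a hypothetical name, it works on integer indexes rather than a Chains instance, and truncating nchains * training_proportion to an integer is an assumed rounding rule.

```python
def split_chain_indexes(nchains, training_proportion=0.5):
    """Return (train, test) index lists; the first chains go to training."""
    if not 0.0 < training_proportion < 1.0:
        raise ValueError("training_proportion must be strictly between 0 and 1.")

    # Assumption: the training count is truncated towards zero.
    nchains_train = int(nchains * training_proportion)
    nchains_test = nchains - nchains_train

    if nchains_train < 1:
        raise ValueError("nchains for training must be at least 1.")
    if nchains_test < 1:
        raise ValueError("nchains for testing must be at least 1.")

    train = list(range(nchains_train))
    test = list(range(nchains_train, nchains))
    return train, test


train, test = split_chain_indexes(8, training_proportion=0.75)
```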

harmonic.utils.validation_fit_indexes(i_fold: int, nchains_in_val_set: int, nfold: int, indexes) Tuple[List, List]

Extract the correct indexes for the chains of the validation and training sets.

Parameters:
  • i_fold (int) – Cross-validation iteration to perform.

  • nchains_in_val_set (int) – The number of chains that will go in each validation set.

  • nfold (int) – Number of fold validation sets to be made.

  • indexes (List) – List of the chains to be used in fold validation that need to be split.

Returns:

A tuple containing the following two lists of indices.

indexes_val (List): List of indexes for the validation set.

indexes_fit (List): List of indexes for the training set.

Return type:

(List, List)

Raises:

ValueError – Raised if the value of i_fold does not fall between 0 and nfold-1.
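
The index extraction for a single fold can be sketched as follows. This is a hedged illustration rather than the library's implementation: fold_indexes is a hypothetical name, plain Python lists are used for the indexes, and letting the last fold absorb any leftover chains (when the number of chains is not a multiple of nfold) is an assumption.

```python
def fold_indexes(i_fold, nchains_in_val_set, nfold, indexes):
    """Return (indexes_val, indexes_fit) for cross-validation fold i_fold."""
    if i_fold < 0 or i_fold > nfold - 1:
        raise ValueError("i_fold must lie between 0 and nfold-1.")

    start = i_fold * nchains_in_val_set
    if i_fold < nfold - 1:
        stop = start + nchains_in_val_set
    else:
        # Assumption: the last fold takes all remaining chains.
        stop = len(indexes)

    indexes_val = list(indexes[start:stop])
    indexes_fit = list(indexes[:start]) + list(indexes[stop:])
    return indexes_val, indexes_fit


shuffled = [5, 3, 1, 4, 0, 2, 6]
first_val, first_fit = fold_indexes(0, 2, 3, shuffled)
last_val, last_fit = fold_indexes(2, 2, 3, shuffled)
```

Every chain index appears in exactly one of the two lists for a given fold, so the training and validation sets never overlap.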