ciderpress.models.train

class ciderpress.models.train.MOLGP(kernels, settings, libxc_baseline=None, default_noise=0.03)

Gaussian process model for the exchange-correlation functional or its components.

Parameters:
  • kernels (list[DFTKernel]) – List of kernels that sum to the XC energy.

  • settings (DescParams or FeatureSettings) – Settings for the features. Specifies what the feature vector is.

  • libxc_baseline (str or None) – Additional baseline functional to evaluate using the libxc library.

  • default_noise (float) – Default noise hyperparameter, used if noise is not provided for a particular data point.

add_reactions(rxn_list)

Store the reactions in rxn_list in the model. These will be used as training points when fit() is called. Note: For every mol_id present in the reactions, the corresponding system must have been added to the covariance dictionary via the store_mol_covs function.

Parameters:

rxn_list – List of tuples. In each tuple, element 0 is an integer mode (0 for exchange (X), 1 for correlation (C), or 2 for exchange-correlation (XC)), and element 1 is a reaction dict. Each dict should have the following keys:

  • structs: list of structure codes. If an element in structs is a tuple, the first tuple element is a structure code and the second is an orbital code.

  • counts: count codes list

  • energy: Energy of the reaction

  • unit: conversion factor, in Eh per unit of the reaction energy

  • noise (optional): noise in Eh for the reaction
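
As an illustration of the layout described above, the following sketch builds a small rxn_list in plain Python. The structure codes, the orbital code, the sign convention of the counts, and the energy values are all placeholders invented for this example, and check_reaction is a hypothetical helper, not part of ciderpress. The sketch also assumes that structs and counts have the same length.

```python
# Conversion factor: hartree per kcal/mol (1 Eh is about 627.5095 kcal/mol).
EH_PER_KCAL_MOL = 1.0 / 627.5094740631

# Keys the documentation lists as expected; "noise" is optional.
REQUIRED_KEYS = ("structs", "counts", "energy", "unit")


def check_reaction(rxn):
    """Hypothetical helper: check that a reaction dict has the documented keys."""
    missing = [key for key in REQUIRED_KEYS if key not in rxn]
    if missing:
        raise KeyError(f"reaction dict is missing keys: {missing}")
    # Assumption: one count per entry in structs.
    if len(rxn["structs"]) != len(rxn["counts"]):
        raise ValueError("structs and counts must have the same length")
    return True


# Placeholder reaction with an explicit noise value (all numbers illustrative).
rxn_a = {
    "structs": ["mol_a", "atom_b", "atom_c"],
    "counts": [-1, 2, 1],
    "energy": 100.0,          # in the reaction's own unit, here kcal/mol
    "unit": EH_PER_KCAL_MOL,  # Eh per kcal/mol
    "noise": 0.001,           # in Eh
}

# Placeholder reaction where one struct is a (structure code, orbital code)
# tuple. No noise is given, so the model's default_noise would apply.
rxn_b = {
    "structs": ["mol_a", ("mol_a", "orb_x")],
    "counts": [-1, 1],
    "energy": 0.25,
    "unit": 1.0,              # energy already in Eh
}

# Element 0 of each tuple is the mode: 0 (X), 1 (C), or 2 (XC).
rxn_list = [(2, rxn_a), (0, rxn_b)]

for mode, rxn in rxn_list:
    check_reaction(rxn)

# With an existing MOLGP instance named model (not constructed here):
# model.add_reactions(rxn_list)
```

The list only describes the reactions. Every mol_id it refers to still has to be registered through store_mol_covs before add_reactions is called.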

map(mapping_plans)

Map the MOLGP model to an Evaluator object that can efficiently evaluate the XC energy and which can also be serialized.

Returns:

Functions that map each kernel to a MappedDFTKernel object

Return type:

list(callable)

set_control_points(X0T_list, reduce=True)

Transform raw descriptors X and use them as control points.

Parameters:

X – NumPy array of raw features

store_mol_covs(ddir, mol_ids, get_orb_deriv=None, get_correlation=True)

Reads features (and optionally their derivatives with respect to orbital occupations) and computes covariances with the control point sets of all the kernels of the model (or only the exchange components if get_correlation=False). Also stores the reference energy data in the dataset.

Parameters:
  • ddir (str or dict[str]) – If using DescParams (old version), should be a single directory with the feature vectors. If using FeatureSettings (new version), should be a dictionary of directories, with ‘REF’, ‘SL’, ‘NLDF’, ‘NLOF’, ‘SDMX’, and ‘HYB’ being the keys. All are optional except ‘SL’ and ‘REF’. ‘REF’ stands for reference data. It should contain reference energies and raw, spin-polarized density/gradient/kinetic energy data.

  • mol_ids (list[str]) – system ids to search for in ddir

  • get_orb_deriv (bool) – Whether to read the orbital occupation derivatives. If None, reads them iff they are available. If True, reads them and throws an error if they are not available. If False, does not read them.

  • get_correlation (bool) – Whether to compute covariance for the correlation kernels (True) or just the exchange kernels (False).
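
The parameter rules above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: check_ddir and resolve_orb_deriv are hypothetical helpers, the directory paths are placeholders, and the exception types are assumptions.

```python
# Keys documented for the FeatureSettings form of ddir.
KNOWN_DIRS = {"REF", "SL", "NLDF", "NLOF", "SDMX", "HYB"}
REQUIRED_DIRS = {"REF", "SL"}


def check_ddir(ddir):
    """Hypothetical helper: check a ddir dictionary against the documented keys."""
    keys = set(ddir)
    missing = REQUIRED_DIRS - keys
    unknown = keys - KNOWN_DIRS
    if missing:
        raise KeyError(f"ddir is missing required keys: {sorted(missing)}")
    if unknown:
        raise KeyError(f"ddir has unrecognized keys: {sorted(unknown)}")
    return True


def resolve_orb_deriv(get_orb_deriv, available):
    """Hypothetical helper mirroring the documented None/True/False behavior.

    Returns True if the orbital occupation derivatives should be read.
    """
    if get_orb_deriv is None:
        # Read the derivatives if and only if they are available.
        return available
    if get_orb_deriv and not available:
        # Assumption: the real error type may differ.
        raise FileNotFoundError("orbital occupation derivatives not available")
    return bool(get_orb_deriv)


# Placeholder paths; only 'REF' and 'SL' are required.
ddir = {
    "REF": "/path/to/ref",
    "SL": "/path/to/sl",
    "NLDF": "/path/to/nldf",
}
check_ddir(ddir)

# With an existing MOLGP instance named model (not constructed here):
# model.store_mol_covs(ddir, ["mol_a", "atom_b"], get_orb_deriv=None)
```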

class ciderpress.models.train.MOLGP2(kernels, settings, libxc_baseline=None, default_noise=0.03)

Gaussian process model for the exchange-correlation functional or its components.

Same as MOLGP, except kernels should be of type DFTKernel2. This requires a new class because DFTKernel2 has a different approach to evaluating baseline functional contributions, so a few functions in MOLGP need to be modified.

Parameters:
  • kernels (list[DFTKernel2]) – List of kernels that sum to the XC energy.

  • settings (DescParams or FeatureSettings) – Settings for the features. Specifies what the feature vector is.

  • libxc_baseline (str or None) – Additional baseline functional to evaluate using the libxc library.

  • default_noise (float) – Default noise hyperparameter, used if noise is not provided for a particular data point.

map(mapping_plans)

Map the MOLGP model to an Evaluator object that can efficiently evaluate the XC energy and which can also be serialized.

Returns:

Functions that map each kernel to a MappedDFTKernel object

Return type:

list(callable)