sempler.LGANM

The sempler.LGANM class lets you define and sample from linear SCMs with Gaussian additive noise (i.e. Gaussian Bayesian networks).

Additionally, the LGANM class supports sampling “in the population setting”: instead of a finite sample it returns a symbolic Gaussian distribution, sempler.NormalDistribution, which supports operations such as conditioning, marginalization and regression in the population setting.
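
As an illustration of what conditioning and marginalization mean for a Gaussian, here is a minimal NumPy sketch. It does not use sempler; the helper names `marginal` and `conditional` are hypothetical and chosen for this example only. It applies the standard formulas for a multivariate normal with a given mean vector and covariance matrix.

```python
import numpy as np

def marginal(mean, cov, idx):
    """Mean and covariance of the variables in idx (hypothetical helper)."""
    idx = np.atleast_1d(idx)
    return mean[idx], cov[np.ix_(idx, idx)]

def conditional(mean, cov, y, x, x_value):
    """Mean and covariance of variables y given variables x = x_value (hypothetical helper)."""
    y, x = np.atleast_1d(y), np.atleast_1d(x)
    S_yx = cov[np.ix_(y, x)]
    S_xx = cov[np.ix_(x, x)]
    # Population regression coefficients of y on x
    coefs = S_yx @ np.linalg.inv(S_xx)
    cond_mean = mean[y] + coefs @ (np.atleast_1d(x_value) - mean[x])
    cond_cov = cov[np.ix_(y, y)] - coefs @ S_yx.T
    return cond_mean, cond_cov

# A two-variable Gaussian used only for illustration
mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])

marg_mean, marg_cov = marginal(mean, cov, 1)
cond_mean, cond_cov = conditional(mean, cov, y=1, x=0, x_value=2.0)
```

The regression coefficients computed inside `conditional` are the population counterpart of an ordinary least squares fit, which is why conditioning and regression are closely related in this setting.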

The SCM is represented by the connectivity (weights) matrix and the noise term means and variances. The underlying graph is assumed to be acyclic.
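
The following NumPy sketch shows how a weights matrix and the noise means and variances determine both the samples and the population distribution. It is not sempler's implementation. It assumes that `W[i, j]` is the weight of the edge from variable i to variable j (consistent with the upper-triangular example further down, but an assumption nonetheless), and the function names are hypothetical.

```python
import numpy as np

def population_moments(W, means, variances):
    """Mean and covariance of X, assuming X_j = sum_i W[i, j] * X_i + noise_j."""
    W = np.asarray(W, dtype=float)
    p = len(W)
    # With observations as row vectors, X = noise @ (I - W)^{-1}
    A = np.linalg.inv(np.eye(p) - W)
    mean = np.asarray(means, dtype=float) @ A
    cov = A.T @ np.diag(variances) @ A
    return mean, cov

def sample_scm(W, means, variances, n, rng):
    """Draw n observations; each column corresponds to a variable."""
    W = np.asarray(W, dtype=float)
    p = len(W)
    noise = rng.normal(means, np.sqrt(variances), size=(n, p))
    return noise @ np.linalg.inv(np.eye(p) - W)

# Two variables: X0 = noise0, X1 = 2 * X0 + noise1
W_small = np.array([[0.0, 2.0],
                    [0.0, 0.0]])
mean, cov = population_moments(W_small, [1.0, 0.0], [1.0, 1.0])
X = sample_scm(W_small, [1.0, 0.0], [1.0, 1.0], n=1000, rng=np.random.default_rng(0))
```

Here the population mean is (1, 2) and the variance of X1 is 2² · 1 + 1 = 5, which the sample moments of `X` approximate as n grows.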

class sempler.LGANM(W, means, variances, random_state=None)

Represents a linear model with Gaussian additive noise.

Parameters:

W (array_like) – Connectivity (weights) matrix representing a DAG.
means (numpy.ndarray or tuple) – The means of the noise terms, or a tuple representing the lower/upper bounds to sample them from a uniform distribution.
variances (numpy.ndarray or tuple) – The variances of the noise terms, or a tuple representing the lower/upper bounds to sample them from a uniform distribution.

Raises:

ValueError – If the given connectivity does not correspond to a DAG.
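
For intuition about the acyclicity requirement, here is a minimal sketch of one way to test whether a weights matrix describes a DAG. It is not necessarily how sempler performs the check; the function name `is_dag` is hypothetical, and any non-zero entry `W[i, j]` is treated as an edge from i to j.

```python
import numpy as np

def is_dag(W):
    """Return True if the non-zero pattern of W has no directed cycle (hypothetical helper)."""
    A = (np.asarray(W) != 0).astype(int)
    remaining = list(range(len(A)))
    # Kahn-style elimination: repeatedly remove nodes without incoming edges
    while remaining:
        sub = A[np.ix_(remaining, remaining)]
        sources = [node for k, node in enumerate(remaining) if sub[:, k].sum() == 0]
        if not sources:
            # Every remaining node has a parent among the remaining nodes: a cycle exists
            return False
        remaining = [node for node in remaining if node not in sources]
    return True

chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # 0 -> 1 -> 2
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # 0 -> 1 -> 2 -> 0
chain_ok = is_dag(chain)
cycle_ok = is_dag(cycle)
```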

Examples

Constructing a linear Gaussian SCM.

>>> import sempler
>>> import numpy as np

Define the connectivity matrix:

>>> W = np.array([[0, 0, 0, 0.1, 0],
...               [0, 0, 2.1, 0, 0],
...               [0, 0, 0, 3.2, 0],
...               [0, 0, 0, 0, 5.0],
...               [0, 0, 0, 0, 0  ]])

With explicit means and variances:

>>> means = np.array([0,1,2,3,4])
>>> variances = np.array([1,1,1,1,1])
>>> lganm = sempler.LGANM(W, means, variances)

With means and variances sampled uniformly at random between the given lower/upper bounds:

>>> lganm = sempler.LGANM(W, (0,1), (0,1))

A ValueError is raised when the connectivity matrix does not correspond to a DAG:

>>> A = [[0,1,0],[0,0,1],[1,0,0]]
>>> sempler.LGANM(A, (0,0), (1,1))
Traceback (most recent call last):
  ...
ValueError: The given graph is not a DAG.

W

Connectivity (weights) matrix representing a DAG.

Type: array_like

variances

The variances of the noise terms.

Type: numpy.ndarray

means

The means of the noise terms.

Type: numpy.ndarray

p

The number of variables (size) of the SCM.

Type: int

sample(n=100, population=False, do_interventions={}, shift_interventions={}, noise_interventions={}, random_state=None)

Generates n observations from the linear Gaussian SCM under the given do, shift or noise interventions. If none are given, samples from the observational distribution.

Parameters:

n (int, optional) – The size of the sample (i.e. number of observations). Defaults to 100.
population (bool, optional) – If True, the function returns a symbolic normal distribution instead of samples (see sempler.NormalDistribution). Defaults to False.
do_interventions (dict, optional) – A dictionary where keys correspond to the intervened variables, and the values are tuples representing the new mean/variance of the intervened variable, e.g. {1: (1,2)}.
shift_interventions (dict, optional) – A dictionary where keys correspond to the intervened variables, and the values are tuples representing the mean/variance of the noise which is added to the intervened variables, e.g. {1: (1,2)}.
noise_interventions (dict, optional) – A dictionary where keys correspond to the intervened variables, and the values are tuples representing the mean/variance of the new noise, e.g. {1: (1,2)}.
random_state (int, optional) – Sets the random state for reproducibility. Successive calls with the same random state return the same sample.

Returns:

An array containing the sample, where each column corresponds to a variable; or, if population=True, a symbolic normal distribution (see sempler.NormalDistribution).

Return type:

numpy.ndarray or sempler.NormalDistribution

Examples

Sampling the observational environment in the “population setting”:

>>> distribution = lganm.sample(population = True)

Sampling under a shift intervention on variable 1 with standard Gaussian noise:

>>> samples = lganm.sample(100, shift_interventions = {1: (0,1)})

Sampling under a noise intervention on variable 0 and a do intervention on variable 2:

>>> samples = lganm.sample(100,
...                       noise_interventions = {0: (1,2)},
...                       do_interventions = {2 : (3,4)})

Interventions can also be deterministic, i.e. setting a variable/noise term to a fixed value:

>>> samples = lganm.sample(5, do_interventions = {2 : (99,0)})
>>> samples[:,2]
array([99., 99., 99., 99., 99.])
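
The three intervention types described above can be read as modifications of the model parameters. The NumPy sketch below follows the parameter descriptions and is not sempler's implementation: the `intervene` helper is hypothetical, `W[i, j]` is assumed to be the weight of the edge from i to j, shift noise is assumed to be independent of the existing noise, and the order in which different intervention types are applied to the same variable is an arbitrary choice.

```python
import numpy as np

def intervene(W, means, variances, do=None, shift=None, noise=None):
    """Return modified copies of (W, means, variances) (hypothetical helper)."""
    W = np.array(W, dtype=float)
    means = np.array(means, dtype=float)
    variances = np.array(variances, dtype=float)
    # Noise intervention: replace the noise term, keep incoming edges
    for i, (m, v) in (noise or {}).items():
        means[i], variances[i] = m, v
    # Shift intervention: add independent noise, so means and variances add up
    for i, (m, v) in (shift or {}).items():
        means[i] += m
        variances[i] += v
    # Do intervention: cut incoming edges and set the variable's distribution
    for i, (m, v) in (do or {}).items():
        W[:, i] = 0
        means[i], variances[i] = m, v
    return W, means, variances

# X0 = noise0, X1 = 2 * X0 + noise1, with a deterministic do intervention on X1
W_small = np.array([[0.0, 2.0],
                    [0.0, 0.0]])
W_do, means_do, vars_do = intervene(W_small, [0.0, 0.0], [1.0, 1.0], do={1: (99, 0)})

rng = np.random.default_rng(0)
noise_draws = rng.normal(means_do, np.sqrt(vars_do), size=(5, 2))
X_do = noise_draws @ np.linalg.inv(np.eye(2) - W_do)
```

With a variance of zero the intervened variable takes the value 99 in every observation, mirroring the deterministic example above.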