
This module contains functions to generate random graphs, which can then be used together with sempler.ANM or sempler.LGANM to produce random structural causal models (SCMs).

sempler.generators.dag_avg_deg(p, k, w_min=1, w_max=1, return_ordering=False, random_state=None, debug=False)

Generate an Erdős-Rényi graph with p nodes and average degree k, and orient its edges according to a random ordering of the nodes, which yields a DAG. The edge weights are sampled uniformly from [w_min, w_max].

Parameters:

  • p (int) – The number of nodes in the graph.

  • k (float) – The desired average degree.

  • w_min (float, optional) – The lower bound on the sampled weights. Defaults to 1.

  • w_max (float, optional) – The upper bound on the sampled weights. Defaults to 1.

  • return_ordering (bool, optional) – Whether to also return the topological ordering used to orient the edges. Defaults to False.

  • random_state (int, optional) – Sets the random state for reproducibility.

  • debug (bool, optional) – Whether to print debug traces. Defaults to False.


Returns:

  • W (numpy.ndarray) – The connectivity (weights) matrix of the generated DAG.

  • ordering (numpy.ndarray, optional) – If return_ordering = True, a topological ordering of the graph.
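
The procedure described above can be sketched in plain Python. This is a minimal illustration and not sempler's implementation: the function name sketch_dag_avg_deg, the seed argument, and the edge probability k / (p - 1) (the value for which each node of an undirected Erdős-Rényi graph has expected degree k) are assumptions made for this example, and its random draws will not reproduce the outputs of dag_avg_deg.

```python
import random


def sketch_dag_avg_deg(p, k, w_min=1.0, w_max=1.0, seed=None):
    """Hypothetical helper: random DAG with expected average degree k (needs p >= 2)."""
    rng = random.Random(seed)
    # Assumed edge probability: each node has p - 1 possible neighbours,
    # so its expected degree is prob * (p - 1) = k.
    prob = min(1.0, k / (p - 1))
    # Random ordering of the nodes; edges only point from earlier to later nodes.
    ordering = list(range(p))
    rng.shuffle(ordering)
    W = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a + 1, p):
            if rng.random() < prob:
                # W[i][j] holds the weight of the edge i -> j.
                W[ordering[a]][ordering[b]] = rng.uniform(w_min, w_max)
    return W, ordering


W, ordering = sketch_dag_avg_deg(5, 2, w_min=0.5, w_max=2.0, seed=0)
position = {node: i for i, node in enumerate(ordering)}
edges = [(i, j) for i in range(5) for j in range(5) if W[i][j] != 0]
# Every edge points forward in the ordering, so the graph is acyclic.
print(all(position[i] < position[j] for i, j in edges))  # True
```

Orienting every edge from the earlier to the later node of a fixed ordering is what guarantees acyclicity, since a cycle would need at least one edge pointing backwards.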


>>> from sempler.generators import dag_avg_deg
>>> dag_avg_deg(5, 2, random_state = 42)
array([[0., 0., 1., 1., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0.]])

Optionally, the ordering used to orient the edges can also be returned:

>>> dag_avg_deg(5, 2, return_ordering = True, random_state = 42)
(array([[0., 0., 1., 1., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0.]]), array([0, 3, 1, 4, 2]))
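
The returned ordering can be checked against the weights matrix. The sketch below reads W[i][j] != 0 as an edge i -> j, which is the reading consistent with the example output above; is_topological_ordering is a hypothetical helper written for this illustration and is not part of sempler.

```python
def is_topological_ordering(W, ordering):
    """Hypothetical helper: True if every edge i -> j has i before j in ordering."""
    position = {node: idx for idx, node in enumerate(ordering)}
    p = len(W)
    return all(
        position[i] < position[j]
        for i in range(p)
        for j in range(p)
        if W[i][j] != 0
    )


# Weights matrix and ordering copied from the example output above.
W = [
    [0., 0., 1., 1., 0.],
    [0., 0., 1., 0., 0.],
    [0., 0., 0., 0., 0.],
    [0., 0., 1., 0., 1.],
    [0., 0., 0., 0., 0.],
]
print(is_topological_ordering(W, [0, 3, 1, 4, 2]))  # True
print(is_topological_ordering(W, [2, 0, 3, 1, 4]))  # False: node 2 has parents
```

The helper only indexes W as W[i][j], so it should work the same way on the numpy.ndarray returned by the generators.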

sempler.generators.dag_full(p, w_min=1, w_max=1, return_ordering=False, random_state=None)

Create a fully connected DAG, sampling the weights from a uniform distribution.

Parameters:

  • p (int) – The number of nodes in the graph.

  • w_min (float, optional) – The lower bound on the sampled weights. Defaults to 1.

  • w_max (float, optional) – The upper bound on the sampled weights. Defaults to 1.

  • return_ordering (bool, optional) – Whether to also return the topological ordering used to orient the edges. Defaults to False.

  • random_state (int, optional) – Sets the random state for reproducibility.


Returns:

  • W (numpy.ndarray) – The connectivity (weights) matrix of the generated DAG.

  • ordering (numpy.ndarray, optional) – If return_ordering = True, a topological ordering of the graph.
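
A fully connected DAG can be built by drawing a random ordering and adding an edge from each node to every node that follows it. The plain-Python sketch below illustrates this idea under that assumption; sketch_dag_full and its seed argument are hypothetical names, and the code is not sempler's implementation.

```python
import random


def sketch_dag_full(p, w_min=1.0, w_max=1.0, seed=None):
    """Hypothetical helper: fully connected DAG over a random ordering."""
    rng = random.Random(seed)
    ordering = list(range(p))
    rng.shuffle(ordering)
    W = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a + 1, p):
            # Edge from the earlier node to the later node, with a uniform weight.
            W[ordering[a]][ordering[b]] = rng.uniform(w_min, w_max)
    return W, ordering


W, ordering = sketch_dag_full(4, seed=0)
n_edges = sum(1 for row in W for w in row if w != 0)
print(n_edges)  # 6, i.e. p * (p - 1) / 2 edges for p = 4
```

In such a graph the first node of the ordering is a parent of all other nodes and the last node has no children, which matches the pattern of the example outputs for dag_full.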


>>> from sempler.generators import dag_full
>>> dag_full(4, random_state = 42)
array([[0., 0., 1., 1.],
       [1., 0., 1., 1.],
       [0., 0., 0., 0.],
       [0., 0., 1., 0.]])

Optionally, the ordering used to orient the edges can also be returned:

>>> dag_full(4, return_ordering = True, random_state = 42)
(array([[0., 0., 1., 1.],
       [1., 0., 1., 1.],
       [0., 0., 0., 0.],
       [0., 0., 1., 0.]]), array([1, 0, 3, 2]))

sempler.generators.intervention_targets(p, K, size, replace=True, random_state=None)

Sample a set of intervention targets.

Parameters:

  • p (int) – The number of variables, i.e. targets are sampled from the integers 0, ..., p-1.

  • K (int) – The total number of interventions.

  • size (int or tuple) – The size of each intervention, i.e. the number of targets per intervention. If a two-element tuple, the number of targets of each intervention is sampled uniformly at random from [size[0], size[1]], both bounds included.

  • replace (bool, default=True) – Whether the intervention targets should be sampled with replacement, i.e. whether repeated targets are allowed across environments.

  • random_state (int or None) – To set the random state for reproducibility.


Returns:

interventions – The sampled intervention targets.

Return type:

list of list of int


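Raises:

ValueError – If the size of each intervention (i.e. its number of targets) is larger than the number of variables, if the tuple passed as size does not have length 2, or if targets cannot be sampled without replacement for the given intervention size and number of interventions.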
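
The sampling scheme and the error conditions above can be sketched in plain Python. This is an illustration under stated assumptions, not sempler's implementation: sketch_intervention_targets and its seed argument are hypothetical, targets are assumed to be distinct within one intervention, and replace=False is read as "no target appears in more than one intervention". Its random draws will not match those of intervention_targets.

```python
import random


def sketch_intervention_targets(p, K, size, replace=True, seed=None):
    """Hypothetical helper: sample K lists of targets from 0, ..., p-1."""
    rng = random.Random(seed)
    # Normalize size to an inclusive (lo, hi) range.
    if isinstance(size, int):
        lo, hi = size, size
    elif isinstance(size, tuple) and len(size) == 2:
        lo, hi = size
    else:
        raise ValueError("The intervention size must be a positive integer or two-element tuple.")
    if hi > p:
        raise ValueError("The (max.) intervention size cannot be larger than the number of variables.")
    if not replace and hi * K > p:
        raise ValueError("Cannot sample targets without replacement for the given intervention size and number of interventions.")
    available = list(range(p))
    interventions = []
    for _ in range(K):
        n_targets = rng.randint(lo, hi)  # both bounds included
        targets = rng.sample(available, n_targets)
        interventions.append(targets)
        if not replace:
            # Used targets are no longer available to later interventions.
            available = [v for v in available if v not in targets]
    return interventions


targets = sketch_intervention_targets(10, 5, (1, 2), replace=False, seed=0)
flat = [v for t in targets for v in t]
print(len(flat) == len(set(flat)))  # True: no target is repeated
```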


Generating a set of single-variable interventions:

>>> from sempler.generators import intervention_targets
>>> intervention_targets(10, 5, 1, random_state=42)
[[0], [7], [6], [4], [4]]

Without replacement:

>>> intervention_targets(10, 5, 1, replace=False, random_state=42)
[[0], [7], [6], [4], [3]]

Generating a set of interventions with random number of targets:

>>> intervention_targets(10, 5, (1,3), random_state=42)
[[8], [2, 6, 0], [8, 7], [6, 7], [8, 1]]

Without replacement:

>>> intervention_targets(10, 5, (1,2), replace=False, random_state=42)
[[8], [6, 0], [1, 4], [7], [9]]

An exception is raised if size > p:

>>> intervention_targets(4, 5, 5)
Traceback (most recent call last):
ValueError: The (max.) intervention size cannot be larger than the number of variables.

Or if size is a tuple whose length is not two:

>>> intervention_targets(4, 5, (0,1,2))
Traceback (most recent call last):
ValueError: The intervention size must be a positive integer or two-element tuple.

When sampling targets without replacement, the maximum intervention size and the number of interventions must satisfy max_size * K <= p, since no target can be used twice. Otherwise an exception is raised:

>>> intervention_targets(10, 5, (0,3), replace=False)
Traceback (most recent call last):
ValueError: Cannot sample targets without replacement for the given intervention size and number of interventions.
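
The condition max_size * K <= p can be checked before calling the function. The helper below is a hypothetical pre-check written for this illustration (it is not part of sempler), applied to the argument combinations used in the examples above.

```python
def fits_without_replacement(p, K, size):
    """Hypothetical pre-check for the condition max_size * K <= p."""
    max_size = size if isinstance(size, int) else max(size)
    return max_size * K <= p


print(fits_without_replacement(10, 5, (0, 3)))  # False: 3 * 5 = 15 > 10
print(fits_without_replacement(10, 5, (1, 2)))  # True: 2 * 5 = 10 <= 10
print(fits_without_replacement(10, 5, 1))       # True: 1 * 5 = 5 <= 10
```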