TreeCycleDataset

class dgl.data.TreeCycleDataset(tree_height=8, num_motifs=60, cycle_size=6, perturb_ratio=0.01, seed=None, raw_dir=None, force_reload=False, verbose=True, transform=None)

Bases: dgl.data.dgl_dataset.DGLBuiltinDataset

TREE-CYCLES dataset from the paper "GNNExplainer: Generating Explanations for Graph Neural Networks".

This is a synthetic dataset for node classification. It is generated by performing the following steps in order.

  • Construct a balanced binary tree as the base graph.

  • Construct a set of cycle motifs.

  • Attach the motifs to randomly selected nodes of the base graph.

  • Perturb the graph by adding random edges.

  • Assign every node the same constant feature, 1.

  • Label nodes in the tree as class 0 and nodes in the cycle motifs as class 1.
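The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not DGL's implementation: the function name build_tree_cycles is hypothetical, the tree is numbered in heap order, each cycle is attached through a single edge from its first node, attachment points are drawn with replacement, and the number of perturbation edges is rounded down.

```python
import random


def build_tree_cycles(tree_height=3, num_motifs=4, cycle_size=6,
                      perturb_ratio=0.1, seed=0):
    """Toy TREE-CYCLES generator (hypothetical helper, not the DGL code).

    Returns (num_nodes, edges, labels, feats), where edges is a sorted list
    of undirected (u, v) pairs with u < v.
    """
    rng = random.Random(seed)
    edges = set()

    # Step 1: balanced binary tree in heap order; the parent of node i is (i - 1) // 2.
    num_tree_nodes = 2 ** (tree_height + 1) - 1
    for child in range(1, num_tree_nodes):
        edges.add(((child - 1) // 2, child))
    labels = [0] * num_tree_nodes  # tree nodes belong to class 0

    # Steps 2-3: build each cycle motif and attach it to a random tree node.
    next_id = num_tree_nodes
    for _ in range(num_motifs):
        cycle = list(range(next_id, next_id + cycle_size))
        for k in range(cycle_size):
            u, v = cycle[k], cycle[(k + 1) % cycle_size]
            edges.add((min(u, v), max(u, v)))
        anchor = rng.randrange(num_tree_nodes)  # assumption: drawn with replacement
        edges.add((anchor, cycle[0]))
        labels.extend([1] * cycle_size)  # cycle nodes belong to class 1
        next_id += cycle_size
    num_nodes = next_id

    # Step 4: perturb by adding random edges that are not already present.
    target = len(edges) + int(perturb_ratio * len(edges))  # assumption: round down
    while len(edges) < target:
        u, v = rng.sample(range(num_nodes), 2)
        edges.add((min(u, v), max(u, v)))

    # Step 5: every node gets the same constant feature.
    feats = [1.0] * num_nodes
    return num_nodes, sorted(edges), labels, feats


num_nodes, edges, labels, feats = build_tree_cycles()
print(num_nodes, len(edges))               # 39 46
print(labels.count(0), labels.count(1))    # 15 24
```

With these toy settings the tree contributes 15 nodes and the four 6-node cycles contribute 24, so the class split follows directly from the parameters; only the attachment points and the perturbation edges depend on the seed.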

Parameters
  • tree_height (int, optional) – Height of the balanced binary tree. Default: 8

  • num_motifs (int, optional) – Number of cycle motifs to use. Default: 60

  • cycle_size (int, optional) – Number of nodes in a cycle motif. Default: 6

  • perturb_ratio (float, optional) – Number of random edges to add in perturbation divided by the number of original edges in the graph. Default: 0.01

  • seed (integer, random_state, or None, optional) – Indicator of random number generation state. Default: None

  • raw_dir (str, optional) – Raw file directory to store the processed data. Default: ~/.dgl/

  • force_reload (bool, optional) – Whether to always generate the data from scratch rather than load a cached version. Default: False

  • verbose (bool, optional) – Whether to print progress information. Default: True

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access. Default: None
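As a rough sizing aid, the arithmetic implied by these parameters can be written out. The sketch assumes that tree_height counts edges from the root to a leaf, so the tree has 2^(tree_height + 1) - 1 nodes, and that the perturbation count is rounded down; both helper names are hypothetical and neither assumption is confirmed by this page.

```python
def tree_cycle_sizes(tree_height=8, num_motifs=60, cycle_size=6):
    """Return (tree_nodes, motif_nodes, total_nodes) under the stated assumptions."""
    tree_nodes = 2 ** (tree_height + 1) - 1   # balanced binary tree
    motif_nodes = num_motifs * cycle_size     # one cycle_size-node cycle per motif
    return tree_nodes, motif_nodes, tree_nodes + motif_nodes


def num_perturb_edges(num_original_edges, perturb_ratio=0.01):
    """Random edges to add, assuming the product is rounded down."""
    return int(num_original_edges * perturb_ratio)


print(tree_cycle_sizes())            # (511, 360, 871)
print(num_perturb_edges(200, 0.5))   # 100
```

Under these assumptions the default arguments give 511 class-0 nodes and 360 class-1 nodes.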

num_classes

Number of node classes.

Type: int

Examples

>>> from dgl.data import TreeCycleDataset
>>> dataset = TreeCycleDataset()
>>> dataset.num_classes
2
>>> g = dataset[0]
>>> label = g.ndata['label']
>>> feat = g.ndata['feat']

__getitem__(idx)

Gets the graph (a DGLGraph) at index idx, with transform applied if one was given.

__len__()

Returns the number of graphs in the dataset.
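The interaction between transform, __getitem__ and __len__ can be illustrated with a small stand-in class. This is a hypothetical sketch, not the DGL class: SingleGraphDataset and mark_transformed are made-up names, a plain dict stands in for a DGLGraph, and holding exactly one graph is an assumption suggested by the dataset[0] access in the example above.

```python
class SingleGraphDataset:
    """Hypothetical stand-in showing transform-on-access semantics."""

    def __init__(self, graph, transform=None):
        self._graph = graph
        self._transform = transform

    def __getitem__(self, idx):
        # Assumption: the dataset holds a single graph, so only index 0 is valid.
        if idx != 0:
            raise IndexError("this stand-in holds exactly one graph")
        if self._transform is None:
            return self._graph
        # The transform runs on every access; its result is not cached.
        return self._transform(self._graph)

    def __len__(self):
        return 1


calls = []


def mark_transformed(graph):
    """Toy transform: record the call and return a modified copy."""
    calls.append(1)
    out = dict(graph)
    out["transformed"] = True
    return out


dataset = SingleGraphDataset({"num_nodes": 871}, transform=mark_transformed)
g1 = dataset[0]
g2 = dataset[0]
print(len(dataset), len(calls))   # 1 2
```

Because the transform is reapplied on each access in this sketch, an expensive transform is better applied once and the result kept in a variable.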