BACommunityDataset

class dgl.data.BACommunityDataset(num_base_nodes=300, num_base_edges_per_node=4, num_motifs=80, perturb_ratio=0.01, num_inter_edges=350, seed=None, raw_dir=None, force_reload=False, verbose=True, transform=None)

Bases: dgl.data.dgl_dataset.DGLBuiltinDataset

BA-COMMUNITY dataset from the paper "GNNExplainer: Generating Explanations for Graph Neural Networks".

This is a synthetic dataset for node classification. It is generated by performing the following steps in order.

  • Construct a base Barabási–Albert (BA) graph.

  • Construct a set of five-node house-structured network motifs.

  • Attach the motifs to randomly selected nodes of the base graph.

  • Perturb the graph by adding random edges.

  • Assign the nodes to 4 classes. Nodes with label 0 belong to the base BA graph. Nodes with labels 1, 2, and 3 are at the middle, bottom, and top of the houses, respectively.

  • Generate normally distributed node features of length 10.

  • Repeat the above steps to generate a second graph. Its nodes are assigned to classes 4, 5, 6, and 7, and its node features are drawn from a different normal distribution.

  • Join the two graphs by randomly adding edges between them.
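The steps above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not DGL's implementation: the helper names (`ba_edges`, `add_random_edges`, `build_community`, `build_ba_community`), the house node ordering, the single attachment edge per motif, the rounding of the perturbation count, and the feature means 0.0 and 1.0 are all hypothetical choices made for the example.

```python
import random

# House motif: a square (nodes 0-3) with a roof node 4 on top of nodes 0 and 1.
# The node ordering is an assumption made for this sketch.
HOUSE_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 0), (4, 1)]
HOUSE_ROLES = [1, 1, 2, 2, 3]  # middle, middle, bottom, bottom, top


def ba_edges(n, m, rng):
    """Edges of a Barabási–Albert graph grown by preferential attachment."""
    edges, targets, pool = set(), list(range(m)), []
    for new in range(m, n):
        edges.update((t, new) for t in targets)
        # Each endpoint enters the pool once per edge, so sampling from
        # the pool picks nodes with probability proportional to degree.
        pool.extend(targets)
        pool.extend([new] * m)
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        targets = sorted(chosen)
    return edges


def add_random_edges(edges, pool_u, pool_v, k, rng):
    """Add k new undirected edges with one endpoint drawn from each pool."""
    added = 0
    while added < k:
        u, v = rng.choice(pool_u), rng.choice(pool_v)
        e = (min(u, v), max(u, v))
        if u != v and e not in edges:
            edges.add(e)
            added += 1


def build_community(offset, label_offset, feat_mean, rng, num_base_nodes=300,
                    num_base_edges_per_node=4, num_motifs=80, perturb_ratio=0.01):
    """One base BA graph with attached houses, perturbation, labels, features."""
    base = ba_edges(num_base_nodes, num_base_edges_per_node, rng)
    edges = {(u + offset, v + offset) for u, v in base}
    labels = [label_offset] * num_base_nodes
    next_id = offset + num_base_nodes
    for _ in range(num_motifs):
        for u, v in HOUSE_EDGES:
            edges.add(tuple(sorted((next_id + u, next_id + v))))
        # Assumption: one edge links the motif to a random base node.
        edges.add((offset + rng.randrange(num_base_nodes), next_id))
        labels.extend(label_offset + role for role in HOUSE_ROLES)
        next_id += len(HOUSE_ROLES)
    nodes = list(range(offset, next_id))
    # Assumption: the perturbation count is rounded down.
    add_random_edges(edges, nodes, nodes, int(perturb_ratio * len(edges)), rng)
    feats = [[rng.gauss(feat_mean, 1.0) for _ in range(10)] for _ in nodes]
    return edges, labels, feats


def build_ba_community(seed=None, num_inter_edges=350):
    """Two communities joined by random inter-community edges."""
    rng = random.Random(seed)
    e1, l1, f1 = build_community(0, 0, 0.0, rng)
    e2, l2, f2 = build_community(len(l1), 4, 1.0, rng)
    edges = e1 | e2
    first = list(range(len(l1)))
    second = list(range(len(l1), len(l1) + len(l2)))
    add_random_edges(edges, first, second, num_inter_edges, rng)
    return edges, l1 + l2, f1 + f2


edges, labels, feats = build_ba_community(seed=0)
print(len(labels), len(edges), sorted(set(labels)))
```

Edges are stored as sorted pairs, so each undirected edge appears once. The edge count of a graph built by DGL may differ from this sketch.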

Parameters
  • num_base_nodes (int, optional) – Number of nodes in each base BA graph. Default: 300

  • num_base_edges_per_node (int, optional) – Number of edges to attach from a new node to existing nodes in constructing a base BA graph. Default: 4

  • num_motifs (int, optional) – Number of house-structured network motifs to use in constructing each graph. Default: 80

  • perturb_ratio (float, optional) – Ratio of the number of random edges added to a graph during perturbation to the number of original edges in it. Default: 0.01

  • num_inter_edges (int, optional) – Number of random edges to add between the two graphs. Default: 350

  • seed (integer, random_state, or None, optional) – Indicator of random number generation state. Default: None

  • raw_dir (str, optional) – Raw file directory to store the processed data. Default: ~/.dgl/

  • force_reload (bool, optional) – Whether to always generate the data from scratch rather than load a cached version. Default: False

  • verbose (bool, optional) – Whether to print progress information. Default: True

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access. Default: None
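As a rough guide to how the size parameters interact, the following sketch computes the expected node count and the perturbation edge count. It assumes that each motif contributes five new nodes and that the perturbation count is rounded down. The function names are hypothetical and are not part of DGL.

```python
def expected_num_nodes(num_base_nodes=300, num_motifs=80, motif_size=5):
    """Node count of the joined graph under the stated assumptions."""
    # Two communities, each a base graph plus num_motifs five-node houses.
    return 2 * (num_base_nodes + motif_size * num_motifs)


def num_perturb_edges(num_original_edges, perturb_ratio=0.01):
    """Random edges added in perturbation, assuming the product is rounded down."""
    return int(perturb_ratio * num_original_edges)


print(expected_num_nodes())
print(num_perturb_edges(1000))
```

With the default parameters this gives 2 * (300 + 5 * 80) = 1400 nodes.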

num_classes

Number of node classes

Type

int

Examples

>>> from dgl.data import BACommunityDataset
>>> dataset = BACommunityDataset()
>>> dataset.num_classes
8
>>> g = dataset[0]
>>> label = g.ndata['label']
>>> feat = g.ndata['feat']

__getitem__(idx)

Gets the data object at the given index.

__len__()

The number of examples in the dataset.
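The access pattern of these two methods can be illustrated with a small stand-in class. This is a sketch, not DGL's code: it assumes the dataset holds a single graph (as the dataset[0] access in the example suggests) and that the transform is applied on every access, as the transform parameter describes. The class name `SingleGraphDataset` is hypothetical, and a dict stands in for a DGLGraph.

```python
class SingleGraphDataset:
    """Hypothetical stand-in for a dataset that stores exactly one graph."""

    def __init__(self, graph, transform=None):
        self._graph = graph
        self._transform = transform

    def __getitem__(self, idx):
        # Only index 0 is valid because there is a single graph.
        if idx != 0:
            raise IndexError("this dataset has only one graph")
        # The transform, if any, is applied on every access, not cached.
        if self._transform is None:
            return self._graph
        return self._transform(self._graph)

    def __len__(self):
        return 1


# A dict stands in for a DGLGraph here.
dataset = SingleGraphDataset({"num_nodes": 1400},
                             transform=lambda g: {**g, "seen": True})
g = dataset[0]
print(len(dataset), g)
```

Because the transform runs on each access, an expensive transform is paid for every time the graph is indexed. Storing the result of dataset[0] in a variable avoids repeating it.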