GINDataset

class dgl.data.GINDataset(name, self_loop, degree_as_nlabel=False, raw_dir=None, force_reload=False, verbose=False, transform=None)

Bases: dgl.data.dgl_dataset.DGLBuiltinDataset

Dataset class for the paper "How Powerful Are Graph Neural Networks?"

This is adapted from https://github.com/weihua916/powerful-gnns/blob/master/dataset.zip.

The class provides an interface for nine datasets used in the paper along with the paper-specific settings. The datasets are 'MUTAG', 'COLLAB', 'IMDBBINARY', 'IMDBMULTI', 'NCI1', 'PROTEINS', 'PTC', 'REDDITBINARY', 'REDDITMULTI5K'.

If degree_as_nlabel is False, ndata['label'] stores the node labels provided with the dataset; otherwise it stores the node in-degrees.

For graphs that have node attributes, ndata['attr'] stores those attributes. For graphs that have no node attributes, ndata['attr'] stores the one-hot encoding of ndata['label'].
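
The relationship between ndata['label'] and ndata['attr'] when degree_as_nlabel is True can be illustrated with a small pure-Python sketch. It does not use DGL, and the helper names (in_degrees, one_hot) are hypothetical. It also assumes the one-hot vocabulary is built from the labels of a single toy graph; the library may build it differently, for example over the whole dataset.

```python
# Pure-Python sketch; the function names are illustrative, not part of DGL.

def in_degrees(num_nodes, edges):
    """Count incoming edges per node for a list of (src, dst) pairs."""
    deg = [0] * num_nodes
    for _src, dst in edges:
        deg[dst] += 1
    return deg


def one_hot(labels, label_values):
    """One-hot encode labels against a sorted vocabulary of label values."""
    index = {value: i for i, value in enumerate(sorted(label_values))}
    return [
        [1.0 if index[label] == j else 0.0 for j in range(len(index))]
        for label in labels
    ]


# Toy directed graph: edges 0<->1 and 1<->2 stored in both directions, plus 3 -> 1.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 1)]

labels = in_degrees(4, edges)        # plays the role of ndata['label']
attr = one_hot(labels, set(labels))  # plays the role of ndata['attr']
```

Here each row of attr has exactly one nonzero entry, at the position of that node's in-degree within the sorted set of observed degrees.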

Parameters
  • name (str) – dataset name, one of ('MUTAG', 'COLLAB', 'IMDBBINARY', 'IMDBMULTI', 'NCI1', 'PROTEINS', 'PTC', 'REDDITBINARY', 'REDDITMULTI5K')

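  • self_loop (bool) – If True, add a self-loop (an edge from a node to itself) to every node.

  • degree_as_nlabel (bool) – If True, use the node in-degree as the node label and its one-hot encoding as the node feature. Default: False.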

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
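
The "transformed before every access" behaviour of transform can be sketched with a toy stand-in. ToyDataset and add_self_loops below are hypothetical and only mimic the described semantics; graphs are represented as (num_nodes, edge_list) tuples instead of DGLGraph objects.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset that accepts a transform."""

    def __init__(self, samples, transform=None):
        self._samples = samples      # list of (graph, label) pairs
        self._transform = transform

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, idx):
        graph, label = self._samples[idx]
        if self._transform is not None:
            # Applied on each access; the stored graph is left untouched.
            graph = self._transform(graph)
        return graph, label


def add_self_loops(graph):
    """Return a new graph with one (v, v) edge appended per node."""
    num_nodes, edges = graph
    return num_nodes, edges + [(v, v) for v in range(num_nodes)]


raw = [((3, [(0, 1), (1, 2)]), 0)]
data = ToyDataset(raw, transform=add_self_loops)
graph, label = data[0]
```

Because the transform runs at access time, the stored samples stay in their original form and only the returned graph carries the extra edges.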

num_classes

Number of classes for multiclass classification

Type

int

Examples

>>> import dgl
>>> import torch
>>> from dgl.data import GINDataset
>>> data = GINDataset(name='MUTAG', self_loop=False)

The dataset instance is an iterable

>>> len(data)
188
>>> g, label = data[128]
>>> g
Graph(num_nodes=13, num_edges=26,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'attr': Scheme(shape=(7,), dtype=torch.float32)}
      edata_schemes={})
>>> label
tensor(1)

Batch the graphs and labels for mini-batch training

>>> graphs, labels = zip(*[data[i] for i in range(16)])
>>> batched_graphs = dgl.batch(graphs)
>>> batched_labels = torch.tensor(labels)
>>> batched_graphs
Graph(num_nodes=330, num_edges=748,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'attr': Scheme(shape=(7,), dtype=torch.float32)}
      edata_schemes={})
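
Batching merges several graphs into one graph made of disjoint components, which is why the batched graph above reports the summed node and edge counts. The following pure-Python sketch illustrates the idea on plain edge lists; batch_edge_lists is a hypothetical helper, not a DGL function, and it ignores node and edge features.

```python
def batch_edge_lists(graphs):
    """Merge (num_nodes, edges) graphs into one disjoint-union graph."""
    total_nodes = 0
    merged_edges = []
    for num_nodes, edges in graphs:
        # Shift node IDs so that each graph occupies its own ID range.
        merged_edges.extend((u + total_nodes, v + total_nodes) for u, v in edges)
        total_nodes += num_nodes
    return total_nodes, merged_edges


g1 = (2, [(0, 1), (1, 0)])
g2 = (3, [(0, 1), (1, 2)])
num_nodes, edges = batch_edge_lists([g1, g2])
```

No edge connects nodes from different input graphs, so message passing on the merged graph never mixes information between samples.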

__getitem__(idx)

Get the idx-th sample.

Parameters

idx (int) – The sample index.

Returns

The graph and its label.

Return type

(dgl.DGLGraph, Tensor)

__len__()

Return the number of graphs in the dataset.