QM7bDataset

class dgl.data.QM7bDataset(raw_dir=None, force_reload=False, verbose=False, transform=None)

Bases: dgl.data.dgl_dataset.DGLDataset

QM7b dataset for graph property prediction (regression)

This dataset consists of 7,211 molecules with 14 regression targets. Nodes represent atoms and edges represent bonds. The edge feature ‘h’ holds the corresponding entry of the Coulomb matrix.

Reference: http://quantum-machine.org/datasets/

Statistics:

  • Number of graphs: 7,211

  • Number of regression targets: 14

  • Average number of nodes: 15

  • Average number of edges: 245

  • Edge feature size: 1
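To make the edge feature concrete, here is a minimal pure-Python sketch of how a Coulomb matrix can be computed, assuming the commonly used definition (diagonal 0.5 * Z_i^2.4, off-diagonal Z_i * Z_j / |R_i - R_j|, in atomic units). The helper names `coulomb_matrix` and `nonzero_edges` are hypothetical and not part of DGL. That each nonzero entry becomes one edge with a scalar feature is also an assumption; it would be consistent with the edge count above being on the order of the squared node count.

```python
import math


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix from nuclear charges and 3D coordinates.

    Assumed definition (not taken from the DGL docs):
      C[i][i] = 0.5 * Z_i ** 2.4
      C[i][j] = Z_i * Z_j / |R_i - R_j|   for i != j
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


def nonzero_edges(matrix):
    """List (src, dst, value) for every nonzero entry (hypothetical helper)."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0.0
    ]


# Toy two-atom example with unit charges placed 2 length units apart.
toy_matrix = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)])
toy_edges = nonzero_edges(toy_matrix)
```

In this sketch every atom pair (including each atom with itself) yields a nonzero entry, so a molecule with n atoms produces n * n candidate edges, each carrying a feature of size 1.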

Parameters
  • raw_dir (str) – Directory where the input data is downloaded to, or which already contains it. Default: ~/.dgl/

  • force_reload (bool) – Whether to reload the dataset. Default: False

  • verbose (bool) – Whether to print out progress information. Default: False

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
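As an illustration of the `transform` parameter, the sketch below defines a hypothetical transform, `scale_coulomb_entries`, that rescales the edge feature 'h'. Because the transform is applied on every access, the sketch works on `g.local_var()` rather than the stored graph, so that repeated accesses do not compound the scaling; whether this precaution is required depends on how the dataset caches graphs, which is an assumption here. The `build_dataset` helper and the scaling factor are likewise illustrative only.

```python
def scale_coulomb_entries(g, factor=0.01):
    """Hypothetical transform: return a graph with edge feature 'h' rescaled.

    g.local_var() gives a graph whose feature updates do not touch the
    original, so calling this on every access stays idempotent.
    """
    g = g.local_var()
    g.edata['h'] = g.edata['h'] * factor
    return g


def build_dataset(raw_dir=None):
    """Construct the dataset with the transform attached.

    The import is done lazily so that this module can be loaded without DGL
    installed; calling the function may download the raw data.
    """
    from dgl.data import QM7bDataset
    return QM7bDataset(raw_dir=raw_dir, transform=scale_coulomb_entries)
```

With this in place, `g, label = build_dataset()[0]` would return a graph whose `g.edata['h']` has already been rescaled, while the label is passed through unchanged.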

num_labels

Number of labels for each graph, i.e., the number of prediction tasks.

Type

int

Raises

UserWarning – If the raw data on the remote server has been changed by the author.

Examples

>>> from dgl.data import QM7bDataset
>>> data = QM7bDataset()
>>> data.num_labels
14
>>> # iterate over the dataset
>>> for g, label in data:
...     edge_feat = g.edata['h']  # edge feature: Coulomb matrix entry
...     # your code here...
...

__getitem__(idx)

Get the graph and its label by index.

Parameters

idx (int) – Item index

Returns

The graph at the given index and its label.

Return type

(dgl.DGLGraph, Tensor)

__len__()

Number of graphs in the dataset.

Returns

The number of graphs in the dataset.

Return type

int
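
Since the dataset exposes `__len__` and `__getitem__`, it can be partitioned by index. The following is a minimal sketch of a reproducible train/validation split; the helper `split_indices`, the 80/20 ratio, and the seed are illustrative assumptions rather than anything prescribed by DGL.

```python
import random


def split_indices(num_items, train_frac=0.8, seed=0):
    """Shuffle range(num_items) deterministically and cut it into two parts."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    cut = int(num_items * train_frac)
    return indices[:cut], indices[cut:]


# Usage with the dataset (requires DGL and the downloaded data):
#   data = QM7bDataset()
#   train_idx, val_idx = split_indices(len(data))
#   g, label = data[train_idx[0]]   # uses __getitem__
train_idx, val_idx = split_indices(7211)
```

Keeping the split as plain index lists leaves the dataset object untouched and makes the partition easy to save and reuse across runs.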