QM7bDataset

class dgl.data.QM7bDataset(raw_dir=None, force_reload=False, verbose=False, transform=None)

Bases: dgl.data.dgl_dataset.DGLDataset

QM7b dataset for graph property prediction (regression)

This dataset consists of 7,211 molecules with 14 regression targets. Nodes represent atoms and edges represent bonds. The edge data 'h' holds the corresponding entry of the Coulomb matrix.

Reference: http://quantum-machine.org/datasets/

Statistics:

  • Number of graphs: 7,211

  • Number of regression targets: 14

  • Average number of nodes: 15

  • Average number of edges: 245

  • Edge feature size: 1
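To make the edge feature concrete, the sketch below computes a Coulomb matrix under its conventional definition (diagonal entries 0.5 * Z_i^2.4, off-diagonal entries Z_i * Z_j / distance) and turns it into one scalar feature per edge. The helper names are hypothetical, and the rule that every nonzero entry, including the diagonal, becomes an edge is an assumption rather than something stated on this page; it is at least consistent with the averages above (about 15 nodes and 245 edges per graph).

```python
import math


def coulomb_matrix(charges, coords):
    """Coulomb matrix under the conventional definition (atomic units assumed)."""
    n = len(charges)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: 0.5 * Z_i ** 2.4
                mat[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal: Z_i * Z_j / |R_i - R_j|
                mat[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return mat


def matrix_to_edges(mat):
    """Hypothetical helper: one (src, dst, feature) triple per nonzero entry."""
    return [
        (i, j, value)
        for i, row in enumerate(mat)
        for j, value in enumerate(row)
        if value != 0.0
    ]


# Toy two-atom input (unit charges, one unit apart), not real QM7b data.
mat = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
edges = matrix_to_edges(mat)
```

Under this assumption a molecule with n atoms yields n * n edges, each carrying a feature of size 1.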

Parameters
  • raw_dir (str) – Directory where the raw input data is downloaded to and stored. Default: ~/.dgl/

  • force_reload (bool) – Whether to reload the dataset. Default: False

  • verbose (bool) – Whether to print out progress information. Default: False.

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
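Because the transform runs on every access, a transform that returns a new object gives the same result each time an item is read. The following is a minimal sketch of that behavior; ToyDataset and scale_edge_feat are hypothetical stand-ins (plain dicts replace DGLGraph objects) and do not reproduce DGL's implementation.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset that applies a transform on access."""

    def __init__(self, items, transform=None):
        self._items = items          # list of (graph, label) pairs
        self._transform = transform

    def __len__(self):
        return len(self._items)

    def __getitem__(self, idx):
        graph, label = self._items[idx]
        if self._transform is not None:
            graph = self._transform(graph)  # applied on every access
        return graph, label


def scale_edge_feat(graph, factor=0.5):
    """Return a new 'graph' with scaled edge features; the input is untouched."""
    return {"h": [factor * value for value in graph["h"]]}


# Plain dicts stand in for DGLGraph objects here.
toy = ToyDataset([({"h": [2.0, 4.0]}, 1.0)], transform=scale_edge_feat)
first, _ = toy[0]
second, _ = toy[0]
```

With the real class, the callable would be passed as QM7bDataset(transform=...) and would receive a DGLGraph instead of a dict.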

num_tasks

Number of prediction tasks

Type

int

num_labels

(DEPRECATED, use num_tasks instead) Number of prediction tasks

Type

int

Raises

UserWarning – If the raw data is changed in the remote server by the author.
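A UserWarning does not stop execution by default. The sketch below shows two ways to handle it with the standard warnings module: recording the warning for inspection, or escalating it to an exception. The function load_with_changed_data and its message text are hypothetical and only mimic a loader that warns; they are not part of DGL.

```python
import warnings


def load_with_changed_data():
    """Hypothetical loader that mimics a warning about changed raw data."""
    warnings.warn("raw data changed on the remote server", UserWarning)
    return "dataset"


# Option 1: record the warning and inspect it afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = load_with_changed_data()
messages = [str(w.message) for w in caught if issubclass(w.category, UserWarning)]

# Option 2: escalate the warning to an exception inside this block only.
escalated = False
with warnings.catch_warnings():
    warnings.simplefilter("error", UserWarning)
    try:
        load_with_changed_data()
    except UserWarning:
        escalated = True
```

The same context managers could wrap a QM7bDataset() call in place of the stand-in function.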

Examples

>>> from dgl.data import QM7bDataset
>>> data = QM7bDataset()
>>> data.num_tasks
14
>>>
>>> # iterate over the dataset
>>> for g, label in data:
...     edge_feat = g.edata['h']  # get edge feature
...     # your code here...
...
__getitem__(idx)

Get graph and label by index

Parameters

idx (int) – Item index

Returns

The graph at the given index and its label

Return type

(dgl.DGLGraph, Tensor)

__len__()

Number of graphs in the dataset.

Returns

Return type

int
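Since the dataset supports both len() and integer indexing, a train/test split can be built from shuffled indices. This is a minimal sketch using only the standard library; split_indices is a hypothetical helper, and a list of tuples stands in for the dataset so that the example runs without downloading anything.

```python
import random


def split_indices(num_items, train_fraction=0.8, seed=0):
    """Hypothetical helper: shuffle indices reproducibly and cut them in two."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    cut = int(train_fraction * num_items)
    return indices[:cut], indices[cut:]


# A list of (graph, label) pairs stands in for the dataset.
toy_data = [("graph%d" % i, float(i)) for i in range(10)]

train_idx, test_idx = split_indices(len(toy_data))
train = [toy_data[i] for i in train_idx]  # relies only on __getitem__
test = [toy_data[i] for i in test_idx]
```

The helper only needs len(data) and data[i], so the same calls should apply to a QM7bDataset instance.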