QM9EdgeDataset
class dgl.data.QM9EdgeDataset(label_keys=None, raw_dir=None, force_reload=False, verbose=True, transform=None)
Bases: dgl.data.dgl_dataset.DGLDataset
QM9Edge dataset for graph property prediction (regression)
This dataset consists of 130,831 molecules with 19 regression targets. Nodes correspond to atoms and edges correspond to bonds.
This dataset differs from QM9Dataset in the following aspects:
- It includes the bonds of a molecule as the edges of the corresponding graph, whereas the edges in QM9Dataset are purely distance-based.
- It provides edge features and node features in addition to the atoms' coordinates and atomic numbers.
- It provides another 7 regression tasks (from 12 to 19).
This class is built on a preprocessed version of the dataset, and we provide the preprocessing details here.
Reference:
Statistics:
Number of graphs: 130,831.
Number of regression targets: 19.
Node attributes:
pos: the 3D coordinates of each atom.
attr: the 11D atom features.
Edge attributes:
edge_attr: the 4D bond features.
Regression targets:

| Keys | Property | Description | Unit |
|---|---|---|---|
| mu | \(\mu\) | Dipole moment | \(\textrm{D}\) |
| alpha | \(\alpha\) | Isotropic polarizability | \({a_0}^3\) |
| homo | \(\epsilon_{\textrm{HOMO}}\) | Highest occupied molecular orbital energy | \(\textrm{eV}\) |
| lumo | \(\epsilon_{\textrm{LUMO}}\) | Lowest unoccupied molecular orbital energy | \(\textrm{eV}\) |
| gap | \(\Delta \epsilon\) | Gap between \(\epsilon_{\textrm{HOMO}}\) and \(\epsilon_{\textrm{LUMO}}\) | \(\textrm{eV}\) |
| r2 | \(\langle R^2 \rangle\) | Electronic spatial extent | \({a_0}^2\) |
| zpve | \(\textrm{ZPVE}\) | Zero point vibrational energy | \(\textrm{eV}\) |
| U0 | \(U_0\) | Internal energy at 0K | \(\textrm{eV}\) |
| U | \(U\) | Internal energy at 298.15K | \(\textrm{eV}\) |
| H | \(H\) | Enthalpy at 298.15K | \(\textrm{eV}\) |
| G | \(G\) | Free energy at 298.15K | \(\textrm{eV}\) |
| Cv | \(c_{\textrm{v}}\) | Heat capacity at 298.15K | \(\frac{\textrm{cal}}{\textrm{mol K}}\) |
| U0_atom | \(U_0^{\textrm{ATOM}}\) | Atomization energy at 0K | \(\textrm{eV}\) |
| U_atom | \(U^{\textrm{ATOM}}\) | Atomization energy at 298.15K | \(\textrm{eV}\) |
| H_atom | \(H^{\textrm{ATOM}}\) | Atomization enthalpy at 298.15K | \(\textrm{eV}\) |
| G_atom | \(G^{\textrm{ATOM}}\) | Atomization free energy at 298.15K | \(\textrm{eV}\) |
| A | \(A\) | Rotational constant | \(\textrm{GHz}\) |
| B | \(B\) | Rotational constant | \(\textrm{GHz}\) |
| C | \(C\) | Rotational constant | \(\textrm{GHz}\) |
- Parameters
label_keys (list) – Names of the regression properties to load, which should be a subset of the keys in the table above. If not provided, all labels are loaded.
raw_dir (str) – Directory that the raw data is downloaded to, or that already contains the input data directory. Default: ~/.dgl/
force_reload (bool) – Whether to reload the dataset. Default: False.
verbose (bool) – Whether to print out progress information. Default: True.
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
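Since label_keys should be a subset of the keys in the table above, it can help to check a selection before constructing the dataset. The following is a minimal sketch, not part of DGL: the names QM9_EDGE_LABEL_KEYS and check_label_keys are hypothetical, and the key list is simply copied from the table in this section.

```python
# Keys copied from the regression-target table in this section
# (an assumption of this sketch, not an attribute exposed by DGL).
QM9_EDGE_LABEL_KEYS = [
    "mu", "alpha", "homo", "lumo", "gap", "r2", "zpve",
    "U0", "U", "H", "G", "Cv",
    "U0_atom", "U_atom", "H_atom", "G_atom",
    "A", "B", "C",
]


def check_label_keys(label_keys=None):
    """Hypothetical helper: return the label keys that would be loaded.

    None mirrors the documented default, where all labels are loaded.
    """
    if label_keys is None:
        return list(QM9_EDGE_LABEL_KEYS)
    unknown = [key for key in label_keys if key not in QM9_EDGE_LABEL_KEYS]
    if unknown:
        raise ValueError(f"Unknown label keys: {unknown}")
    return list(label_keys)


selected = check_label_keys(["mu", "alpha"])
print(selected)
# The validated list could then be passed on, for example:
# data = QM9EdgeDataset(label_keys=selected)
```

Failing early with a ValueError keeps a misspelled key from surfacing only after the dataset has been downloaded and processed.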
- Raises
UserWarning – If the raw data on the remote server has been changed by the author.
Examples
>>> from dgl.data import QM9EdgeDataset
>>> data = QM9EdgeDataset(label_keys=['mu', 'alpha'])
>>> data.num_labels
2
>>> # iterate over the dataset
>>> for graph, labels in data:
...     print(graph)   # get information of each graph
...     print(labels)  # get labels of the corresponding graph
...     # your code here...
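The transform parameter accepts any callable that takes a DGLGraph and returns a transformed version. As a rough illustration, the sketch below defines a hypothetical transform, rename_node_features, that exposes the atom features under the name 'feat' instead of 'attr'. The function name and the target key 'feat' are assumptions made for this example. It relies only on item assignment and deletion on graph.ndata.

```python
def rename_node_features(graph):
    """Hypothetical transform: expose ndata['attr'] under the key 'feat'."""
    # Copy the reference to the atom features under the new name ...
    graph.ndata["feat"] = graph.ndata["attr"]
    # ... and drop the old key so the features are stored only once.
    del graph.ndata["attr"]
    return graph


# Usage sketch (requires DGL; the dataset is downloaded on first use):
# from dgl.data import QM9EdgeDataset
# data = QM9EdgeDataset(label_keys=["mu"], transform=rename_node_features)
# graph, labels = data[0]
# graph.ndata["feat"]  # atom features, renamed by the transform
```

Because the transform is applied before every access, it is best kept cheap and free of side effects outside the graph it receives.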
__getitem__(idx)
Get graph and label by index.
- Parameters
idx (int) – Item index
- Returns
dgl.DGLGraph – The graph contains:
ndata['pos']: the coordinates of each atom
ndata['attr']: the features of each atom
edata['edge_attr']: the features of each bond
Tensor – Property values of molecular graphs
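To see what a single item looks like, one can collect the shapes of the fields listed above. This is a minimal sketch: summarize_sample is a hypothetical helper, not a DGL function, and it only assumes that each field exposes a shape attribute, as tensors do.

```python
def summarize_sample(graph, labels):
    """Hypothetical helper: shapes of the fields returned by __getitem__."""
    return {
        "pos": tuple(graph.ndata["pos"].shape),
        "attr": tuple(graph.ndata["attr"].shape),
        "edge_attr": tuple(graph.edata["edge_attr"].shape),
        "labels": tuple(labels.shape),
    }


# Usage sketch (requires DGL; the dataset is downloaded on first use):
# from dgl.data import QM9EdgeDataset
# data = QM9EdgeDataset(label_keys=["mu", "alpha"])
# graph, labels = data[0]
# print(summarize_sample(graph, labels))
```

Going by the statistics in this section, the last dimension should be 3 for pos, 11 for attr and 4 for edge_attr.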