QM9Dataset
class dgl.data.QM9Dataset(label_keys, cutoff=5.0, raw_dir=None, force_reload=False, verbose=False, transform=None)
Bases: dgl.data.dgl_dataset.DGLDataset
QM9 dataset for graph property prediction (regression)
This dataset consists of 130,831 molecules with 12 regression targets. Nodes correspond to atoms and edges correspond to close atom pairs.
This dataset differs from QM9EdgeDataset in the following aspects:
- Edges in this dataset are purely distance-based.
- It only provides atoms’ coordinates and atomic numbers as node features.
- It only provides 12 regression targets.
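To illustrate what "purely distance-based" edges mean, here is a minimal sketch in plain Python. It is not the dataset's actual implementation: the helper name `radius_edges` and the sample coordinates are hypothetical, and it assumes that an atom is never connected to itself and that each close pair yields an edge in both directions.

```python
import math

def radius_edges(coords, cutoff=5.0):
    """Return directed (src, dst) index lists for atom pairs within `cutoff`."""
    src, dst = [], []
    for i, ri in enumerate(coords):
        for j, rj in enumerate(coords):
            # Assumption: no self-loops; the comparison is inclusive
            # ("no larger than" the cutoff).
            if i != j and math.dist(ri, rj) <= cutoff:
                src.append(i)
                dst.append(j)
    return src, dst

# Three hypothetical atoms on a line, coordinates in Angstrom.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (7.0, 0.0, 0.0)]
src, dst = radius_edges(coords, cutoff=5.0)
```

With these coordinates only the first two atoms are within 5.0 Angstrom of each other, so the third atom ends up isolated. A larger cutoff produces denser graphs.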
Reference:
Statistics:
Number of graphs: 130,831
Number of regression targets: 12
| Keys | Property | Description | Unit |
|-------|----------|-------------|------|
| mu | \(\mu\) | Dipole moment | \(\textrm{D}\) |
| alpha | \(\alpha\) | Isotropic polarizability | \({a_0}^3\) |
| homo | \(\epsilon_{\textrm{HOMO}}\) | Highest occupied molecular orbital energy | \(\textrm{eV}\) |
| lumo | \(\epsilon_{\textrm{LUMO}}\) | Lowest unoccupied molecular orbital energy | \(\textrm{eV}\) |
| gap | \(\Delta \epsilon\) | Gap between \(\epsilon_{\textrm{HOMO}}\) and \(\epsilon_{\textrm{LUMO}}\) | \(\textrm{eV}\) |
| r2 | \(\langle R^2 \rangle\) | Electronic spatial extent | \({a_0}^2\) |
| zpve | \(\textrm{ZPVE}\) | Zero point vibrational energy | \(\textrm{eV}\) |
| U0 | \(U_0\) | Internal energy at 0K | \(\textrm{eV}\) |
| U | \(U\) | Internal energy at 298.15K | \(\textrm{eV}\) |
| H | \(H\) | Enthalpy at 298.15K | \(\textrm{eV}\) |
| G | \(G\) | Free energy at 298.15K | \(\textrm{eV}\) |
| Cv | \(c_{\textrm{v}}\) | Heat capacity at 298.15K | \(\frac{\textrm{cal}}{\textrm{mol K}}\) |
- Parameters
label_keys (list) – Names of the regression properties, which should be a subset of the keys in the table above.
cutoff (float) – Cutoff distance for interatomic interactions, i.e. two atoms are connected in the corresponding graph if the distance between them is no larger than this. Default: 5.0 Angstrom
raw_dir (str) – Directory to which the raw data is downloaded, or which already contains the input data directory. Default: ~/.dgl/
force_reload (bool) – Whether to reload the dataset. Default: False
verbose (bool) – Whether to print out progress information. Default: False
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
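Because label_keys must be a subset of the keys in the table above, it can help to validate a selection before constructing the dataset. The following is a minimal sketch, not part of the DGL API: `QM9_KEYS` and `check_label_keys` are hypothetical names, and the key list is copied from the table.

```python
# Keys copied from the table above (matching is case-sensitive).
QM9_KEYS = ["mu", "alpha", "homo", "lumo", "gap", "r2",
            "zpve", "U0", "U", "H", "G", "Cv"]

def check_label_keys(label_keys):
    """Hypothetical helper: reject any key that is not in the table."""
    unknown = [k for k in label_keys if k not in QM9_KEYS]
    if unknown:
        raise ValueError(f"Unknown QM9 label keys: {unknown}")
    return list(label_keys)

selected = check_label_keys(["mu", "gap"])
```

The returned list can then be passed as the label_keys argument. A misspelled or differently cased key such as 'HOMO' raises a ValueError in this sketch.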
- Raises
UserWarning – If the raw data is changed in the remote server by the author.
Examples
>>> from dgl.data import QM9Dataset
>>> data = QM9Dataset(label_keys=['mu', 'gap'], cutoff=5.0)
>>> data.num_labels
2
>>>
>>> # iterate over the dataset
>>> for g, label in data:
...     R = g.ndata['R']  # get coordinates of each atom
...     Z = g.ndata['Z']  # get atomic numbers of each atom
...     # your code here...
>>>