Model Zoo
Chemistry
Utils
chem.load_pretrained(model_name[, log])
Load a pretrained model 
Property Prediction
Currently supported model architectures:
 GCNClassifier
 GATClassifier
 MPNN
 SchNet
 MGCN
 AttentiveFP

class dgl.model_zoo.chem.GCNClassifier(in_feats, gcn_hidden_feats, n_tasks, classifier_hidden_feats=128, dropout=0.0)
GCN-based predictor for multitask prediction on molecular graphs. Each task is assumed to be a binary classification.
Parameters:  in_feats (int) – Number of input atom features
 gcn_hidden_feats (list of int) – gcn_hidden_feats[i] gives the number of output atom features in the (i+1)-th GCN layer
 n_tasks (int) – Number of prediction tasks
 classifier_hidden_feats (int) – Number of molecular graph features in hidden layers of the MLP Classifier
 dropout (float) – The probability for dropout. Default to be 0.0, i.e. no dropout is performed.
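To make the indexing of gcn_hidden_feats concrete, the following minimal sketch lists the (input size, output size) pair of each GCN layer implied by the parameters above. It assumes the layers are stacked sequentially, so that each layer consumes the previous layer's output. The helper gcn_layer_dims and the sizes 74 and 64 are hypothetical and are not part of dgl.model_zoo.chem.

```python
def gcn_layer_dims(in_feats, gcn_hidden_feats):
    """Return (input size, output size) for each stacked GCN layer.

    Hypothetical helper for illustration only; not a DGL function.
    Assumes layer i+1 takes the output of layer i as its input.
    """
    dims = []
    prev = in_feats
    for out in gcn_hidden_feats:
        dims.append((prev, out))
        prev = out
    return dims


# Illustrative sizes: 74 input atom features, two GCN layers of width 64.
layer_dims = gcn_layer_dims(74, [64, 64])
print(layer_dims)  # [(74, 64), (64, 64)]
```

Under this reading, the length of gcn_hidden_feats sets the number of GCN layers.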

forward(bg, feats)
Multitask prediction for a batch of molecules
Parameters:  bg (BatchedDGLGraph) – A batch of B DGLGraphs, for processing multiple molecules in parallel
 feats (FloatTensor of shape (N, M0)) – Initial features for all atoms in the batch of molecules
Returns: Soft prediction for all tasks on the batch of molecules
Return type: FloatTensor of shape (B, n_tasks)
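The shapes above can be read as follows: if the B molecules in a batch have n_1, ..., n_B atoms, then N is their sum. The pure-Python sketch below shows that bookkeeping under the assumption that the rows of feats hold each molecule's atoms contiguously, in batch order. The helper atom_offsets is hypothetical, not a DGL function.

```python
from itertools import accumulate


def atom_offsets(atoms_per_molecule):
    """Return the (start, end) row range in feats for each molecule.

    Hypothetical helper; assumes atoms are stored contiguously in batch order.
    """
    ends = list(accumulate(atoms_per_molecule))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


atoms_per_molecule = [3, 5, 2]           # B = 3 molecules
ranges = atom_offsets(atoms_per_molecule)
total_atoms = sum(atoms_per_molecule)    # N = 10 rows in feats
print(ranges)  # [(0, 3), (3, 8), (8, 10)]
```

With these numbers, feats would have shape (10, M0) and the returned prediction would have shape (3, n_tasks).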

class dgl.model_zoo.chem.GATClassifier(in_feats, gat_hidden_feats, num_heads, n_tasks, classifier_hidden_feats=128, dropout=0)
GAT-based predictor for multitask prediction on molecular graphs. Each task is assumed to be a binary classification.
Parameters: in_feats (int) – Number of input atom features 
forward(bg, feats)
Multitask prediction for a batch of molecules
Parameters:  bg (BatchedDGLGraph) – A batch of B DGLGraphs, for processing multiple molecules in parallel
 feats (FloatTensor of shape (N, M0)) – Initial features for all atoms in the batch of molecules
Returns: Soft prediction for all tasks on the batch of molecules
Return type: FloatTensor of shape (B, n_tasks)


class dgl.model_zoo.chem.MPNNModel(node_input_dim=15, edge_input_dim=5, output_dim=12, node_hidden_dim=64, edge_hidden_dim=128, num_step_message_passing=6, num_step_set2set=6, num_layer_set2set=3)
MPNN from Neural Message Passing for Quantum Chemistry
Parameters:  node_input_dim (int) – Dimension of input node feature, default to be 15.
 edge_input_dim (int) – Dimension of input edge feature, default to be 5.
 output_dim (int) – Dimension of prediction, default to be 12.
 node_hidden_dim (int) – Dimension of node feature in hidden layers, default to be 64.
 edge_hidden_dim (int) – Dimension of edge feature in hidden layers, default to be 128.
 num_step_message_passing (int) – Number of message passing steps, default to be 6.
 num_step_set2set (int) – Number of set2set steps, default to be 6.
 num_layer_set2set (int) – Number of set2set layers, default to be 3.

forward(g, n_feat, e_feat)
Predict molecule labels
Parameters:  g (DGLGraph) – Input DGLGraph for molecule(s)
 n_feat (tensor of dtype float32 and shape (B1, D1)) – Node features. B1 for number of nodes and D1 for the node feature size.
 e_feat (tensor of dtype float32 and shape (B2, D2)) – Edge features. B2 for number of edges and D2 for the edge feature size.
Returns: res – Predicted labels

class dgl.model_zoo.chem.SchNet(dim=64, cutoff=5.0, output_dim=1, width=1, n_conv=3, norm=False, atom_ref=None, pre_train=None)
Parameters:  dim (int) – Size for atom embeddings, default to be 64.
 cutoff (float) – Radius cutoff for RBF, default to be 5.0.
 output_dim (int) – Number of target properties to predict, default to be 1.
 width (int) – Width in RBF, default to be 1.
 n_conv (int) – Number of conv (interaction) layers, default to be 3.
 norm (bool) – Whether to normalize the output atom representations, default to be False.
 atom_ref (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.
 pre_train (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.

forward(g, atom_types, edge_distances)
Predict molecule labels
Parameters:  g (DGLGraph) – Input DGLGraph for molecule(s)
 atom_types (int64 tensor of shape (B1)) – Types for atoms in the graph(s), B1 for the number of atoms.
 edge_distances (float32 tensor of shape (B2, 1)) – Edge distances, B2 for the number of edges.
Returns: prediction – Model prediction for the batch of graphs, B for the number of graphs, output_dim for the prediction size.
Return type: float32 tensor of shape (B, output_dim)
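The cutoff and width parameters of SchNet above control how a scalar edge distance is expanded into a feature vector. As a rough illustration only, the sketch below uses Gaussian radial basis functions with centers spaced width apart on [0, cutoff]. The exact center placement and scaling used inside dgl.model_zoo.chem are not described on this page, so the spacing rule, the gamma parameter, and the helper rbf_expand are all assumptions.

```python
import math


def rbf_expand(distance, cutoff=5.0, width=1.0, gamma=1.0):
    """Expand one distance into Gaussian RBF features.

    Hypothetical helper, not a DGL function.
    Assumption: centers sit at 0, width, 2*width, ... up to cutoff,
    and gamma sets the sharpness of each Gaussian.
    """
    n_centers = int(math.floor(cutoff / width)) + 1
    centers = [i * width for i in range(n_centers)]
    return [math.exp(-gamma * (distance - c) ** 2) for c in centers]


features = rbf_expand(2.0)
print(len(features))                   # 6 centers: 0, 1, 2, 3, 4, 5
print(features.index(max(features)))   # 2, the center closest to the distance
```

Each entry of edge_distances, of shape (B2, 1), would be expanded in this way into one row of RBF features.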

class dgl.model_zoo.chem.MGCNModel(dim=128, width=1, cutoff=5.0, edge_dim=128, output_dim=1, n_conv=3, norm=False, atom_ref=None, pre_train=None)
Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective
Parameters:  dim (int) – Size for embeddings, default to be 128.
 width (int) – Width in the RBF layer, default to be 1.
 cutoff (float) – The maximum distance between nodes, default to be 5.0.
 edge_dim (int) – Size for edge embedding, default to be 128.
 output_dim (int) – Number of target properties to predict, default to be 1.
 n_conv (int) – Number of convolutional layers, default to be 3.
 norm (bool) – Whether to perform normalization, default to be False.
 atom_ref (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.
 pre_train (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.

forward(g, atom_types, edge_distances)
Predict molecule labels
Parameters:  g (DGLGraph) – Input DGLGraph for molecule(s)
 atom_types (int64 tensor of shape (B1)) – Types for atoms in the graph(s), B1 for the number of atoms.
 edge_distances (float32 tensor of shape (B2, 1)) – Edge distances, B2 for the number of edges.
Returns: prediction – Model prediction for the batch of graphs, B for the number of graphs, output_dim for the prediction size.
Return type: float32 tensor of shape (B, output_dim)

class dgl.model_zoo.chem.AttentiveFP(node_feat_size, edge_feat_size, num_layers, num_timesteps, graph_feat_size, output_size, dropout)
Parameters:  node_feat_size (int) – Size for the input node (atom) features.
 edge_feat_size (int) – Size for the input edge (bond) features.
 num_layers (int) – Number of GNN layers.
 num_timesteps (int) – Number of timesteps for updating the molecular representation with GRU.
 graph_feat_size (int) – Size of the learned graph representation (molecular fingerprint).
 output_size (int) – Size of the prediction (target labels).
 dropout (float) – The probability for performing dropout.

forward(g, node_feats, edge_feats, get_node_weight=False)
Parameters:  g (DGLGraph or BatchedDGLGraph) – Constructed DGLGraphs.
 node_feats (float32 tensor of shape (V, N1)) – Input node features. V for the number of nodes and N1 for the feature size.
 edge_feats (float32 tensor of shape (E, N2)) – Input edge features. E for the number of edges and N2 for the feature size.
 get_node_weight (bool) – Whether to get the weights of atoms during readout.
Returns:  float32 tensor of shape (G, N3) – Prediction for the graphs. G for the number of graphs and N3 for the output size.
 node_weights (list of float32 tensors of shape (V, 1)) – Weights of nodes in all readout operations.
Generative Models
Currently supported model architectures:
 DGMG
 JTNN

class dgl.model_zoo.chem.DGMG(atom_types, bond_types, node_hidden_size, num_prop_rounds, dropout)
DGMG model
Learning Deep Generative Models of Graphs
Users only need to initialize an instance of this class.
Parameters:  atom_types (list) – E.g. ['C', 'N']
 bond_types (list) – E.g. [Chem.rdchem.BondType.SINGLE, Chem.rdchem.BondType.DOUBLE, Chem.rdchem.BondType.TRIPLE, Chem.rdchem.BondType.AROMATIC]
 node_hidden_size (int) – Size of atom representation
 num_prop_rounds (int) – Number of message passing rounds for each time
 dropout (float) – Probability for dropout

forward(actions=None, rdkit_mol=False, compute_log_prob=False, max_num_steps=400)
Parameters:  actions (list of 2-tuples or None) – If actions is not None, generate a molecule according to actions. Otherwise, a molecule will be generated based on sampled actions.
 rdkit_mol (bool) – Whether to maintain a Chem.rdchem.Mol object. This brings extra computational cost, but is necessary if we are interested in learning the generated molecule.
 compute_log_prob (bool) – Whether to compute log likelihood
 max_num_steps (int) – Maximum number of steps allowed. This only takes effect during inference and prevents generation from running indefinitely.
Returns:  torch.tensor consisting of a float only, optional – The log likelihood for the actions taken
 str, optional – The generated molecule in the form of SMILES

class dgl.model_zoo.chem.DGLJTNNVAE(hidden_size, latent_size, depth, vocab=None, vocab_file=None)
Junction Tree Variational Autoencoder for Molecular Graph Generation

forward(mol_batch, beta=0, e1=None, e2=None)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

Protein Ligand Binding
Currently supported model architectures:
 ACNN

class dgl.model_zoo.chem.ACNN(hidden_sizes, weight_init_stddevs, dropouts, features_to_use=None, radial=None, num_tasks=1)
Atomic Convolutional Networks.
The model was proposed in Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity.
Parameters:  hidden_sizes (list of int) – Specifying the hidden sizes for all layers in the predictor.
 weight_init_stddevs (list of float) – Specifying the standard deviations to use for truncated normal distributions in initializing weights for the predictor.
 dropouts (list of float) – Specifying the dropouts to use for all layers in the predictor.
 features_to_use (None or float tensor of shape (T)) – In the original paper, these are atomic numbers to consider, representing the types of atoms. T for the number of types of atomic numbers. Default to None.
 radial (None or list) – If not None, the list consists of 3 lists of floats: the options for the interaction cutoff, the options for the RBF kernel mean, and the options for the RBF kernel scaling. If None, the default [[12.0], [0.0, 2.0, 4.0, 6.0, 8.0], [4.0]] will be used.
 num_tasks (int) – Number of output tasks.
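One way to read the radial argument is as three option lists whose combinations each define one radial filter. The sketch below enumerates those combinations for the default value with itertools.product. Treating the options as a Cartesian product is an assumption about the implementation, not something stated in the parameter description.

```python
from itertools import product

# Default value quoted in the parameter description above.
radial = [[12.0], [0.0, 2.0, 4.0, 6.0, 8.0], [4.0]]
cutoffs, rbf_means, rbf_scalings = radial

# Assumption: each (cutoff, mean, scaling) triple defines one radial filter.
radial_params = list(product(cutoffs, rbf_means, rbf_scalings))
for cutoff, mean, scaling in radial_params:
    print(cutoff, mean, scaling)
print(len(radial_params))  # 5
```

Under this reading, the default yields five filters that share one cutoff and one scaling and differ only in the kernel mean.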

forward(graph)
Apply the model for prediction.
Parameters: graph (DGLHeteroGraph) – DGLHeteroGraph consisting of the ligand graph, the protein graph and the complex graph, along with preprocessed features.
Returns: Predicted protein-ligand binding affinity. B for the number of protein-ligand pairs in the batch and O for the number of tasks.
Return type: Float32 tensor of shape (B, O)