# dgl.model_zoo

## Chemistry

### Utils

 chem.load_pretrained(*args, **kwargs)

### Property Prediction

Currently supported model architectures:

• GCNClassifier
• GATClassifier
• MPNN
• SchNet
• MGCN
• AttentiveFP

class dgl.model_zoo.chem.GCNClassifier(**kwargs)

GCN-based predictor for multitask prediction on molecular graphs. Each task is assumed to be a binary classification.

Parameters:

• in_feats (int) – Number of input atom features.
• gcn_hidden_feats (list of int) – gcn_hidden_feats[i] gives the number of output atom features in the (i+1)-th GCN layer.
• n_tasks (int) – Number of prediction tasks.
• classifier_hidden_feats (int) – Number of molecular graph features in the hidden layers of the MLP classifier.
• dropout (float) – Dropout probability. Defaults to 0., i.e. no dropout is performed.

forward(g, feats)

Multi-task prediction for a batch of molecules.

Parameters:

• g (DGLGraph) – DGLGraph with batch size B for processing multiple molecules in parallel.
• feats (FloatTensor of shape (N, M0)) – Initial features for all atoms in the batch of molecules.

Returns: Soft prediction for all tasks on the batch of molecules.

Return type: FloatTensor of shape (B, n_tasks)
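
The shape convention used by forward can be illustrated without DGL. The sketch below is not the library's implementation. It only mimics, in plain Python, how a batch of B molecules is flattened into a single (N, M0) feature matrix and how a per-graph readout recovers one row per molecule. The helper names (`batch_atom_features`, `mean_readout`) and the choice of a mean readout are assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def batch_atom_features(molecules: List[Matrix]) -> Tuple[Matrix, List[int]]:
    """Stack per-molecule atom features into one (N, M0) matrix.

    Also returns the number of atoms in each molecule, which is the
    bookkeeping needed to split the stacked rows back into graphs.
    """
    feats: Matrix = []
    sizes: List[int] = []
    for mol in molecules:
        feats.extend(mol)
        sizes.append(len(mol))
    return feats, sizes


def mean_readout(feats: Matrix, sizes: List[int]) -> Matrix:
    """Average atom features within each molecule, giving a (B, M0) matrix."""
    out: Matrix = []
    start = 0
    for n_atoms in sizes:
        block = feats[start:start + n_atoms]
        out.append([sum(col) / n_atoms for col in zip(*block)])
        start += n_atoms
    return out


# Two hypothetical molecules with 2 and 3 atoms, M0 = 2 features per atom.
mol_a = [[1.0, 0.0], [3.0, 2.0]]
mol_b = [[0.0, 1.0], [0.0, 1.0], [3.0, 4.0]]

feats, sizes = batch_atom_features([mol_a, mol_b])
graph_feats = mean_readout(feats, sizes)
print(len(feats), len(feats[0]))  # N and M0 of the stacked matrix
print(graph_feats)                # one row per molecule
```

A classifier head would then map each of the B rows to n_tasks outputs, which is where the (B, n_tasks) return shape comes from.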

class dgl.model_zoo.chem.GATClassifier(**kwargs)

GAT-based predictor for multitask prediction on molecular graphs. Each task is assumed to be a binary classification.

Parameters: in_feats (int) – Number of input atom features.

forward(g, feats)

Multi-task prediction for a batch of molecules.

Parameters:

• g (DGLGraph) – DGLGraph with batch size B for processing multiple molecules in parallel.
• feats (FloatTensor of shape (N, M0)) – Initial features for all atoms in the batch of molecules.

Returns: Soft prediction for all tasks on the batch of molecules.

Return type: FloatTensor of shape (B, n_tasks)

class dgl.model_zoo.chem.MPNNModel(**kwargs)

Parameters:

• node_input_dim (int) – Dimension of the input node features. Defaults to 15.
• edge_input_dim (int) – Dimension of the input edge features. Defaults to 15.
• output_dim (int) – Dimension of the prediction. Defaults to 12.
• node_hidden_dim (int) – Dimension of the node features in hidden layers. Defaults to 64.
• edge_hidden_dim (int) – Dimension of the edge features in hidden layers. Defaults to 128.
• num_step_message_passing (int) – Number of message passing steps. Defaults to 6.
• num_step_set2set (int) – Number of set2set steps.
• num_layer_set2set (int) – Number of set2set layers.

forward(g, n_feat, e_feat)

Predict molecule labels.

Parameters:

• g (DGLGraph) – Input DGLGraph for molecule(s).
• n_feat (tensor of dtype float32 and shape (B1, D1)) – Node features. B1 is the number of nodes and D1 the node feature size.
• e_feat (tensor of dtype float32 and shape (B2, D2)) – Edge features. B2 is the number of edges and D2 the edge feature size.

Returns: res – Predicted labels.

class dgl.model_zoo.chem.SchNet(**kwargs)

SchNet: A continuous-filter convolutional neural network for modeling quantum interactions (NIPS 2017).

Parameters:

• dim (int) – Size of the atom embeddings. Defaults to 64.
• cutoff (float) – Radius cutoff for the RBF. Defaults to 5.0.
• output_dim (int) – Number of target properties to predict. Defaults to 1.
• width (int) – Width in the RBF. Defaults to 1.
• n_conv (int) – Number of conv (interaction) layers. Defaults to 1.
• norm (bool) – Whether to normalize the output atom representations. Defaults to False.
• atom_ref (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Defaults to None.
• pre_train (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Defaults to None.

forward(g, atom_types, edge_distances)

Predict molecule labels.

Parameters:

• g (DGLGraph) – Input DGLGraph for molecule(s).
• atom_types (int64 tensor of shape (B1)) – Types of the atoms in the graph(s). B1 is the number of atoms.
• edge_distances (float32 tensor of shape (B2, 1)) – Edge distances. B2 is the number of edges.

Returns: prediction – Model prediction for the batch of graphs. B is the number of graphs and output_dim the prediction size.

Return type: float32 tensor of shape (B, output_dim)
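
The cutoff and width parameters control a radial basis function (RBF) expansion of the interatomic distances passed as edge_distances. As a rough illustration in plain Python, and not the library's code, the sketch below places Gaussian centers every `width` units between 0 and `cutoff` and expands one distance over them. The center placement, the `gamma` value and the helper names are assumptions; the actual implementation may differ in all three.

```python
import math
from typing import List


def rbf_centers(cutoff: float = 5.0, width: float = 1.0) -> List[float]:
    """Evenly spaced centers 0, width, 2*width, ... up to cutoff (assumed layout)."""
    n_centers = int(math.floor(cutoff / width)) + 1
    return [i * width for i in range(n_centers)]


def rbf_expand(distance: float, centers: List[float], gamma: float = 1.0) -> List[float]:
    """Gaussian expansion: exp(-gamma * (d - mu_k)^2) for every center mu_k."""
    return [math.exp(-gamma * (distance - mu) ** 2) for mu in centers]


centers = rbf_centers()              # defaults mirror the documented cutoff and width
features = rbf_expand(2.0, centers)  # expand a single distance of 2.0
print(centers)
print([round(v, 4) for v in features])
```

Each scalar distance becomes a short vector that peaks at the nearest center, which gives the interaction layers a smooth, learnable view of distance rather than a single number.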

class dgl.model_zoo.chem.MGCNModel(**kwargs)

Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective

Parameters:

• dim (int) – Size of the embeddings. Defaults to 128.
• width (int) – Width in the RBF layer. Defaults to 1.
• cutoff (float) – The maximum distance between nodes. Defaults to 5.0.
• edge_dim (int) – Size of the edge embeddings. Defaults to 128.
• out_put_dim (int) – Number of target properties to predict. Defaults to 1.
• n_conv (int) – Number of convolutional layers. Defaults to 3.
• norm (bool) – Whether to perform normalization. Defaults to False.
• atom_ref (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Defaults to None.
• pre_train (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Defaults to None.

forward(g, atom_types, edge_distances)

Predict molecule labels.

Parameters:

• g (DGLGraph) – Input DGLGraph for molecule(s).
• atom_types (int64 tensor of shape (B1)) – Types of the atoms in the graph(s). B1 is the number of atoms.
• edge_distances (float32 tensor of shape (B2, 1)) – Edge distances. B2 is the number of edges.

Returns: prediction – Model prediction for the batch of graphs. B is the number of graphs and output_dim the prediction size.

Return type: float32 tensor of shape (B, output_dim)

class dgl.model_zoo.chem.AttentiveFP(**kwargs)

Pushing the Boundaries of Molecular Representation for Drug Discovery with the Graph Attention Mechanism

Parameters:

• node_feat_size (int) – Size of the input node (atom) features.
• edge_feat_size (int) – Size of the input edge (bond) features.
• num_layers (int) – Number of GNN layers.
• num_timesteps (int) – Number of timesteps for updating the molecular representation with GRU.
• graph_feat_size (int) – Size of the learned graph representation (molecular fingerprint).
• output_size (int) – Size of the prediction (target labels).
• dropout (float) – The probability for performing dropout.

forward(g, node_feats, edge_feats, get_node_weight=False)

Parameters:

• g (DGLGraph) – Constructed DGLGraphs.
• node_feats (float32 tensor of shape (V, N1)) – Input node features. V is the number of nodes and N1 the feature size.
• edge_feats (float32 tensor of shape (E, N2)) – Input edge features. E is the number of edges and N2 the feature size.
• get_node_weight (bool) – Whether to get the weights of atoms during readout.

Returns:

• float32 tensor of shape (G, N3) – Prediction for the graphs. G is the number of graphs and N3 the output size.
• node_weights (list of float32 tensors of shape (V, 1)) – Weights of nodes in all readout operations.
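
To give an intuition for the node weights requested through get_node_weight, the following plain-Python sketch computes an attention-style readout. It assumes, as is common for attention-based readout, that the weights come from a softmax over the nodes of each graph. The scores are supplied by hand here, whereas a real model would learn them, and the function names are hypothetical rather than part of DGL.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one graph's node scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_readout(
    node_feats: List[List[float]],
    node_scores: List[float],
    sizes: List[int],
) -> Tuple[List[List[float]], List[float]]:
    """Return per-graph features and the per-node weights that produced them."""
    graph_feats: List[List[float]] = []
    node_weights: List[float] = []
    start = 0
    for n in sizes:
        weights = softmax(node_scores[start:start + n])
        block = node_feats[start:start + n]
        pooled = [
            sum(w * row[j] for w, row in zip(weights, block))
            for j in range(len(block[0]))
        ]
        graph_feats.append(pooled)
        node_weights.extend(weights)
        start += n
    return graph_feats, node_weights


# Two hypothetical graphs with 2 nodes and 1 node, 2 features per node.
node_feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
node_scores = [0.0, 0.0, 5.0]
graph_feats, node_weights = attention_readout(node_feats, node_scores, [2, 1])
print(graph_feats)
print(node_weights)
```

The weights within each graph sum to one, so they can be read as the relative contribution of each atom to that graph's representation.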

### Generative Models

Currently supported model architectures:

• DGMG
• JTNN

class dgl.model_zoo.chem.DGMG(**kwargs)

DGMG model

Learning Deep Generative Models of Graphs

Users only need to initialize an instance of this class.

Parameters:

• atom_types (list) – E.g. ['C', 'N'].
• bond_types (list) – E.g. [Chem.rdchem.BondType.SINGLE, Chem.rdchem.BondType.DOUBLE, Chem.rdchem.BondType.TRIPLE, Chem.rdchem.BondType.AROMATIC].
• node_hidden_size (int) – Size of the atom representation.
• num_prop_rounds (int) – Number of message passing rounds for each time.
• dropout (float) – Probability for dropout.

forward(actions=None, rdkit_mol=False, compute_log_prob=False, max_num_steps=400)

Parameters:

• actions (list of 2-tuples or None) – If actions is not None, a molecule is generated according to actions. Otherwise, a molecule is generated based on sampled actions.
• rdkit_mol (bool) – Whether to maintain a Chem.rdchem.Mol object. This brings extra computational cost, but is necessary if we are interested in learning the generated molecule.
• compute_log_prob (bool) – Whether to compute the log likelihood.
• max_num_steps (int) – Maximum number of steps allowed. This only takes effect during inference and prevents the model from running indefinitely.

Returns:

• torch.tensor consisting of a float only, optional – The log likelihood for the actions taken.
• str, optional – The generated molecule in the form of SMILES.
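
The roles of max_num_steps and of the log likelihood can be seen in a toy sampling loop. The sketch below is a stand-in, not DGMG itself: it uses a single binary decision with a fixed, hypothetical `stop_prob` instead of actions scored by neural networks. It only shows how a step cap guarantees termination and how the log likelihood accumulates as a sum of per-action log probabilities.

```python
import math
import random
from typing import List, Tuple


def sample_actions(
    stop_prob: float, max_num_steps: int, rng: random.Random
) -> Tuple[List[int], float]:
    """Sample decisions (1 = continue, 0 = stop) until stop or the step cap.

    Returns the decisions taken and their total log likelihood.
    """
    decisions: List[int] = []
    log_prob = 0.0
    for _ in range(max_num_steps):
        if rng.random() < stop_prob:
            decisions.append(0)
            log_prob += math.log(stop_prob)
            break
        decisions.append(1)
        log_prob += math.log(1.0 - stop_prob)
    return decisions, log_prob


# A sampler that never chooses to stop is still halted by the cap.
capped, capped_lp = sample_actions(0.0, 400, random.Random(0))
# A sampler that stops half of the time usually halts well before the cap.
sampled, sampled_lp = sample_actions(0.5, 400, random.Random(0))
print(len(capped), capped_lp)
print(len(sampled), sampled_lp)
```

Without the cap, the first call would loop forever, which is the failure mode max_num_steps guards against during inference.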

class dgl.model_zoo.chem.DGLJTNNVAE(**kwargs)

Junction Tree Variational Autoencoder for Molecular Graph Generation

forward(mol_batch, beta=0, e1=None, e2=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

### Protein Ligand Binding

Currently supported model architectures:

• ACNN

class dgl.model_zoo.chem.ACNN(**kwargs)

Atomic Convolutional Networks.

The model was proposed in Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity.

Parameters:

• hidden_sizes (list of int) – Hidden sizes for all layers in the predictor.
• weight_init_stddevs (list of float) – Standard deviations of the truncated normal distributions used to initialize the weights of the predictor.
• dropouts (list of float) – Dropouts to use for all layers in the predictor.
• features_to_use (None or float tensor of shape (T)) – In the original paper, these are the atomic numbers to consider, representing the types of atoms. T is the number of types of atomic numbers. Defaults to None.
• radial (None or list) – If not None, the list consists of 3 lists of floats, giving the options for the interaction cutoff, the RBF kernel mean and the RBF kernel scaling, respectively. If None, a default of [[12.0], [0.0, 2.0, 4.0, 6.0, 8.0], [4.0]] will be used.
• num_tasks (int) – Number of output tasks.

forward(graph)

Apply the model for prediction.

Parameters: graph (DGLHeteroGraph) – DGLHeteroGraph consisting of the ligand graph, the protein graph and the complex graph, along with preprocessed features.

Returns: Predicted protein-ligand binding affinity. B is the number of protein-ligand pairs in the batch and O the number of tasks.

Return type: float32 tensor of shape (B, O)
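
The structure of the radial argument is easier to see when its three option lists are expanded. The sketch below assumes that every (cutoff, mean, scaling) combination defines one radial filter, and that a filter is a Gaussian kernel damped by a cosine cutoff, in the spirit of the radial pooling described in the ACNN paper. Both points are assumptions made for illustration, and `radial_filter` is a hypothetical helper rather than a DGL function.

```python
import itertools
import math

# Default documented for the `radial` argument:
# [interaction cutoffs, RBF kernel means, RBF kernel scalings].
radial = [[12.0], [0.0, 2.0, 4.0, 6.0, 8.0], [4.0]]

# Assumption: each (cutoff, mean, scaling) combination is one radial filter.
radial_params = list(itertools.product(*radial))


def radial_filter(distance: float, cutoff: float, mean: float, scaling: float) -> float:
    """Gaussian kernel damped by a cosine cutoff (assumed functional form)."""
    if distance > cutoff:
        return 0.0
    kernel = math.exp(-scaling * (distance - mean) ** 2)
    damping = 0.5 * (math.cos(math.pi * distance / cutoff) + 1.0)
    return kernel * damping


# Response of every filter to a single interatomic distance of 2.0.
values = [radial_filter(2.0, *params) for params in radial_params]
print(len(radial_params))
print(radial_params[0])
print([round(v, 4) for v in values])
```

With the default option lists this yields five filters that share one cutoff and one scaling and differ only in their kernel mean.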