NN Modules (MXNet)

We welcome your contribution! If you want a model to be implemented in DGL as a NN module, please create an issue starting with “[Feature Request] NN Module XXXModel”.

If you want to contribute a NN module, please create a pull request starting with “[NN] XXXModel in MXNet NN Modules” and our team members will review it.

Conv Layers

MXNet modules for graph convolutions.

GraphConv

class dgl.nn.mxnet.conv.GraphConv(in_feats, out_feats, norm=True, bias=True, activation=None)[source]

Bases: mxnet.gluon.block.Block

Apply graph convolution over an input signal.

Graph convolution is introduced in GCN and can be described as below:

\[h_i^{(l+1)} = \sigma(b^{(l)} + \sum_{j\in\mathcal{N}(i)}\frac{1}{c_{ij}}h_j^{(l)}W^{(l)})\]

where \(\mathcal{N}(i)\) is the neighbor set of node \(i\). \(c_{ij}\) is equal to the product of the square roots of the node degrees: \(\sqrt{|\mathcal{N}(i)|}\sqrt{|\mathcal{N}(j)|}\). \(\sigma\) is an activation function.

The model parameters are initialized as in the original implementation where the weight \(W^{(l)}\) is initialized using Glorot uniform initialization and the bias is initialized to be zero.

Notes

Nodes with zero in-degree lead to an invalid normalizer. A common practice to avoid this is to add a self-loop for each node in the graph, which can be achieved by:

>>> g = ... # some DGLGraph
>>> g.add_edges(g.nodes(), g.nodes())
Parameters:
  • in_feats (int) – Number of input features.
  • out_feats (int) – Number of output features.
  • norm (bool, optional) – If True, the normalizer \(c_{ij}\) is applied. Default: True.
  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True.
  • activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.
weight

mxnet.gluon.parameter.Parameter – The learnable weight tensor.

bias

mxnet.gluon.parameter.Parameter – The learnable bias tensor.

forward(graph, feat)[source]

Compute graph convolution.

Notes

  • Input shape: \((N, *, \text{in_feats})\) where * means any number of additional dimensions, \(N\) is the number of nodes.
  • Output shape: \((N, *, \text{out_feats})\) where all but the last dimension are the same shape as the input.
Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature
Returns:

The output feature

Return type:

mxnet.NDArray
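
Example

A minimal usage sketch (not part of the original reference; the tiny cycle graph, the feature sizes, and the Gluon initialize() call are illustrative assumptions):

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import GraphConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])  # a cycle, so every node has in-degree 1
>>> conv = GraphConv(10, 2)
>>> conv.initialize()                               # Glorot-initialized weight, zero bias
>>> feat = mx.nd.ones((5, 10))
>>> res = conv(g, feat)                             # output shape: (5, 2)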

RelGraphConv

class dgl.nn.mxnet.conv.RelGraphConv(in_feat, out_feat, num_rels, regularizer='basis', num_bases=None, bias=True, activation=None, self_loop=False, dropout=0.0)[source]

Bases: mxnet.gluon.block.Block

Relational graph convolution layer.

Relational graph convolution is introduced in “Modeling Relational Data with Graph Convolutional Networks” and can be described as below:

\[h_i^{(l+1)} = \sigma(\sum_{r\in\mathcal{R}} \sum_{j\in\mathcal{N}^r(i)}\frac{1}{c_{i,r}}W_r^{(l)}h_j^{(l)}+W_0^{(l)}h_i^{(l)})\]

where \(\mathcal{N}^r(i)\) is the neighbor set of node \(i\) w.r.t. relation \(r\). \(c_{i,r}\) is the normalizer equal to \(|\mathcal{N}^r(i)|\). \(\sigma\) is an activation function. \(W_0\) is the self-loop weight.

The basis regularization decomposes \(W_r\) by:

\[W_r^{(l)} = \sum_{b=1}^B a_{rb}^{(l)}V_b^{(l)}\]

where \(B\) is the number of bases.

The block-diagonal-decomposition regularization decomposes \(W_r\) into \(B\) block-diagonal matrices. We refer to \(B\) as the number of bases.

Parameters:
  • in_feat (int) – Input feature size.
  • out_feat (int) – Output feature size.
  • num_rels (int) – Number of relations.
  • regularizer (str) – Which weight regularizer to use ("basis" or "bdd").
  • num_bases (int, optional) – Number of bases. If None, uses the number of relations. Default: None.
  • bias (bool, optional) – True if bias is added. Default: True
  • activation (callable, optional) – Activation function. Default: None
  • self_loop (bool, optional) – True to include self loop message. Default: False
  • dropout (float, optional) – Dropout rate. Default: 0.0
forward(g, x, etypes, norm=None)[source]

Forward computation

Parameters:
  • g (DGLGraph) – The graph.
  • x (mx.ndarray.NDArray) –
    Input node features. Could be either
    • \((|V|, D)\) dense tensor
    • \((|V|,)\) int64 vector, representing the categorical values of each node. We then treat the input feature as a one-hot encoded feature.
  • etypes (mx.ndarray.NDArray) – Edge type tensor. Shape: \((|E|,)\)
  • norm (mx.ndarray.NDArray) – Optional edge normalizer tensor. Shape: \((|E|, 1)\)
Returns:

New node features.

Return type:

mx.ndarray.NDArray
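
Example

A hedged usage sketch (not from the original reference; the relation assignment below is arbitrary and only meant to show the expected shapes):

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import RelGraphConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(4)
>>> g.add_edges([0, 1, 2, 3], [1, 2, 3, 0])
>>> x = mx.nd.ones((4, 10))                              # dense node features
>>> etypes = mx.nd.array([0, 1, 2, 0], dtype='int64')    # one relation id per edge
>>> conv = RelGraphConv(10, 2, num_rels=3, regularizer='basis', num_bases=2)
>>> conv.initialize()
>>> res = conv(g, x, etypes)                             # output shape: (4, 2)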

TAGConv

class dgl.nn.mxnet.conv.TAGConv(in_feats, out_feats, k=2, bias=True, activation=None)[source]

Bases: mxnet.gluon.block.Block

Apply a Topology Adaptive Graph Convolutional Network layer.

\[\mathbf{X}^{\prime} = \sum_{k=0}^K \left(\mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}\right)^{k}\mathbf{X} \mathbf{\Theta}_{k},\]

where \(\mathbf{A}\) denotes the adjacency matrix and \(D_{ii} = \sum_{j=0} A_{ij}\) denotes its diagonal degree matrix.

Parameters:
  • in_feats (int) – Number of input features.
  • out_feats (int) – Number of output features.
  • k (int, optional) – Number of hops \(k\). Default: 2.
  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True.
  • activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.
lin

mxnet.gluon.parameter.Parameter – The learnable weight tensor.

bias

mxnet.gluon.parameter.Parameter – The learnable bias tensor.

forward(graph, feat)[source]

Compute graph convolution

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray
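
Example

A minimal usage sketch (an assumed example, not part of the original reference):

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import TAGConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = TAGConv(10, 2, k=2)                 # mix information from up to 2-hop neighborhoods
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)))         # output shape: (5, 2)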

GATConv

class dgl.nn.mxnet.conv.GATConv(in_feats, out_feats, num_heads, feat_drop=0.0, attn_drop=0.0, negative_slope=0.2, residual=False, activation=None)[source]

Bases: mxnet.gluon.block.Block

Apply Graph Attention Network over an input signal.

\[h_i^{(l+1)} = \sum_{j\in \mathcal{N}(i)} \alpha_{i,j} W^{(l)} h_j^{(l)}\]

where \(\alpha_{ij}\) is the attention score between node \(i\) and node \(j\):

\[ \begin{align}\begin{aligned}\alpha_{ij}^{l} & = \mathrm{softmax_i} (e_{ij}^{l})\\e_{ij}^{l} & = \mathrm{LeakyReLU}\left(\vec{a}^T [W h_{i} \| W h_{j}]\right)\end{aligned}\end{align} \]
Parameters:
  • in_feats (int) – Input feature size.
  • out_feats (int) – Output feature size.
  • num_heads (int) – Number of heads in Multi-Head Attention.
  • feat_drop (float, optional) – Dropout rate on feature. Default: 0.
  • attn_drop (float, optional) – Dropout rate on attention weight. Default: 0.
  • negative_slope (float, optional) – LeakyReLU angle of negative slope. Default: 0.2.
  • residual (bool, optional) – If True, use residual connection. Default: False.
  • activation (callable activation function/layer or None, optional.) – If not None, applies an activation function to the updated node features. Default: None.
forward(graph, feat)[source]

Compute graph attention network layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
Returns:

The output feature of shape \((N, H, D_{out})\) where \(H\) is the number of heads, and \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray
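
Example

A minimal usage sketch (assumed, not from the original reference) illustrating the per-head output shape:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import GATConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = GATConv(10, 2, num_heads=3)
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)))   # output shape: (5, 3, 2), one 2-d vector per head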

EdgeConv

class dgl.nn.mxnet.conv.EdgeConv(in_feat, out_feat, batch_norm=False)[source]

Bases: mxnet.gluon.block.Block

EdgeConv layer.

Introduced in “Dynamic Graph CNN for Learning on Point Clouds”. Can be described as follows:

\[x_i^{(l+1)} = \max_{j \in \mathcal{N}(i)} \mathrm{ReLU}( \Theta \cdot (x_j^{(l)} - x_i^{(l)}) + \Phi \cdot x_i^{(l)})\]

where \(\mathcal{N}(i)\) is the neighbor set of node \(i\).

Parameters:
  • in_feat (int) – Input feature size.
  • out_feat (int) – Output feature size.
  • batch_norm (bool) – Whether to include batch normalization on messages.
forward(g, h)[source]

Forward computation

Parameters:
  • g (DGLGraph) – The graph.
  • h (mxnet.NDArray) – \((N, D)\) where \(N\) is the number of nodes and \(D\) is the number of feature dimensions.
Returns:

New node features.

Return type:

mxnet.NDArray
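
Example

A minimal usage sketch (assumed, not part of the original reference); with constant input features the differences \(x_j - x_i\) vanish, so only the \(\Phi\) term contributes:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import EdgeConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = EdgeConv(10, 2)
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)))   # output shape: (5, 2)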

SAGEConv

class dgl.nn.mxnet.conv.SAGEConv(in_feats, out_feats, aggregator_type='mean', feat_drop=0.0, bias=True, norm=None, activation=None)[source]

Bases: mxnet.gluon.block.Block

GraphSAGE layer from paper Inductive Representation Learning on Large Graphs.

\[ \begin{align}\begin{aligned}h_{\mathcal{N}(i)}^{(l+1)} & = \mathrm{aggregate} \left(\{h_{j}^{l}, \forall j \in \mathcal{N}(i) \}\right)\\h_{i}^{(l+1)} & = \sigma \left(W \cdot \mathrm{concat} (h_{i}^{l}, h_{\mathcal{N}(i)}^{(l+1)}) + b \right)\\h_{i}^{(l+1)} & = \mathrm{norm}(h_{i}^{(l+1)})\end{aligned}\end{align} \]
Parameters:
  • in_feats (int) – Input feature size.
  • out_feats (int) – Output feature size.
  • feat_drop (float) – Dropout rate on features, default: 0.
  • aggregator_type (str) – Aggregator type to use (mean, gcn, pool, lstm).
  • bias (bool) – If True, adds a learnable bias to the output. Default: True.
  • norm (callable activation function/layer or None, optional) – If not None, applies normalization to the updated node features.
  • activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.
forward(graph, feat)[source]

Compute GraphSAGE layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray
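
Example

A minimal usage sketch (assumed, not from the original reference), using the mean aggregator:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import SAGEConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = SAGEConv(10, 2, aggregator_type='mean')
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)))   # output shape: (5, 2)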

SGConv

class dgl.nn.mxnet.conv.SGConv(in_feats, out_feats, k=1, cached=False, bias=True, norm=None)[source]

Bases: mxnet.gluon.block.Block

Simplifying Graph Convolution layer from paper Simplifying Graph Convolutional Networks.

\[H^{l+1} = (\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2})^K H^{l} \Theta^{l}\]
Parameters:
  • in_feats (int) – Number of input features.
  • out_feats (int) – Number of output features.
  • k (int) – Number of hops \(K\). Default: 1.
  • cached (bool) –

    If True, the module would cache

    \[(\hat{D}^{-\frac{1}{2}}\hat{A}\hat{D}^{-\frac{1}{2}})^K X\Theta\]

    at the first forward call. This parameter should only be set to True in a transductive learning setting.

  • bias (bool) – If True, adds a learnable bias to the output. Default: True.
  • norm (callable activation function/layer or None, optional) – If not None, applies normalization to the updated node features.
forward(graph, feat)[source]

Compute Simplifying Graph Convolution layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray

Notes

If cached is set to True, feat and graph should not change during training, or you will get wrong results.
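
Example

A minimal usage sketch (assumed, not from the original reference); cached is left at its default of False so the graph may change between calls:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import SGConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = SGConv(10, 2, k=2)
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)))   # output shape: (5, 2)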

APPNPConv

class dgl.nn.mxnet.conv.APPNPConv(k, alpha, edge_drop=0.0)[source]

Bases: mxnet.gluon.block.Block

Approximate Personalized Propagation of Neural Predictions layer from paper Predict then Propagate: Graph Neural Networks meet Personalized PageRank.

\[ \begin{align}\begin{aligned}H^{0} & = X\\H^{t+1} & = (1-\alpha)\left(\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} H^{t}\right) + \alpha H^{0}\end{aligned}\end{align} \]
Parameters:
  • k (int) – Number of iterations \(K\).
  • alpha (float) – The teleport probability \(\alpha\).
  • edge_drop (float, optional) – Dropout rate on edges that controls the messages received by each node. Default: 0.
forward(graph, feat)[source]

Compute APPNP layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mx.NDArray) – The input feature of shape \((N, *)\), where \(N\) is the number of nodes and \(*\) could be of any shape.
Returns:

The output feature of shape \((N, *)\) where \(*\) should be the same as input shape.

Return type:

mx.NDArray
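
Example

A minimal usage sketch (assumed, not from the original reference); APPNPConv only propagates features, so the output keeps the input feature size:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import APPNPConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> appnp = APPNPConv(k=3, alpha=0.1)
>>> feat = mx.nd.ones((5, 8))
>>> res = appnp(g, feat)                 # output shape: (5, 8), same as the input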

GINConv

class dgl.nn.mxnet.conv.GINConv(apply_func, aggregator_type, init_eps=0, learn_eps=False)[source]

Bases: mxnet.gluon.block.Block

Graph Isomorphism Network layer from paper How Powerful are Graph Neural Networks?.

\[h_i^{(l+1)} = f_\Theta \left((1 + \epsilon) h_i^{l} + \mathrm{aggregate}\left(\left\{h_j^{l}, j\in\mathcal{N}(i) \right\}\right)\right)\]
Parameters:
  • apply_func (callable activation function/layer or None) – If not None, apply this function to the updated node feature, the \(f_\Theta\) in the formula.
  • aggregator_type (str) – Aggregator type to use (sum, max or mean).
  • init_eps (float, optional) – Initial \(\epsilon\) value, default: 0.
  • learn_eps (bool, optional) – If True, \(\epsilon\) will be a learnable parameter.
forward(graph, feat)[source]

Compute Graph Isomorphism Network layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D)\) where \(D\) could be any positive integer, \(N\) is the number of nodes. If apply_func is not None, \(D\) should fit the input dimensionality requirement of apply_func.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is the output dimensionality of apply_func. If apply_func is None, \(D_{out}\) should be the same as input dimensionality.

Return type:

mxnet.NDArray
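
Example

A minimal usage sketch (assumed, not from the original reference); a Gluon Dense block stands in for \(f_\Theta\):

>>> import dgl
>>> import mxnet as mx
>>> from mxnet.gluon import nn
>>> from dgl.nn.mxnet.conv import GINConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> lin = nn.Dense(2)                    # plays the role of f_Theta
>>> conv = GINConv(lin, 'sum')
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)))   # output shape: (5, 2)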

GatedGraphConv

class dgl.nn.mxnet.conv.GatedGraphConv(in_feats, out_feats, n_steps, n_etypes, bias=True)[source]

Bases: mxnet.gluon.block.Block

Gated Graph Convolution layer from paper Gated Graph Sequence Neural Networks.

\[ \begin{align}\begin{aligned}h_{i}^{0} & = [ x_i \| \mathbf{0} ]\\a_{i}^{t} & = \sum_{j\in\mathcal{N}(i)} W_{e_{ij}} h_{j}^{t}\\h_{i}^{t+1} & = \mathrm{GRU}(a_{i}^{t}, h_{i}^{t})\end{aligned}\end{align} \]
Parameters:
  • in_feats (int) – Input feature size.
  • out_feats (int) – Output feature size.
  • n_steps (int) – Number of recurrent steps.
  • n_etypes (int) – Number of edge types.
  • bias (bool) – If True, adds a learnable bias to the output. Default: True. Can only be set to True in MXNet.
forward(graph, feat, etypes)[source]

Compute Gated Graph Convolution layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(N\) is the number of nodes of the graph and \(D_{in}\) is the input feature size.
  • etypes (mxnet.NDArray) – The edge type tensor of shape \((E,)\) where \(E\) is the number of edges of the graph.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is the output feature size.

Return type:

mxnet.NDArray
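
Example

A minimal usage sketch (assumed, not from the original reference); since \(h_i^0\) pads the input with zeros, out_feats is chosen no smaller than in_feats:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import GatedGraphConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(4)
>>> g.add_edges([0, 1, 2, 3], [1, 2, 3, 0])
>>> etypes = mx.nd.array([0, 1, 2, 0], dtype='int64')   # one type id per edge
>>> conv = GatedGraphConv(10, 10, n_steps=2, n_etypes=3)
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((4, 10)), etypes)          # output shape: (4, 10)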

GMMConv

class dgl.nn.mxnet.conv.GMMConv(in_feats, out_feats, dim, n_kernels, aggregator_type='sum', residual=False, bias=True)[source]

Bases: mxnet.gluon.block.Block

The Gaussian Mixture Model Convolution layer from Geometric Deep Learning on Graphs and Manifolds using Mixture Model CNNs.

\[ \begin{align}\begin{aligned}h_i^{l+1} & = \mathrm{aggregate}\left(\left\{\frac{1}{K} \sum_{k}^{K} w_k(u_{ij}), \forall j\in \mathcal{N}(i)\right\}\right)\\w_k(u) & = \exp\left(-\frac{1}{2}(u-\mu_k)^T \Sigma_k^{-1} (u - \mu_k)\right)\end{aligned}\end{align} \]
Parameters:
  • in_feats (int) – Number of input features.
  • out_feats (int) – Number of output features.
  • dim (int) – Dimensionality of pseudo-coordinates.
  • n_kernels (int) – Number of kernels \(K\).
  • aggregator_type (str) – Aggregator type (sum, mean, max). Default: sum.
  • residual (bool) – If True, use residual connection inside this layer. Default: False.
  • bias (bool) – If True, adds a learnable bias to the output. Default: True.
forward(graph, feat, pseudo)[source]

Compute Gaussian Mixture Model Convolution layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(N\) is the number of nodes of the graph and \(D_{in}\) is the input feature size.
  • pseudo (mxnet.NDArray) – The pseudo coordinate tensor of shape \((E, D_{u})\) where \(E\) is the number of edges of the graph and \(D_{u}\) is the dimensionality of pseudo coordinate.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is the output feature size.

Return type:

mxnet.NDArray
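
Example

A minimal usage sketch (assumed, not from the original reference); the pseudo-coordinates here are constants purely for illustration:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import GMMConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = GMMConv(10, 2, dim=3, n_kernels=2)
>>> conv.initialize()
>>> pseudo = mx.nd.ones((g.number_of_edges(), 3))  # one 3-d pseudo-coordinate per edge
>>> res = conv(g, mx.nd.ones((5, 10)), pseudo)     # output shape: (5, 2)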

ChebConv

class dgl.nn.mxnet.conv.ChebConv(in_feats, out_feats, k, bias=True)[source]

Bases: mxnet.gluon.block.Block

Chebyshev Spectral Graph Convolution layer from paper Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.

\[ \begin{align}\begin{aligned}h_i^{l+1} &= \sum_{k=0}^{K-1} W^{k, l}z_i^{k, l}\\Z^{0, l} &= H^{l}\\Z^{1, l} &= \hat{L} \cdot H^{l}\\Z^{k, l} &= 2 \cdot \hat{L} \cdot Z^{k-1, l} - Z^{k-2, l}\\\hat{L} &= 2\left(I - \hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}\right)/\lambda_{max} - I\end{aligned}\end{align} \]
Parameters:
  • in_feats (int) – Number of input features.
  • out_feats (int) – Number of output features.
  • k (int) – Chebyshev filter size.
  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True.
forward(graph, feat, lambda_max=None)[source]

Compute ChebNet layer.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
  • lambda_max (list or mxnet.NDArray or None, optional) – A list or tensor of length \(B\) that stores the largest eigenvalue of the normalized Laplacian of each individual graph in graph, where \(B\) is the batch size of the input graph. Default: None. If None, this method computes the list by calling dgl.laplacian_lambda_max.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray
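
Example

A minimal usage sketch (assumed, not from the original reference); lambda_max is passed explicitly to skip the eigenvalue computation:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import ChebConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = ChebConv(10, 2, k=2)
>>> conv.initialize()
>>> res = conv(g, mx.nd.ones((5, 10)), lambda_max=[2.0])   # output shape: (5, 2)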

AGNNConv

class dgl.nn.mxnet.conv.AGNNConv(init_beta=1.0, learn_beta=True)[source]

Bases: mxnet.gluon.block.Block

Attention-based Graph Neural Network layer from paper Attention-based Graph Neural Network for Semi-Supervised Learning.

\[H^{l+1} = P H^{l}\]

where \(P\) is computed as:

\[P_{ij} = \mathrm{softmax}_i ( \beta \cdot \cos(h_i^l, h_j^l))\]
Parameters:
  • init_beta (float, optional) – The \(\beta\) in the formula. Default: 1.
  • learn_beta (bool, optional) – If True, \(\beta\) will be a learnable parameter. Default: True.
forward(graph, feat)[source]

Compute AGNN Layer.

Parameters:
  • graph (DGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature of shape \((N, *)\), where \(N\) is the number of nodes and \(*\) could be of any shape.
Returns:

The output feature of shape \((N, *)\) where \(*\) should be the same as input shape.

Return type:

mxnet.NDArray
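
Example

A minimal usage sketch (assumed, not from the original reference); AGNNConv applies no feature transformation, so the output keeps the input feature size:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import AGNNConv
>>> g = dgl.DGLGraph()
>>> g.add_nodes(5)
>>> g.add_edges([0, 1, 2, 3, 4], [1, 2, 3, 4, 0])
>>> conv = AGNNConv()
>>> conv.initialize()                    # beta is learnable by default
>>> res = conv(g, mx.nd.ones((5, 8)))    # output shape: (5, 8)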

Dense Conv Layers

DenseGraphConv

class dgl.nn.mxnet.conv.DenseGraphConv(in_feats, out_feats, norm=True, bias=True, activation=None)[source]

Bases: mxnet.gluon.block.Block

Graph Convolutional Network layer where the graph structure is given by an adjacency matrix. We recommend using this module when applying graph convolution on dense graphs / k-hop graphs.

Parameters:
  • in_feats (int) – Input feature size.
  • out_feats (int) – Output feature size.
  • norm (bool) – If True, the normalizer \(c_{ij}\) is applied. Default: True.
  • bias (bool) – If True, adds a learnable bias to the output. Default: True.
  • activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.

See also

GraphConv

forward(adj, feat)[source]

Compute (Dense) Graph Convolution layer.

Parameters:
  • adj (mxnet.NDArray) – The adjacency matrix of the graph to apply Graph Convolution on, should be of shape \((N, N)\), where a row represents the destination and a column represents the source.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray
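
Example

A minimal usage sketch (assumed, not from the original reference), using a hand-written dense adjacency matrix:

>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import DenseGraphConv
>>> adj = mx.nd.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])   # 3-node complete graph
>>> conv = DenseGraphConv(10, 2)
>>> conv.initialize()
>>> res = conv(adj, mx.nd.ones((3, 10)))                   # output shape: (3, 2)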

DenseChebConv

class dgl.nn.mxnet.conv.DenseChebConv(in_feats, out_feats, k, bias=True)[source]

Bases: mxnet.gluon.block.Block

Chebyshev Spectral Graph Convolution layer from paper Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.

We recommend using this module when applying ChebConv on dense graphs / k-hop graphs.

Parameters:
  • in_feats (int) – Number of input features.
  • out_feats (int) – Number of output features.
  • k (int) – Chebyshev filter size.
  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True.

See also

ChebConv

forward(adj, feat, lambda_max=None)[source]

Compute (Dense) Chebyshev Spectral Graph Convolution layer.

Parameters:
  • adj (mxnet.NDArray) – The adjacency matrix of the graph to apply Graph Convolution on, should be of shape \((N, N)\), where a row represents the destination and a column represents the source.
  • feat (mxnet.NDArray) – The input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes.
  • lambda_max (float or None, optional) – A float value indicates the largest eigenvalue of given graph. Default: None.
Returns:

The output feature of shape \((N, D_{out})\) where \(D_{out}\) is size of output feature.

Return type:

mxnet.NDArray
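
Example

A minimal usage sketch (assumed, not from the original reference); lambda_max is supplied explicitly:

>>> import mxnet as mx
>>> from dgl.nn.mxnet.conv import DenseChebConv
>>> adj = mx.nd.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])   # dense adjacency matrix
>>> conv = DenseChebConv(10, 2, k=2)
>>> conv.initialize()
>>> res = conv(adj, mx.nd.ones((3, 10)), lambda_max=2.0)   # output shape: (3, 2)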

Global Pooling Layers

MXNet modules for graph global pooling.

SumPooling

class dgl.nn.mxnet.glob.SumPooling[source]

Bases: mxnet.gluon.block.Block

Apply sum pooling over the nodes in the graph.

\[r^{(i)} = \sum_{k=1}^{N_i} x^{(i)}_k\]
forward(graph, feat)[source]

Compute sum pooling.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature with shape \((N, *)\) where \(N\) is the number of nodes in the graph.
Returns:

The output feature with shape \((*)\) (if the input graph is a BatchedDGLGraph, the result shape would be \((B, *)\)).

Return type:

mxnet.NDArray
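
Example

A minimal usage sketch (assumed, not from the original reference) on a batch of two graphs:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.glob import SumPooling
>>> g1 = dgl.DGLGraph()
>>> g1.add_nodes(3)
>>> g2 = dgl.DGLGraph()
>>> g2.add_nodes(4)
>>> bg = dgl.batch([g1, g2])             # a BatchedDGLGraph with 7 nodes in total
>>> feat = mx.nd.ones((7, 5))
>>> sumpool = SumPooling()
>>> res = sumpool(bg, feat)              # output shape: (2, 5), one readout per graph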

AvgPooling

class dgl.nn.mxnet.glob.AvgPooling[source]

Bases: mxnet.gluon.block.Block

Apply average pooling over the nodes in the graph.

\[r^{(i)} = \frac{1}{N_i}\sum_{k=1}^{N_i} x^{(i)}_k\]
forward(graph, feat)[source]

Compute average pooling.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature with shape \((N, *)\) where \(N\) is the number of nodes in the graph.
Returns:

The output feature with shape \((*)\) (if the input graph is a BatchedDGLGraph, the result shape would be \((B, *)\)).

Return type:

mxnet.NDArray

MaxPooling

class dgl.nn.mxnet.glob.MaxPooling[source]

Bases: mxnet.gluon.block.Block

Apply max pooling over the nodes in the graph.

\[r^{(i)} = \max_{k=1}^{N_i} \left( x^{(i)}_k \right)\]
forward(graph, feat)[source]

Compute max pooling.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature with shape \((N, *)\) where \(N\) is the number of nodes in the graph.
Returns:

The output feature with shape \((*)\) (if the input graph is a BatchedDGLGraph, the result shape would be \((B, *)\)).

Return type:

mxnet.NDArray

SortPooling

class dgl.nn.mxnet.glob.SortPooling(k)[source]

Bases: mxnet.gluon.block.Block

Apply Sort Pooling (An End-to-End Deep Learning Architecture for Graph Classification) over the nodes in the graph.

Parameters:k (int) – The number of nodes to hold for each graph.
forward(graph, feat)[source]

Compute sort pooling.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature with shape \((N, D)\) where \(N\) is the number of nodes in the graph.
Returns:

The output feature with shape \((k * D)\) (if the input graph is a BatchedDGLGraph, the result shape would be \((B, k * D)\)).

Return type:

mxnet.NDArray

GlobalAttentionPooling

class dgl.nn.mxnet.glob.GlobalAttentionPooling(gate_nn, feat_nn=None)[source]

Bases: mxnet.gluon.block.Block

Apply Global Attention Pooling (Gated Graph Sequence Neural Networks) over the nodes in the graph.

\[r^{(i)} = \sum_{k=1}^{N_i}\mathrm{softmax}\left(f_{gate} \left(x^{(i)}_k\right)\right) f_{feat}\left(x^{(i)}_k\right)\]
Parameters:
  • gate_nn (gluon.nn.Block) – A neural network that computes attention scores for each feature.
  • feat_nn (gluon.nn.Block, optional) – A neural network applied to each feature before combining them with attention scores.
forward(graph, feat)[source]

Compute global attention pooling.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature with shape \((N, D)\) where \(N\) is the number of nodes in the graph.
Returns:

The output feature with shape \((D)\) (if the input graph is a BatchedDGLGraph, the result shape would be \((B, D)\)).

Return type:

mxnet.NDArray
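
Example

A minimal usage sketch (assumed, not from the original reference); a single-output Gluon Dense block serves as the gating network:

>>> import dgl
>>> import mxnet as mx
>>> from mxnet.gluon import nn
>>> from dgl.nn.mxnet.glob import GlobalAttentionPooling
>>> g1 = dgl.DGLGraph()
>>> g1.add_nodes(3)
>>> g2 = dgl.DGLGraph()
>>> g2.add_nodes(4)
>>> bg = dgl.batch([g1, g2])
>>> gate_nn = nn.Dense(1)                # one attention logit per node
>>> gap = GlobalAttentionPooling(gate_nn)
>>> gap.initialize()
>>> res = gap(bg, mx.nd.ones((7, 5)))    # output shape: (2, 5)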

Set2Set

class dgl.nn.mxnet.glob.Set2Set(input_dim, n_iters, n_layers)[source]

Bases: mxnet.gluon.block.Block

Apply Set2Set (Order Matters: Sequence to sequence for sets) over the nodes in the graph.

For each individual graph in the batch, set2set computes

\[ \begin{align}\begin{aligned}q_t &= \mathrm{LSTM} (q^*_{t-1})\\\alpha_{i,t} &= \mathrm{softmax}(x_i \cdot q_t)\\r_t &= \sum_{i=1}^N \alpha_{i,t} x_i\\q^*_t &= q_t \Vert r_t\end{aligned}\end{align} \]

for this graph.

Parameters:
  • input_dim (int) – Size of each input sample
  • n_iters (int) – Number of iterations.
  • n_layers (int) – Number of recurrent layers.
forward(graph, feat)[source]

Compute set2set pooling.

Parameters:
  • graph (DGLGraph or BatchedDGLGraph) – The graph.
  • feat (mxnet.NDArray) – The input feature with shape \((N, D)\) where \(N\) is the number of nodes in the graph.
Returns:

The output feature with shape \((D)\) (if the input graph is a BatchedDGLGraph, the result shape would be \((B, D)\)).

Return type:

mxnet.NDArray
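
Example

A minimal usage sketch (assumed, not from the original reference); the concatenation \(q_t \Vert r_t\) doubles the feature size:

>>> import dgl
>>> import mxnet as mx
>>> from dgl.nn.mxnet.glob import Set2Set
>>> g1 = dgl.DGLGraph()
>>> g1.add_nodes(3)
>>> g2 = dgl.DGLGraph()
>>> g2.add_nodes(4)
>>> bg = dgl.batch([g1, g2])
>>> s2s = Set2Set(input_dim=5, n_iters=2, n_layers=1)
>>> s2s.initialize()
>>> res = s2s(bg, mx.nd.ones((7, 5)))    # output shape: (2, 10)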

Utility Modules

Edge Softmax

Gluon layer for graph related softmax.

dgl.nn.mxnet.softmax.edge_softmax(graph, logits, eids='__ALL__')[source]

Compute edge softmax.

For a node \(i\), edge softmax is an operation of computing

\[a_{ij} = \frac{\exp(z_{ij})}{\sum_{j\in\mathcal{N}(i)}\exp(z_{ij})}\]

where \(z_{ij}\) is a signal of edge \(j\rightarrow i\), also called logits in the context of softmax. \(\mathcal{N}(i)\) is the set of nodes that have an edge to \(i\).

An example of using edge softmax is in Graph Attention Network where the attention weights are computed with such an edge softmax operation.

Parameters:
  • graph (DGLGraph) – The graph to perform edge softmax
  • logits (mxnet.NDArray) – The input edge feature
  • eids (mxnet.NDArray or ALL, optional) – Edges on which to apply edge softmax. If ALL, apply edge softmax on all edges in the graph. Default: ALL.
Returns:

Softmax value

Return type:

Tensor

Notes

  • Input shape: \((E, *, 1)\) where * means any number of additional dimensions, \(E\) equals the length of eids. If eids is ALL, \(E\) equals number of edges in the graph.
  • Return shape: \((E, *, 1)\)

Examples

>>> from dgl.nn.mxnet.softmax import edge_softmax
>>> import dgl
>>> from mxnet import nd

Create a DGLGraph object and initialize its edge features.

>>> g = dgl.DGLGraph()
>>> g.add_nodes(3)
>>> g.add_edges([0, 0, 0, 1, 1, 2], [0, 1, 2, 1, 2, 2])
>>> edata = nd.ones((6, 1))
>>> edata
[[1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]]
<NDArray 6x1 @cpu(0)>

Apply edge softmax on g:

>>> edge_softmax(g, edata)
[[1.        ]
 [0.5       ]
 [0.33333334]
 [0.5       ]
 [0.33333334]
 [0.33333334]]
<NDArray 6x1 @cpu(0)>

Apply edge softmax on first 4 edges of g:

>>> edge_softmax(g, edata, nd.array([0,1,2,3], dtype='int64'))
[[1. ]
 [0.5]
 [1. ]
 [0.5]]
<NDArray 4x1 @cpu(0)>