class dgl.nn.pytorch.conv.PNAConv(in_size, out_size, aggregators, scalers, delta, dropout=0.0, num_towers=1, edge_feat_size=0, residual=True)

Bases: torch.nn.modules.module.Module

Principal Neighbourhood Aggregation (PNA) layer from the paper "Principal Neighbourhood Aggregation for Graph Nets".

A PNA layer is composed of multiple PNA towers. Each tower takes a split of the input features as its input and performs message passing as follows.

\[h_i^{l+1} = U\left(h_i^l, \oplus_{(i,j)\in E} M(h_i^l, e_{i,j}, h_j^l)\right)\]

where \(h_i\) and \(e_{i,j}\) are node features and edge features, respectively. \(M\) and \(U\) are MLPs that take the concatenation of their inputs and compute output features. \(\oplus\) represents the combination of several aggregators and scalers: each aggregator reduces the messages from a node's neighbours, each scaler rescales the aggregated messages in a different way, and \(\oplus\) concatenates the output features of every aggregator-scaler combination.
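
The combination \(\oplus\) described above can be illustrated on scalar messages with plain Python. This is only a sketch under stated assumptions, not the layer's implementation: the helper `combine`, the lookup tables, and the ordering of the concatenated outputs (scalers outermost) are all hypothetical, and only a subset of the documented aggregators is covered.

```python
import math
from statistics import fmean

# Scalar stand-ins for a subset of the documented aggregators (hypothetical table).
AGGREGATORS = {"mean": fmean, "max": max, "min": min, "sum": sum}

# The three documented scalers; d is the node degree, delta the normalization factor.
SCALERS = {
    "identity": lambda x, d, delta: x,
    "amplification": lambda x, d, delta: x * math.log(d + 1) / delta,
    "attenuation": lambda x, d, delta: x * delta / math.log(d + 1),
}


def combine(messages, delta, aggregators, scalers):
    """Return one value per (scaler, aggregator) pair, concatenated into a list.

    Assumption: the degree equals the number of incoming messages, and the
    output is ordered with scalers as the outer loop.
    """
    degree = len(messages)
    aggregated = [AGGREGATORS[name](messages) for name in aggregators]
    return [SCALERS[s](x, degree, delta) for s in scalers for x in aggregated]


# Three incoming messages, three aggregators, two scalers: 3 * 2 = 6 outputs.
out = combine(
    [1.0, 2.0, 6.0],
    delta=2.5,
    aggregators=["mean", "max", "sum"],
    scalers=["identity", "amplification"],
)
```

With three aggregators and two scalers, the combined result is six times the size of a single aggregated message, which is why the number of aggregators and scalers affects the input width of \(U\).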

The outputs of the towers are concatenated and fed into a linear mixing layer to produce the final output.
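
The split-and-concatenate pattern around the towers can be sketched as follows. The helper `tower_slices` is hypothetical, and the sketch assumes the input features are divided into contiguous, equal-width column ranges; the towers themselves and the mixing layer are left out.

```python
def tower_slices(in_size, num_towers):
    """Return (start, stop) column ranges, one per tower.

    Assumption: features are split into contiguous, equal-width chunks,
    which requires in_size to be divisible by num_towers.
    """
    if in_size % num_towers != 0:
        raise ValueError("in_size must be divisible by num_towers")
    width = in_size // num_towers
    return [(t * width, (t + 1) * width) for t in range(num_towers)]


# One node's feature vector of size 10, split across 2 towers.
features = list(range(10))
slices = tower_slices(len(features), num_towers=2)
tower_inputs = [features[start:stop] for start, stop in slices]

# Each tower would transform its own chunk; here the chunks are passed
# through unchanged and concatenated again to show the data flow.
concatenated = [x for chunk in tower_inputs for x in chunk]
```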

  • in_size (int) – Input feature size; i.e. the size of \(h_i^l\).

  • out_size (int) – Output feature size; i.e. the size of \(h_i^{l+1}\).

  • aggregators (list of str) –

    List of aggregation function names (each aggregator specifies a way to aggregate messages from neighbours), selected from:

    • mean: the mean of neighbour messages

    • max: the maximum of neighbour messages

    • min: the minimum of neighbour messages

    • std: the standard deviation of neighbour messages

    • var: the variance of neighbour messages

    • sum: the sum of neighbour messages

    • moment3, moment4, moment5: the normalized moments aggregation


  • scalers (list of str) –

    List of scaler function names, selected from:

    • identity: no scaling

    • amplification: multiply the aggregated message by \(\log(d+1)/\delta\), where \(d\) is the degree of the node.

    • attenuation: multiply the aggregated message by \(\delta/\log(d+1)\)

  • delta (float) – The degree-related normalization factor used by the scalers. It is computed over the training set as \(E[\log(d+1)]\), where \(d\) is the degree of each node in the training set.

  • dropout (float, optional) – The dropout ratio. Default: 0.0.

  • num_towers (int, optional) – The number of towers used. Default: 1. Note that in_size and out_size must be divisible by num_towers.

  • edge_feat_size (int, optional) – The edge feature size. Default: 0.

  • residual (bool, optional) – Whether to add a residual connection to the output. Default: True. If in_size and out_size of the PNA conv layer differ, this flag is forced to False.
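
The delta parameter and the scalers listed above can be worked through by hand. The following is a minimal sketch in plain Python, assuming that in-degrees are the relevant degrees and that a single small graph stands in for the training set; the helpers `compute_delta` and `scale` are hypothetical names, not part of the library.

```python
import math


def compute_delta(degrees):
    """Mean of log(d + 1) over the given node degrees, i.e. E[log(d + 1)]."""
    return sum(math.log(d + 1) for d in degrees) / len(degrees)


def scale(value, degree, delta, scaler):
    """Apply one of the documented scalers to an aggregated value."""
    if scaler == "identity":
        return value
    if scaler == "amplification":
        return value * math.log(degree + 1) / delta
    if scaler == "attenuation":
        # log(degree + 1) is zero for degree 0, so degree must be at least 1 here.
        return value * delta / math.log(degree + 1)
    raise ValueError(f"unknown scaler: {scaler}")


# Destination nodes of a six-node graph with edges
# (0->1, 1->2, 2->3, 3->4, 2->0, 5->3); counting them gives the in-degrees.
dst = [1, 2, 3, 4, 0, 3]
num_nodes = 6
in_degrees = [0] * num_nodes
for v in dst:
    in_degrees[v] += 1

delta = compute_delta(in_degrees)

# A node of degree 2 whose aggregated message is 5.0.
amplified = scale(5.0, 2, delta, "amplification")
attenuated = scale(5.0, 2, delta, "attenuation")
```

A value computed this way over the real training graphs is what would be passed as the delta argument. Amplification and attenuation are inverse operations for a fixed degree, so applying one after the other recovers the original value.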


>>> import dgl
>>> import torch as th
>>> from dgl.nn import PNAConv
>>> g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
>>> feat = th.ones(6, 10)
>>> conv = PNAConv(10, 10, ['mean', 'max', 'sum'], ['identity', 'amplification'], 2.5)
>>> ret = conv(g, feat)

forward(graph, node_feat, edge_feat=None)

Compute PNA layer.

  • graph (DGLGraph) – The graph.

  • node_feat (torch.Tensor) – The input feature of shape \((N, h_n)\). \(N\) is the number of nodes, and \(h_n\) must be the same as in_size.

  • edge_feat (torch.Tensor, optional) – The edge feature of shape \((M, h_e)\). \(M\) is the number of edges, and \(h_e\) must be the same as edge_feat_size.


The output node feature of shape \((N, h_n')\), where \(h_n'\) is the same as out_size.

Return type