class PathEncoder(max_len, feat_dim, num_heads=1)

Bases: torch.nn.modules.module.Module

Path Encoder, as introduced in the Edge Encoding of "Do Transformers Really Perform Bad for Graph Representation?"

This module learns path embeddings and encodes the shortest path between each pair of nodes as an attention bias.

  • max_len (int) – Maximum number of edges in each path to be encoded. Longer paths are truncated, i.e. edges whose index along the path is no less than max_len are dropped.

  • feat_dim (int) – Dimension of edge features in the input graph.

  • num_heads (int, optional) – Number of attention heads, for use with multi-head attention. Default: 1.
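How the three parameters interact can be sketched in plain Python for a single node pair, following the edge-encoding idea of the cited paper: each kept edge feature is dotted with a learnable weight vector for its path position and head, and the results are averaged over the kept edges. The function `path_bias`, the nested-list layout of `weights`, and the zero bias for an empty path are assumptions made for illustration, not DGL's implementation.

```python
def path_bias(path_feats, weights, max_len):
    """Sketch of an edge-encoding bias for ONE node pair.

    path_feats: edge features along the shortest path, each of length feat_dim.
    weights:    hypothetical learnable table; weights[n][h] is the feat_dim-long
                vector for path position n and attention head h.
    Returns one bias value per head.
    """
    kept = path_feats[:max_len]          # edges with index >= max_len are dropped
    num_heads = len(weights[0])
    if not kept:                         # assumption: no edges means zero bias
        return [0.0] * num_heads
    bias = []
    for h in range(num_heads):
        total = 0.0
        for n, feat in enumerate(kept):
            # dot product of the n-th edge feature with its position/head weight
            total += sum(x * w for x, w in zip(feat, weights[n][h]))
        bias.append(total / len(kept))   # average over the kept edges
    return bias


# Toy numbers: feat_dim=2, num_heads=2, max_len=2, and a 3-edge path.
path_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],   # position 0: head 0, head 1
    [[1.0, 1.0], [0.5, 0.5]],   # position 1: head 0, head 1
]
print(path_bias(path_feats, weights, max_len=2))  # [4.0, 2.75]
```

The third edge never contributes because max_len is 2, and the output has one entry per head, which is why the module's result carries a trailing num_heads dimension.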


>>> import torch as th
>>> import dgl
>>> from dgl.nn import PathEncoder
>>> u = th.tensor([0, 0, 0, 1, 1, 2, 3, 3])
>>> v = th.tensor([1, 2, 3, 0, 3, 0, 0, 1])
>>> g = dgl.graph((u, v))
>>> edata = th.rand(8, 16)
>>> path_encoder = PathEncoder(2, 16, num_heads=8)
>>> out = path_encoder(g, edata)
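
To see which edge sequences are being encoded for the example graph above, the shortest paths can be listed with a breadth-first search. This is a stdlib-only sketch: `shortest_path_edges` is a hypothetical helper, not part of DGL, and when several shortest paths tie, DGL may pick a different one than this search does.

```python
from collections import deque


def shortest_path_edges(num_nodes, src, dst):
    """Return {(s, t): [edge ids along one shortest path from s to t]}."""
    out_edges = {n: [] for n in range(num_nodes)}
    for eid, (s, t) in enumerate(zip(src, dst)):
        out_edges[s].append((eid, t))
    paths = {}
    for start in range(num_nodes):
        paths[(start, start)] = []       # a node reaches itself with no edges
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for eid, nxt in out_edges[node]:
                if (start, nxt) not in paths:
                    # first visit in BFS order gives a shortest path
                    paths[(start, nxt)] = paths[(start, node)] + [eid]
                    queue.append(nxt)
    return paths


# Same edge lists as the example graph (edge ids follow list order).
u = [0, 0, 0, 1, 1, 2, 3, 3]
v = [1, 2, 3, 0, 3, 0, 0, 1]
paths = shortest_path_edges(4, u, v)
print(paths[(0, 3)])  # [2]
print(paths[(2, 1)])  # [5, 0]
print(max(len(p) for p in paths.values()))  # 2
```

No shortest path in this graph has more than two edges, so with max_len set to 2 nothing is truncated.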

forward(g, edge_feat)
  • g (DGLGraph) – The DGLGraph to be encoded; it must be homogeneous.

  • edge_feat (torch.Tensor) – The input edge features, of shape \((E, d)\), where \(E\) is the number of edges in the input graph and \(d\) is feat_dim.


Return attention bias as path encoding, of shape \((B, N, N, H)\), where \(B\) is the batch size of the input graph, \(N\) is the maximum number of nodes, and \(H\) is num_heads.
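
The shape rule can be sketched with a small helper. It assumes that \(N\) is the largest node count among the graphs in the batch, with smaller graphs padded up to it; that reading, and the helper `expected_bias_shape` itself, are illustrative assumptions rather than part of the DGL API.

```python
def expected_bias_shape(nodes_per_graph, num_heads):
    """Shape (B, N, N, H) expected for a batch, under the padding assumption."""
    batch_size = len(nodes_per_graph)    # B: number of graphs in the batch
    max_nodes = max(nodes_per_graph)     # N: assumed to be the largest graph
    return (batch_size, max_nodes, max_nodes, num_heads)


# A single 4-node graph with 8 heads, as in the example above.
print(expected_bias_shape([4], 8))        # (1, 4, 4, 8)
# A hypothetical batch of three graphs with 4, 6 and 3 nodes.
print(expected_bias_shape([4, 6, 3], 8))  # (3, 6, 6, 8)
```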

Return type