BatchedDGLHeteroGraph – Enable batched graph operations for heterographs

class dgl.BatchedDGLHeteroGraph(graph_list, node_attrs, edge_attrs)

Class for batched DGLHeteroGraphs.

A BatchedDGLHeteroGraph merges a list of small graphs into one large graph, so that message passing and readout can be performed over the whole batch at once.

For each node or edge type, the nodes or edges receive new ids in the batched graph according to the rule below:

item      Graph 1       Graph 2                     …    Graph k
raw id    0, …, N1      0, …, N2                    …    …, Nk
new id    0, …, N1      N1 + 1, …, N1 + N2 + 1      …    …, N1 + … + Nk + k - 1
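
The rule above amounts to shifting each graph's raw ids by the number of nodes (or edges) in the graphs that precede it. Since raw ids run from 0 to Ni, graph i holds Ni + 1 items. The following is a minimal pure-Python sketch of that arithmetic; `batched_ids` is a hypothetical helper written for illustration and is not part of DGL.

```python
from itertools import accumulate

def batched_ids(num_items_per_graph):
    """Return, for each graph, the new ids its nodes (or edges) get in the batch.

    num_items_per_graph[i] is the item count of graph i for a single type.
    """
    # Offset of graph i = total number of items in graphs 0..i-1.
    offsets = [0] + list(accumulate(num_items_per_graph))[:-1]
    return [list(range(off, off + n))
            for off, n in zip(offsets, num_items_per_graph)]

# Two graphs with 2 and 1 'user' nodes, as in Example 1 below.
ids = batched_ids([2, 1])
print(ids)  # [[0, 1], [2]]
```

The second graph's only node gets id 2, which matches the `tensor([0, 1, 2])` returned by `bg.nodes('user')` in Example 1.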

Modifying the features of a BatchedDGLHeteroGraph has no effect on the original graphs. See the examples below for a workaround.

Parameters:
  • graph_list (iterable) – A collection of DGLHeteroGraph to be batched.
  • node_attrs (None or dict) – The node attributes to be batched. If None, the resulting graph will have no node features. If a dict, it maps each node type name (str) to a feature name (str) or an iterable of feature names to be batched for that type. By default, all features of all node types are batched.
  • edge_attrs (None or dict) – The edge attributes to be batched, specified in the same way as node_attrs. Only canonical edge types are allowed as keys, to avoid ambiguity.

Examples

>>> import dgl
>>> import torch as th

Example 1

We start with a simple example.

>>> # Create the first graph and set features for nodes of type 'user'
>>> g1 = dgl.heterograph({('user', 'plays', 'game'): [(0, 0), (1, 0)]})
>>> g1.nodes['user'].data['h1'] = th.tensor([[0.], [1.]])
>>> # Create the second graph and set features for nodes of type 'user'
>>> g2 = dgl.heterograph({('user', 'plays', 'game'): [(0, 0)]})
>>> g2.nodes['user'].data['h1'] = th.tensor([[0.]])
>>> # Batch the graphs
>>> bg = dgl.batch_hetero([g1, g2])

With the batching operation, the nodes and edges are re-indexed.

>>> bg.nodes('user')
tensor([0, 1, 2])

By default, we also copy and concatenate all the node and edge features.

>>> bg.nodes['user'].data['h1']
tensor([[0.],
        [1.],
        [0.]])

Example 2

We will now look at a more complex example and the various operations one can perform on a batched graph.

>>> g1 = dgl.heterograph({
...    ('user', 'follows', 'user'): [(0, 1), (1, 2)],
...    ('user', 'plays', 'game'): [(0, 0), (1, 0)]
... })
>>> g1.nodes['user'].data['h1'] = th.tensor([[0.], [1.], [2.]])
>>> g1.nodes['user'].data['h2'] = th.tensor([[3.], [4.], [5.]])
>>> g1.nodes['game'].data['h1'] = th.tensor([[0.]])
>>> g1.edges['plays'].data['h1'] = th.tensor([[0.], [1.]])
>>> g2 = dgl.heterograph({
...    ('user', 'follows', 'user'): [(0, 1), (1, 2)],
...    ('user', 'plays', 'game'): [(0, 0), (1, 0)]
... })
>>> g2.nodes['user'].data['h1'] = th.tensor([[0.], [1.], [2.]])
>>> g2.nodes['user'].data['h2'] = th.tensor([[3.], [4.], [5.]])
>>> g2.nodes['game'].data['h1'] = th.tensor([[0.]])
>>> g2.edges['plays'].data['h1'] = th.tensor([[0.], [1.]])

Merge two DGLHeteroGraph objects into one BatchedDGLHeteroGraph object. When merging a list of graphs, we can choose to include only a subset of the attributes.

>>> # For edge types, only canonical edge types are allowed to avoid ambiguity.
>>> bg = dgl.batch_hetero([g1, g2], node_attrs={'user': ['h1', 'h2'], 'game': None},
...                       edge_attrs={('user', 'plays', 'game'): 'h1'})
>>> list(bg.nodes['user'].data.keys())
['h1', 'h2']
>>> list(bg.nodes['game'].data.keys())
[]
>>> list(bg.edges['follows'].data.keys())
[]
>>> list(bg.edges['plays'].data.keys())
['h1']
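
The values passed in `node_attrs` and `edge_attrs` above take three forms: None, a single name, or a list of names. The sketch below shows one way to read them, in plain Python without DGL. `normalize_attr_names` is a hypothetical helper, not a DGL function, and it does not model the default of batching all features.

```python
def normalize_attr_names(value):
    """Turn one value of a node_attrs/edge_attrs dict into a list of feature names."""
    if value is None:
        return []            # no features requested for this type
    if isinstance(value, str):
        return [value]       # a single feature name
    return list(value)       # an iterable of feature names

# The same specifications used in the batch_hetero call above.
node_attrs = {'user': ['h1', 'h2'], 'game': None}
edge_attrs = {('user', 'plays', 'game'): 'h1'}

selected_nodes = {k: normalize_attr_names(v) for k, v in node_attrs.items()}
selected_edges = {k: normalize_attr_names(v) for k, v in edge_attrs.items()}
print(selected_nodes)  # {'user': ['h1', 'h2'], 'game': []}
print(selected_edges)  # {('user', 'plays', 'game'): ['h1']}
```

The resulting name lists agree with the keys reported for `bg.nodes['user']`, `bg.nodes['game']` and `bg.edges['plays']` in the output above.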

We can get a brief summary of the graphs that constitute the batched graph.

>>> bg.batch_size
2
>>> bg.batch_num_nodes('user')
[3, 3]
>>> bg.batch_num_edges(('user', 'plays', 'game'))
[2, 2]
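
These per-graph counts are what make a per-graph readout possible: because the nodes of each graph occupy a contiguous id range, the counts tell us where one graph's features end and the next begin. The sketch below illustrates the idea with plain Python lists rather than tensors. `segment_sum` is a hypothetical helper for illustration, not a DGL API.

```python
def segment_sum(values, counts):
    """Sum `values` separately over consecutive segments of the given lengths."""
    if sum(counts) != len(values):
        raise ValueError("counts must add up to the number of values")
    sums, start = [], 0
    for n in counts:
        sums.append(sum(values[start:start + n]))
        start += n
    return sums

# 'h1' of the six 'user' nodes in bg (three per graph), flattened to scalars.
h1 = [0., 1., 2., 0., 1., 2.]
readout = segment_sum(h1, [3, 3])  # counts as returned by bg.batch_num_nodes('user')
print(readout)  # [3.0, 3.0]
```

Each entry of the result is the readout for one graph in the batch.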

Updating the attributes of the batched graph has no effect on the original graphs.

>>> bg.nodes['game'].data['h1'] = th.tensor([[1.], [1.]])
>>> g2.nodes['game'].data['h1']
tensor([[0.]])

Instead, we can decompose the batched graph back into a list of graphs and use them to replace the original graphs.

>>> g3, g4 = dgl.unbatch_hetero(bg) # returns a list of DGLHeteroGraph objects
>>> g4.nodes['game'].data['h1']
tensor([[1.]])

Merge and decompose

  • batch_hetero(graph_list[, node_attrs, …]) – Batch a collection of DGLHeteroGraph and return a BatchedDGLHeteroGraph object that is independent of the graph_list.
  • unbatch_hetero(graph) – Return the list of heterographs in this batch.

Query batch summary

  • BatchedDGLHeteroGraph.batch_size – Number of graphs in this batch.
  • BatchedDGLHeteroGraph.batch_num_nodes([ntype]) – Return the numbers of nodes of the given type for all heterographs in the batch.
  • BatchedDGLHeteroGraph.batch_num_edges([etype]) – Return the numbers of edges of the given type for all heterographs in the batch.