dgl.heterograph

dgl.heterograph(data_dict, num_nodes_dict=None, idtype=None, device=None)
Create a heterogeneous graph and return it.
 Parameters
data_dict (graph data) –
The dictionary data for constructing a heterogeneous graph. The keys are in the form of string triplets (src_type, edge_type, dst_type), specifying the source node, edge, and destination node types. The values are graph data in the form of \((U, V)\), where \((U[i], V[i])\) forms the edge with ID \(i\). The allowed graph data formats are:
(Tensor, Tensor): Each tensor must be a 1D tensor containing node IDs. DGL calls this format “tuple of node-tensors”. The tensors should have the same data type, which must be either int32 or int64. They should also have the same device context (see the descriptions of idtype and device below).
('coo', (Tensor, Tensor)): Same as (Tensor, Tensor).
('csr', (Tensor, Tensor, Tensor)): The three tensors form the CSR representation of the graph’s adjacency matrix. The first one is the row index pointer. The second one is the column indices. The third one is the edge IDs, which can be empty (i.e. with 0 elements) to represent consecutive integer IDs starting from 0.
('csc', (Tensor, Tensor, Tensor)): The three tensors form the CSC representation of the graph’s adjacency matrix. The first one is the column index pointer. The second one is the row indices. The third one is the edge IDs, which can be empty to represent consecutive integer IDs starting from 0.
The tensors can be replaced with any iterable of integers (e.g. list, tuple, numpy.ndarray).
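To make the relationship between the \((U, V)\) form and the CSR triple concrete, the following is a minimal pure-Python sketch. It is not DGL's implementation; the helper names coo_to_csr and csr_to_coo are hypothetical and exist only for illustration. It builds the row index pointer, column indices, and edge IDs from an edge list, then expands them back.

```python
def coo_to_csr(src, dst, num_src_nodes):
    """Hypothetical helper: turn edges (src[i], dst[i]) into (indptr, indices, edge_ids)."""
    # Count the out-degree of every source node (one row of the adjacency matrix each).
    indptr = [0] * (num_src_nodes + 1)
    for u in src:
        indptr[u + 1] += 1
    # Prefix-sum the counts so that indptr[u]:indptr[u + 1] spans row u.
    for u in range(num_src_nodes):
        indptr[u + 1] += indptr[u]
    # Scatter column indices and the original edge IDs into their rows.
    next_slot = indptr[:-1].copy()
    indices = [0] * len(src)
    edge_ids = [0] * len(src)
    for eid, (u, v) in enumerate(zip(src, dst)):
        slot = next_slot[u]
        indices[slot] = v
        edge_ids[slot] = eid
        next_slot[u] += 1
    return indptr, indices, edge_ids


def csr_to_coo(indptr, indices, edge_ids):
    """Hypothetical helper: recover (U, V) so that (U[i], V[i]) is the edge with ID i."""
    # An empty edge-ID list stands for consecutive IDs 0, 1, 2, ...
    if len(edge_ids) == 0:
        edge_ids = list(range(len(indices)))
    src = [0] * len(indices)
    dst = [0] * len(indices)
    for u in range(len(indptr) - 1):
        for slot in range(indptr[u], indptr[u + 1]):
            src[edge_ids[slot]] = u
            dst[edge_ids[slot]] = indices[slot]
    return src, dst


src, dst = [2, 0, 2, 1], [1, 2, 0, 0]
indptr, indices, edge_ids = coo_to_csr(src, dst, num_src_nodes=3)
round_trip = csr_to_coo(indptr, indices, edge_ids)
# With empty edge IDs, edges are numbered in row-major CSR order instead.
renumbered = csr_to_coo(indptr, indices, [])
```

A triple produced this way could be passed as ('csr', (indptr, indices, edge_ids)). The CSC case is the same construction with the roles of source and destination swapped.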
num_nodes_dict (dict[str, int], optional) – The number of nodes for some node types, which is a dictionary mapping a node type \(T\) to the number of \(T\)-typed nodes. If not given for a node type \(T\), DGL finds the largest ID appearing in every graph data whose source or destination node type is \(T\), and sets the number of nodes to be that ID plus one. If given and the value is no greater than the largest ID for some node type, DGL will raise an error. By default, DGL infers the number of nodes for all node types.
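The inference rule for node counts can be sketched in plain Python as follows. This is an illustration of the rule as described, not DGL's code: infer_num_nodes is a hypothetical helper, it only handles \((U, V)\) pairs given as Python sequences, and it ignores node types that never appear in data_dict.

```python
def infer_num_nodes(data_dict, num_nodes_dict=None):
    """Hypothetical helper mirroring the documented rule for node counts."""
    given = dict(num_nodes_dict or {})
    # Largest ID seen per node type, over every relation that uses the type
    # as either its source or its destination.
    largest = {}
    for (src_type, _etype, dst_type), (src_ids, dst_ids) in data_dict.items():
        for ntype, ids in ((src_type, src_ids), (dst_type, dst_ids)):
            largest.setdefault(ntype, -1)
            for node_id in ids:
                largest[ntype] = max(largest[ntype], node_id)
    result = {}
    for ntype, max_id in largest.items():
        if ntype in given:
            # A given count must be strictly greater than the largest ID.
            if given[ntype] <= max_id:
                raise ValueError(
                    f"num_nodes_dict[{ntype!r}]={given[ntype]} is too small "
                    f"for node ID {max_id}"
                )
            result[ntype] = given[ntype]
        else:
            # Not given: the largest ID plus one.
            result[ntype] = max_id + 1
    return result


# Plain-list version of a three-relation graph (an assumption for illustration).
data_dict = {
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'follows', 'topic'): ([1, 1], [1, 2]),
    ('user', 'plays', 'game'): ([0, 3], [3, 4]),
}
inferred = infer_num_nodes(data_dict)
explicit = infer_num_nodes(data_dict, {'user': 4, 'topic': 4, 'game': 6})
```

Here 'user' gets 4 nodes because its largest ID, 3, appears as a source in the 'plays' relation, even though the 'follows' relations only reach ID 2.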
idtype (int32 or int64, optional) – The data type for storing the structure-related graph information such as node and edge IDs. It should be a framework-specific data type object (e.g., torch.int32). If None (default), DGL infers the ID type from the data_dict argument.
device (device context, optional) – The device of the returned graph, which should be a framework-specific device object (e.g., torch.device). If None (default), DGL uses the device of the tensors in the data_dict argument. If the graph data in data_dict is not a tuple of node-tensors, the returned graph is on CPU. If the specified device differs from that of the provided tensors, DGL casts the given tensors to the specified device first.
 Returns
The created graph.
 Return type
DGLGraph
Notes
If the idtype argument is not given then:
in the case of the tuple of node-tensor format, DGL uses the data type of the given ID tensors.
in the case of the tuple of sequence format, DGL uses int64.
Once the graph has been created, you can change the data type by using dgl.DGLGraph.long() or dgl.DGLGraph.int().
If the specified idtype argument differs from the data type of the provided tensors, DGL casts the given tensors to the specified data type first.
The most efficient construction approach is to provide a tuple of node-tensors without specifying idtype and device. This is because the returned graph shares the storage with the input node-tensors in this case.
DGL internally maintains multiple copies of the graph structure in different sparse formats and chooses the most efficient one depending on the computation invoked. If memory usage becomes an issue in the case of large graphs, use dgl.DGLGraph.formats() to restrict the allowed formats.
DGL internally decides a deterministic order for the same set of node types and canonical edge types, which does not necessarily follow the order in data_dict.
Examples
The following examples use the PyTorch backend.
>>> import dgl
>>> import torch
Create a heterograph with three canonical edge types.
>>> data_dict = {
...     ('user', 'follows', 'user'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
...     ('user', 'follows', 'topic'): (torch.tensor([1, 1]), torch.tensor([1, 2])),
...     ('user', 'plays', 'game'): (torch.tensor([0, 3]), torch.tensor([3, 4]))
... }
>>> g = dgl.heterograph(data_dict)
>>> g
Graph(num_nodes={'game': 5, 'topic': 3, 'user': 4},
      num_edges={('user', 'follows', 'user'): 2, ('user', 'follows', 'topic'): 2, ('user', 'plays', 'game'): 2},
      metagraph=[('user', 'user', 'follows'), ('user', 'topic', 'follows'), ('user', 'game', 'plays')])
Explicitly specify the number of nodes for each node type in the graph.
>>> num_nodes_dict = {'user': 4, 'topic': 4, 'game': 6}
>>> g = dgl.heterograph(data_dict, num_nodes_dict=num_nodes_dict)
Create a graph on the first GPU with data type int32.
>>> g = dgl.heterograph(data_dict, idtype=torch.int32, device='cuda:0')