dgl.graph

dgl.graph(data, *, num_nodes=None, idtype=None, device=None, row_sorted=False, col_sorted=False)

Create a graph and return it.

Parameters
  • data (graph data) –

    The data for constructing a graph, which takes the form of \((U, V)\). \((U[i], V[i])\) forms the edge with ID \(i\) in the graph. The allowed data formats are:

    • (Tensor, Tensor): Each tensor must be a 1D tensor containing node IDs. DGL calls this format “tuple of node-tensors”. Both tensors must have the same data type (int32 or int64) and the same device context (see the descriptions of idtype and device below).

    • ('coo', (Tensor, Tensor)): Same as (Tensor, Tensor).

    • ('csr', (Tensor, Tensor, Tensor)): The three tensors form the CSR representation of the graph’s adjacency matrix. The first one is the row index pointer. The second one is the column indices. The third one is the edge IDs, which can be empty to represent consecutive integer IDs starting from 0.

    • ('csc', (Tensor, Tensor, Tensor)): The three tensors form the CSC representation of the graph’s adjacency matrix. The first one is the column index pointer. The second one is the row indices. The third one is the edge IDs, which can be empty to represent consecutive integer IDs starting from 0.

    The tensors can be replaced with any iterable of integers (e.g. list, tuple, numpy.ndarray).

  • num_nodes (int, optional) – The number of nodes in the graph. If not given, DGL uses the largest node ID in the data argument plus 1. If given, the value must be greater than the largest node ID in the data argument; otherwise, DGL raises an error.

  • idtype (int32 or int64, optional) – The data type for storing the structure-related graph information such as node and edge IDs. It should be a framework-specific data type object (e.g., torch.int32). If None (default), DGL infers the ID type from the data argument. See “Notes” for more details.

  • device (device context, optional) – The device of the returned graph, which should be a framework-specific device object (e.g., torch.device). If None (default), DGL uses the device of the tensors of the data argument. If data is not a tuple of node-tensors, the returned graph is on CPU. If the specified device differs from that of the provided tensors, DGL copies the given tensors to the specified device first.

  • row_sorted (bool, optional) – Whether or not the rows of the COO are in ascending order.

  • col_sorted (bool, optional) – Whether or not the columns of the COO are in ascending order within each row. This only has an effect when row_sorted is True.
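
The num_nodes rule described above can be sketched in plain Python. This is only an illustration of the documented behavior, not DGL's implementation: the helper name resolve_num_nodes is hypothetical, and returning 0 for an empty edge list is an assumption that follows from the "largest node ID plus 1" rule.

```python
def resolve_num_nodes(src, dst, num_nodes=None):
    """Hypothetical helper mirroring the documented num_nodes rule (not DGL code)."""
    src, dst = list(src), list(dst)
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")
    # Largest node ID over both endpoint lists; -1 stands in for "no nodes seen"
    # (assumption: an empty edge list then yields 0 nodes).
    max_id = max(src + dst, default=-1)
    if num_nodes is None:
        # Not given: infer as the largest node ID plus 1.
        return max_id + 1
    if num_nodes <= max_id:
        # Given but no greater than the largest node ID: reject.
        raise ValueError(
            f"num_nodes={num_nodes} must be greater than the largest node ID {max_id}"
        )
    return num_nodes


# Edges (2, 1), (3, 2), (4, 3): the largest node ID is 4.
inferred = resolve_num_nodes([2, 3, 4], [1, 2, 3])
explicit = resolve_num_nodes([2, 3, 4], [1, 2, 3], num_nodes=100)
print(inferred, explicit)
```

Passing num_nodes explicitly is useful when the graph contains isolated nodes whose IDs are larger than any ID that appears in an edge.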

Returns

The created graph.

Return type

DGLGraph

Notes

  1. If the idtype argument is not given, then:

    • for the tuple of node-tensors format, DGL uses the data type of the given ID tensors.

    • for the tuple of sequences format (e.g., lists), DGL uses int64.

    Once the graph has been created, you can change the data type by using dgl.DGLGraph.long() or dgl.DGLGraph.int().

    If the specified idtype argument differs from the data type of the provided tensors, DGL casts the given tensors to the specified data type first.

  2. The most efficient construction approach is to provide a tuple of node-tensors without specifying idtype and device, because the returned graph then shares storage with the input node-tensors.

  3. DGL internally maintains multiple copies of the graph structure in different sparse formats and chooses the most efficient one depending on the computation invoked. If memory usage becomes an issue in the case of large graphs, use dgl.DGLGraph.formats() to restrict the allowed formats.

Examples

The following examples use the PyTorch backend.

>>> import dgl
>>> import torch

Create a small three-edge graph.

>>> # Source nodes for edges (2, 1), (3, 2), (4, 3)
>>> src_ids = torch.tensor([2, 3, 4])
>>> # Destination nodes for edges (2, 1), (3, 2), (4, 3)
>>> dst_ids = torch.tensor([1, 2, 3])
>>> g = dgl.graph((src_ids, dst_ids))

Explicitly specify the number of nodes in the graph.

>>> g = dgl.graph((src_ids, dst_ids), num_nodes=100)

Create a graph on the first GPU with data type int32.

>>> g = dgl.graph((src_ids, dst_ids), idtype=torch.int32, device='cuda:0')

Create a graph with CSR representation, leaving the edge IDs empty so that they default to consecutive integers starting from 0.

>>> g = dgl.graph(('csr', ([0, 0, 0, 1, 2, 3], [1, 2, 3], [])))

Create the same graph with CSR representation and edge IDs.

>>> g = dgl.graph(('csr', ([0, 0, 0, 1, 2, 3], [1, 2, 3], [0, 1, 2])))
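
To see which edges the CSR arrays above describe, the following plain-Python sketch expands a CSR triple into the \((U, V)\) form. It is an illustration rather than DGL's own code: the function name csr_to_edges is hypothetical, and it assumes that rows of the adjacency matrix index source nodes and columns index destination nodes.

```python
def csr_to_edges(indptr, indices, edge_ids=()):
    """Expand a CSR triple into (U, V) lists ordered by edge ID (illustrative only)."""
    indptr, indices, edge_ids = list(indptr), list(indices), list(edge_ids)
    if not edge_ids:
        # An empty edge-ID array means consecutive IDs starting from 0.
        edge_ids = list(range(len(indices)))
    src = [0] * len(indices)
    dst = [0] * len(indices)
    # Row r owns the slice indices[indptr[r]:indptr[r + 1]].
    for row in range(len(indptr) - 1):
        for pos in range(indptr[row], indptr[row + 1]):
            eid = edge_ids[pos]
            src[eid] = row           # assumption: rows are source nodes
            dst[eid] = indices[pos]  # assumption: columns are destination nodes
    return src, dst


u, v = csr_to_edges([0, 0, 0, 1, 2, 3], [1, 2, 3])
print(u, v)  # [2, 3, 4] [1, 2, 3]
```

Under that assumption, the CSR arrays encode the edges (2, 1), (3, 2), (4, 3) on five nodes, since the row index pointer has six entries. Passing [0, 1, 2] as the third array gives the same result, because those are the default edge IDs.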