dgl.sort_csr_by_tag¶
-
dgl.
sort_csr_by_tag
(g, tag, tag_offset_name='_TAG_OFFSET')[source]¶ Return a new graph whose CSR matrix is sorted by the given tag.
Sort the internal CSR matrix of the graph so that the adjacency list of each node , which contains the out-edges, is sorted by the tag of the out-neighbors. After sorting, edges sharing the same tag will be arranged in a consecutive range in a node’s adjacency list. Following is an example:
Consider a graph as follows:
0 -> 0, 1, 2, 3, 4 1 -> 0, 1, 2
Given node tags
[1, 1, 0, 2, 0]
, each node’s adjacency list will be sorted as follows:0 -> 2, 4, 0, 1, 3 1 -> 2, 0, 1
The function will also returns the starting offsets of the tag segments in a tensor of shape \((N, max\_tag+2)\). For node
i
, its out-edges connecting to node tagj
is stored betweentag_offsets[i][j]
~tag_offsets[i][j+1]
. Since the offsets can be viewed node data, we store it in thendata
of the returned graph. Users can specify the ndata name by thetag_pos_name
argument.Note that the function will not change the edge ID neither how the edge features are stored. The input graph must allow CSR format. The graph must be on CPU.
If the input graph is heterogenous, it must have only one edge type and two node types (i.e., source and destination node types). In this case, the provided node tags are for the destination nodes, and the tag offsets are stored in the source node data.
The sorted graph and the calculated tag offsets are needed by certain operators that consider node tags. See
sample_neighbors_biased()
for an example.- Parameters
- Returns
g_sorted – A new graph whose CSR is sorted. The node/edge features of the input graph is shallow-copied over.
g_sorted.ndata[tag_offset_name]
: Tensor of shape \((N, max\_tag + 2)\).If
g
is heterogeneous, get fromg_sorted.srcdata
.
- Return type
Examples
>>> g = dgl.graph(([0,0,0,0,0,1,1,1],[0,1,2,3,4,0,1,2])) >>> g.adjacency_matrix(scipy_fmt='csr').nonzero() (array([0, 0, 0, 0, 0, 1, 1, 1], dtype=int32), array([0, 1, 2, 3, 4, 0, 1, 2], dtype=int32)) >>> tag = torch.IntTensor([1,1,0,2,0]) >>> g_sorted = dgl.sort_csr_by_tag(g, tag) >>> g_sorted.adjacency_matrix(scipy_fmt='csr').nonzero() (array([0, 0, 0, 0, 0, 1, 1, 1], dtype=int32), array([2, 4, 0, 1, 3, 2, 0, 1], dtype=int32)) >>> g_sorted.ndata['_TAG_OFFSET'] tensor([[0, 2, 4, 5], [0, 1, 3, 3], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])