dgl.graphbolt.unique_and_compact_csc_formats(csc_formats: Tuple[Tensor, Tensor] | Dict[str, Tuple[Tensor, Tensor]], unique_dst_nodes: Tensor | Dict[str, Tensor])

Compact csc formats and return unique nodes (per type).

Parameters:

  • csc_formats (Union[CSCFormatBase, Dict[str, CSCFormatBase]]) – CSC formats representing source-destination edges.
      - If csc_formats is a CSCFormatBase: the graph is homogeneous. Its indptr and indices should be torch.Tensor objects representing the source-destination pairs in CSC format, and the IDs inside are homogeneous IDs.
      - If csc_formats is a Dict[str, CSCFormatBase]: the keys should be edge types and the values should be the node pairs in CSC format. The IDs inside are heterogeneous IDs.

  • unique_dst_nodes (torch.Tensor or Dict[str, torch.Tensor]) – The unique destination nodes of the node pairs.
      - If unique_dst_nodes is a tensor: the graph is homogeneous.
      - If unique_dst_nodes is a dictionary: the keys are node types and the values are the corresponding nodes. The IDs inside are heterogeneous IDs.
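To make the indptr/indices layout concrete, the following plain-Python sketch expands one CSC structure into explicit (source, destination) pairs. It assumes that column j of indptr belongs to the j-th entry of unique_dst_nodes, which matches the example on this page but is stated here as an assumption. The helper name csc_to_pairs is hypothetical and is not part of GraphBolt.

```python
def csc_to_pairs(indptr, indices, dst_nodes):
    """Expand a CSC structure into explicit (src, dst) pairs.

    Assumption: column j of ``indptr`` corresponds to ``dst_nodes[j]``.
    This is an illustrative helper, not a GraphBolt function.
    """
    pairs = []
    for col, dst in enumerate(dst_nodes):
        # indices[indptr[col]:indptr[col + 1]] holds the sources of this column.
        for pos in range(indptr[col], indptr[col + 1]):
            pairs.append((indices[pos], dst))
    return pairs


# Same numbers as the "n1:e1:n2" entry in the example on this page.
pairs = csc_to_pairs(indptr=[0, 2, 3], indices=[1, 2, 2], dst_nodes=[5, 6])
print(pairs)
```

Plain lists stand in for tensors so the sketch runs without torch or DGL installed.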


Returns:

The unique nodes (per type) and the compacted csc formats. “Compacted csc formats” means that the node IDs in the input node pairs are replaced with mapped node IDs, where each node type is mapped to a contiguous ID space ranging from 0 to N - 1, with N the number of unique nodes of that type.

Return type:

Tuple[unique_nodes, csc_formats]
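The ID mapping can be illustrated for the homogeneous case with a minimal plain-Python sketch. It is not the library's implementation. It assumes that the destination nodes receive the lowest compacted IDs and that the remaining source nodes are numbered in order of first appearance; the real function may order those remaining nodes differently. The name compact_homogeneous is hypothetical.

```python
def compact_homogeneous(indices, unique_dst_nodes):
    """Map node IDs onto the contiguous range 0..N-1.

    Assumptions (illustrative only): destination nodes are numbered first,
    then any additional source nodes in order of first appearance.
    """
    mapping = {}
    for node in unique_dst_nodes:
        mapping.setdefault(node, len(mapping))
    for node in indices:
        mapping.setdefault(node, len(mapping))
    # Dicts keep insertion order, so position i holds the node mapped to i.
    unique_nodes = list(mapping)
    compacted_indices = [mapping[node] for node in indices]
    return unique_nodes, compacted_indices


unique_nodes, compacted_indices = compact_homogeneous(
    indices=[7, 3, 9, 3], unique_dst_nodes=[3, 5]
)
print(unique_nodes)       # [3, 5, 7, 9]
print(compacted_indices)  # [2, 0, 3, 0]
```

Only the indices change during compaction. The indptr array describes how many sources each destination column has, so it is the same before and after.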


>>> import torch
>>> import dgl.graphbolt as gb
>>> N1 = torch.LongTensor([1, 2, 2])
>>> N2 = torch.LongTensor([5, 5, 6])
>>> unique_dst = {
...     "n1": torch.LongTensor([1, 2]),
...     "n2": torch.LongTensor([5, 6])}
>>> csc_formats = {
...     "n1:e1:n2": gb.CSCFormatBase(indptr=torch.tensor([0, 2, 3]), indices=N1),
...     "n2:e2:n1": gb.CSCFormatBase(indptr=torch.tensor([0, 1, 3]), indices=N2)}
>>> unique_nodes, compacted_csc_formats = gb.unique_and_compact_csc_formats(
...     csc_formats, unique_dst
... )
>>> print(unique_nodes)
{'n1': tensor([1, 2]), 'n2': tensor([5, 6])}
>>> print(compacted_csc_formats)
{"n1:e1:n2": CSCFormatBase(indptr=torch.tensor([0, 2, 3]),
                           indices=torch.tensor([0, 1, 1])),
 "n2:e2:n1": CSCFormatBase(indptr=torch.tensor([0, 1, 3]),
                           indices=torch.tensor([0, 0, 1]))}
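One way to sanity-check a compacted result is to index the unique nodes with the compacted indices and confirm that the original IDs come back. The sketch below does this with plain lists, using the values from the example above. It assumes the edge-type key has the form "src_type:edge:dst_type" and that indices hold source node IDs, as the example suggests. The helper restore_indices is hypothetical.

```python
def restore_indices(unique_nodes, compacted_indices):
    """Map compacted indices back to the original node IDs, per edge type.

    Assumption: keys look like "src_type:edge:dst_type" and the indices
    refer to nodes of the source type. Illustrative helper only.
    """
    restored = {}
    for etype, idx in compacted_indices.items():
        src_type = etype.split(":")[0]
        restored[etype] = [unique_nodes[src_type][i] for i in idx]
    return restored


# Values copied from the example output above, as plain lists.
unique_nodes = {"n1": [1, 2], "n2": [5, 6]}
compacted_indices = {"n1:e1:n2": [0, 1, 1], "n2:e2:n1": [0, 0, 1]}

restored = restore_indices(unique_nodes, compacted_indices)
print(restored)
```

The restored lists equal the original N1 and N2 inputs, [1, 2, 2] and [5, 5, 6].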