dgl.distributed.partition.partition_graph

dgl.distributed.partition.partition_graph(g, graph_name, num_parts, out_path, num_hops=1, part_method='metis', reshuffle=True, balance_ntypes=None, balance_edges=False)[source]

Partition a graph for distributed training and store the partitions on files.

The partitioning occurs in three steps: 1) run a partition algorithm (e.g., Metis) to assign nodes to partitions; 2) construct partition graph structure based on the node assignment; 3) split the node features and edge features based on the partition result.

When a graph is partitioned, each partition can contain HALO nodes and edges, which are the ones that belong to other partitions but are included in this partition for integrity or efficiency concerns. In this document, local nodes/edges refers to the nodes and edges that truly belong to a partition. The rest are “HALO nodes/edges”.

The partitioned data is stored into multiple files organized as follows:

data_root_dir/
  |-- graph_name.json     # partition configuration file in JSON
  |-- node_map.npy        # partition id of each node stored in a numpy array (optional)
  |-- edge_map.npy        # partition id of each edge stored in a numpy array (optional)
  |-- part0/              # data for partition 0
      |-- node_feats.dgl  # node features stored in binary format
      |-- edge_feats.dgl  # edge features stored in binary format
      |-- graph.dgl       # graph structure of this partition stored in binary format
  |-- part1/              # data for partition 1
      |-- node_feats.dgl
      |-- edge_feats.dgl
      |-- graph.dgl

First, the metadata of the original graph and the partitioning is stored in a JSON file named after graph_name. This JSON file contains the information of the original graph as well as the path of the files that store each partition. Below show an example.

{
   "graph_name" : "test",
   "part_method" : "metis",
   "num_parts" : 2,
   "halo_hops" : 1,
   "node_map" : "data_root_dir/node_map.npy",
   "edge_map" : "data_root_dir/edge_map.npy"
   "num_nodes" : 1000000,
   "num_edges" : 52000000,
   "part-0" : {
     "node_feats" : "data_root_dir/part0/node_feats.dgl",
     "edge_feats" : "data_root_dir/part0/edge_feats.dgl",
     "part_graph" : "data_root_dir/part0/graph.dgl",
   },
   "part-1" : {
     "node_feats" : "data_root_dir/part1/node_feats.dgl",
     "edge_feats" : "data_root_dir/part1/edge_feats.dgl",
     "part_graph" : "data_root_dir/part1/graph.dgl",
   },
}

Here are the definition of the fields in the partition configuration file:

  • graph_name is the name of the graph given by a user.

  • part_method is the method used to assign nodes to partitions. Currently, it supports “random” and “metis”.

  • num_parts is the number of partitions.

  • halo_hops is the number of HALO nodes we want to include in a partition.

  • node_map is the node assignment map, which tells the partition Id a node is assigned to.

  • edge_map is the edge assignment map, which tells the partition Id an edge is assigned to.

  • num_nodes is the number of nodes in the global graph.

  • num_edges is the number of edges in the global graph.

  • part-* stores the data of a partition.

If node IDs and edge IDs are not shuffled to ensure that all nodes/edges in a partition fall into a contiguous ID range, DGL needs to store node/edge mappings (from node/edge IDs to partition IDs) in separate files (node_map.npy and edge_map.npy). The node/edge mappings are stored in numpy files.

The graph structure of a partition is stored in a file with the DGLGraph format. Nodes in each partition is relabeled to always start with zero. We call the node ID in the original graph, global ID, while the relabeled ID in each partition, local ID. Each partition graph has an integer node data tensor stored under name dgl.NID and each value is the node’s global ID. Similarly, edges are relabeled too and the mapping from local ID to global ID is stored as an integer edge data tensor under name dgl.EID.

The partition graph contains additional node data (“inner_node” and “orig_id”) and edge data (“inner_edge”):

  • “inner_node” indicates whether a node belongs to a partition.

  • “inner_edge” indicates whether an edge belongs to a partition.

  • “orig_id” exists when reshuffle=True. It indicates the original node Ids in the original

graph before reshuffling.

Node and edge features are splitted and stored together with each graph partition. All node/edge features in a partition are stored in a file with DGL format. The node/edge features are stored in dictionaries, in which the key is the node/edge data name and the value is a tensor. We do not store features of HALO nodes and edges.

When performing Metis partitioning, we can put some constraint on the partitioning. Current, it supports two constrants to balance the partitioning. By default, Metis always tries to balance the number of nodes in each partition.

  • balance_ntypes balances the number of nodes of different types in each partition.

  • balance_edges balances the number of edges in each partition.

To balance the node types, a user needs to pass a vector of N elements to indicate the type of each node. N is the number of nodes in the input graph.

Parameters
  • g (DGLGraph) – The input graph to partition

  • graph_name (str) – The name of the graph. The name will be used to construct DistGraph().

  • num_parts (int) – The number of partitions

  • out_path (str) – The path to store the files for all partitioned data.

  • num_hops (int, optional) – The number of hops of HALO nodes we construct on a partition graph structure. The default value is 1.

  • part_method (str, optional) – The partition method. It supports “random” and “metis”. The default value is “metis”.

  • reshuffle (bool, optional) – Reshuffle nodes and edges so that nodes and edges in a partition are in contiguous Id range. The default value is True

  • balance_ntypes (tensor, optional) – Node type of each node. This is a 1D-array of integers. Its values indicates the node type of each node. This argument is used by Metis partition. When the argument is specified, the Metis algorithm will try to partition the input graph into partitions where each partition has roughly the same number of nodes for each node type. The default value is None, which means Metis partitions the graph to only balance the number of nodes.

  • balance_edges (bool) – Indicate whether to balance the edges in each partition. This argument is used by the Metis algorithm.

Examples

>>> dgl.distributed.partition_graph(g, 'test', 4, num_hops=1, part_method='metis',
...                                 out_path='output/', reshuffle=True,
...                                 balance_ntypes=g.ndata['train_mask'],
...                                 balance_edges=True)
>>> g, node_feats, edge_feats, gpb, graph_name = dgl.distributed.load_partition(
...                                 'output/test.json', 0)