dgl.distributed.graph_services.sample_neighbors

dgl.distributed.graph_services.sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False)

Sample from the neighbors of the given nodes from a distributed graph.

For each node, a number of inbound (or outbound when edge_dir == 'out') edges will be randomly chosen. The returned graph will contain all the nodes in the original graph, but only the sampled edges.

Node/edge features are not preserved. The original IDs of the sampled edges are stored as the dgl.EID feature in the returned graph.

Currently, only input graphs with one node type and one edge type are supported.
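The behavior described above can be illustrated with a small single-machine sketch. It is not DGL code: the function `sample_neighbors_toy`, the edge-list representation, and the returned dict are hypothetical stand-ins chosen only to show that the node count is unchanged, that only sampled edges are kept, and that each kept edge carries its original ID.

```python
import random

def sample_neighbors_toy(src, dst, num_nodes, nodes, fanout, edge_dir="in", seed=None):
    """Toy illustration of the sampling semantics (hypothetical, not DGL code).

    Edge i goes from src[i] to dst[i]; its position i plays the role of the
    original edge ID.
    """
    if edge_dir not in ("in", "out"):
        raise ValueError("edge_dir must be 'in' or 'out'")
    rng = random.Random(seed)
    # An inbound edge is attached to its destination, an outbound edge to its source.
    anchor = dst if edge_dir == "in" else src
    sampled_eids = []
    for node in nodes:
        candidates = [eid for eid, endpoint in enumerate(anchor) if endpoint == node]
        if fanout == -1 or fanout >= len(candidates):
            chosen = candidates  # keep every neighboring edge
        else:
            chosen = rng.sample(candidates, fanout)  # uniform, without replacement
        sampled_eids.extend(chosen)
    return {
        "num_nodes": num_nodes,  # all nodes of the original graph are retained
        "src": [src[e] for e in sampled_eids],
        "dst": [dst[e] for e in sampled_eids],
        "eid": sampled_eids,     # original edge IDs, analogous to dgl.EID
    }

# Edges by ID: 0: 0->2, 1: 1->2, 2: 3->2, 3: 2->0
src = [0, 1, 3, 2]
dst = [2, 2, 2, 0]
sub = sample_neighbors_toy(src, dst, num_nodes=4, nodes=[2], fanout=2, seed=0)
```

Here `sub` holds two of the three inbound edges of node 2, while `num_nodes` stays at 4. Calling the same function with `edge_dir="out"` would consider only edge 3 for node 2.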

Parameters
  • g (DistGraph) – The distributed graph.

  • nodes (tensor or dict) – Node IDs to sample neighbors from. A dict is accepted for consistency with dgl.sampling.sample_neighbors, but it must contain exactly one key-value pair.

  • fanout (int) –

    The number of edges to be sampled for each node.

    If -1 is given, all of the neighbors will be selected.

  • edge_dir (str, optional) –

    Determines whether to sample inbound or outbound edges.

    Can be either 'in' for inbound edges or 'out' for outbound edges.

  • prob (str, optional) –

    Feature name used as the (unnormalized) probabilities associated with each neighboring edge of a node. The feature must have only one element for each edge.

    The features must be non-negative floats, and the sum of the features of inbound/outbound edges for every node must be positive (though they don’t have to sum up to one). Otherwise, the result will be undefined.

  • replace (bool, optional) –

    If True, sample with replacement.

    When sampling with replacement, the sampled subgraph could have parallel edges.

    When sampling without replacement, if fanout exceeds the number of neighbors, all the neighbors are sampled. If fanout == -1, all neighbors are collected.
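The fanout, prob, and replace rules above can be sketched for a single node in plain Python. This is a minimal illustration under stated assumptions, not DGL's implementation: `pick_edges` is a hypothetical helper, the weights are passed as a list rather than looked up by feature name, and raising `ValueError` on invalid weights is a choice of this sketch (the documentation only says the result is undefined). How zero-weight edges are treated when all neighbors are returned is likewise an assumption.

```python
import random

def pick_edges(eids, fanout, weights=None, replace=False, rng=None):
    """Choose edge IDs for one node (hypothetical helper, not part of DGL)."""
    rng = rng or random.Random()
    eids = list(eids)
    if not eids:
        return []  # a node without neighbors yields no edges
    if weights is not None:
        weights = list(weights)
        if len(weights) != len(eids):
            raise ValueError("need exactly one weight per edge")
        # Weights are unnormalized: non-negative with a positive sum.
        if any(w < 0 for w in weights) or sum(weights) <= 0:
            raise ValueError("weights must be non-negative with a positive sum")
    if fanout == -1:
        return eids  # -1 selects all neighbors
    if not replace and fanout >= len(eids):
        return eids  # cannot draw more distinct edges than exist
    if replace:
        # With replacement the same edge may appear more than once.
        return rng.choices(eids, weights=weights, k=fanout)
    if weights is None:
        return rng.sample(eids, fanout)
    # Weighted, without replacement: draw one edge at a time and remove it.
    pool = list(zip(eids, weights))
    chosen = []
    while len(chosen) < fanout and sum(w for _, w in pool) > 0:
        idx = rng.choices(range(len(pool)), weights=[w for _, w in pool], k=1)[0]
        chosen.append(pool.pop(idx)[0])
    return chosen
```

For instance, `pick_edges([10, 11, 12], 5)` returns all three IDs, whereas `pick_edges([10, 11, 12], 5, replace=True)` returns five IDs with repeats, which corresponds to parallel edges in the sampled subgraph.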

Returns

A sampled subgraph containing only the sampled neighboring edges. The returned graph resides on the CPU.

Return type

DGLGraph
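A call to this function might look like the sketch below. It assumes that `g` is an already initialized DistGraph and that `seeds` is a tensor of node IDs; setting up the distributed environment is outside the scope of this page. The wrapper name `sample_one_hop` and its default fanout are hypothetical, and DGL is imported inside the function only so that the snippet can be loaded without DGL installed.

```python
def sample_one_hop(g, seeds, fanout=10):
    """Sample up to `fanout` inbound edges per seed node (illustrative wrapper)."""
    # Lazy imports: an assumption of this sketch, not a DGL requirement.
    import dgl
    from dgl.distributed.graph_services import sample_neighbors

    # `g` is assumed to be a DistGraph; `seeds` a tensor of node IDs.
    frontier = sample_neighbors(g, seeds, fanout, edge_dir="in", replace=False)

    # The result is a DGLGraph on CPU that keeps only the sampled edges.
    src, dst = frontier.edges()
    # Original edge IDs are stored under dgl.EID; other features are not copied.
    original_eids = frontier.edata[dgl.EID]
    return src, dst, original_eids
```

Because node and edge features are not preserved, the returned `original_eids` would typically be used to look up edge features in the original distributed graph.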