dgl.distributed.sample_neighbors

dgl.distributed.sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False)

Sample from the neighbors of the given nodes from a distributed graph.

For each node, a number of inbound (or outbound when edge_dir == 'out') edges will be randomly chosen. The returned graph will contain all the nodes in the original graph, but only the sampled edges.

Node/edge features are not preserved. The original IDs of the sampled edges are stored as the dgl.EID feature in the returned graph.
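The behavior described above can be pictured with a small plain-Python sketch. It is only an illustration of the semantics, not DGL's implementation: the graph is assumed to be a list of (src, dst) pairs whose list index plays the role of the original edge ID, and the helper names incident_edge_ids and sample_subgraph_edges are hypothetical.

```python
import random


def incident_edge_ids(edges, node, edge_dir="in"):
    """Return the IDs of edges entering (or leaving) `node`.

    `edges` is a list of (src, dst) pairs; the list index is treated as
    the original edge ID (an assumption made for this sketch).
    """
    if edge_dir == "in":
        return [eid for eid, (_, dst) in enumerate(edges) if dst == node]
    if edge_dir == "out":
        return [eid for eid, (src, _) in enumerate(edges) if src == node]
    raise ValueError("edge_dir must be 'in' or 'out'")


def sample_subgraph_edges(edges, nodes, fanout, edge_dir="in", rng=None):
    """Uniformly sample up to `fanout` incident edges per seed node.

    Returns (src, dst, original_edge_id) triples, mirroring the way the
    sampled graph keeps the original edge IDs next to each sampled edge.
    """
    rng = rng or random.Random()
    sampled = []
    for node in nodes:
        candidates = incident_edge_ids(edges, node, edge_dir)
        # fanout == -1 keeps every incident edge; otherwise subsample
        # only when there are more candidates than requested.
        if fanout != -1 and fanout < len(candidates):
            candidates = rng.sample(candidates, fanout)
        sampled.extend(candidates)
    return [(edges[eid][0], edges[eid][1], eid) for eid in sampled]
```

With edges = [(0, 1), (2, 1), (1, 3), (3, 1)] and seed node 1, edge_dir='in' considers edges 0, 1 and 3, while edge_dir='out' considers only edge 2.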

For heterogeneous graphs, nodes is a dictionary that maps a node type to the IDs of the nodes of that type.

Parameters

g (DistGraph) – The distributed graph.
nodes (tensor or dict) – Node IDs to sample neighbors from. If it’s a dict, it should contain only one key-value pair to make this API consistent with dgl.sampling.sample_neighbors.
fanout (int) –
The number of edges to be sampled for each node.

If -1 is given, all of the neighbors will be selected.
edge_dir (str, optional) –
Determines whether to sample inbound or outbound edges.

Can be either 'in' for inbound edges or 'out' for outbound edges.
prob (str, optional) –
Feature name used as the (unnormalized) probabilities associated with each neighboring edge of a node. The feature must have only one element for each edge.

The features must be non-negative floats, and the sum of the features of inbound/outbound edges for every node must be positive (though they don’t have to sum up to one). Otherwise, the result will be undefined.
replace (bool, optional) –
If True, sample with replacement.

When sampling with replacement, the sampled subgraph could have parallel edges.

For sampling without replacement, if fanout is greater than the number of neighbors, all the neighbors are sampled. If fanout == -1, all neighbors are collected.
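
The interaction between fanout, prob and replace for a single node can be sketched in plain Python as follows. This is an illustrative model of the rules listed above, not the code DGL runs: the function name sample_edges_for_node is hypothetical, weights stands in for the edge feature named by prob, and stopping early when only zero-weight edges remain is an assumption of the sketch (the documentation only says the result is undefined when the weights do not sum to a positive value).

```python
import random


def sample_edges_for_node(edge_ids, fanout, weights=None, replace=False, rng=None):
    """Choose edge IDs for one node following the documented rules."""
    rng = rng or random.Random()
    edge_ids = list(edge_ids)
    # No `prob` feature: every neighboring edge is equally likely.
    weights = [1.0] * len(edge_ids) if weights is None else list(weights)
    if len(weights) != len(edge_ids):
        raise ValueError("need exactly one weight per edge")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if fanout < -1:
        raise ValueError("fanout must be -1 or a non-negative integer")
    if not edge_ids:
        return []
    # fanout == -1 selects all neighbors.
    if fanout == -1:
        return edge_ids
    if replace:
        # With replacement the same edge may be drawn repeatedly, which
        # is why the sampled subgraph can contain parallel edges.
        # Weights are unnormalized; random.choices rescales them.
        return rng.choices(edge_ids, weights=weights, k=fanout)
    # Without replacement, asking for more edges than exist returns all.
    if fanout >= len(edge_ids):
        return edge_ids
    remaining = list(zip(edge_ids, weights))
    picked = []
    for _ in range(fanout):
        current = [w for _, w in remaining]
        if sum(current) <= 0:
            break  # assumption: stop once only zero-weight edges remain
        idx = rng.choices(range(len(remaining)), weights=current, k=1)[0]
        picked.append(remaining.pop(idx)[0])
    return picked
```

For example, sample_edges_for_node([10, 11, 12], 5) returns all three edges, whereas the same call with replace=True returns five edge IDs, some of them repeated.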

Returns

A sampled subgraph containing only the sampled neighboring edges. It is on CPU.

Return type

DGLGraph
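
A call might look like the following sketch. It assumes that g is a DistGraph that has already been created inside an initialized distributed job, and that seed_nodes is a tensor of node IDs. The wrapper name sample_one_hop and its default fanout are illustrative and not part of DGL; only the sample_neighbors signature and the dgl.EID edge feature come from the description above.

```python
def sample_one_hop(g, seed_nodes, fanout=10):
    """Sample up to `fanout` inbound edges per seed node from a DistGraph.

    Hypothetical helper: `g` must be an existing dgl.distributed.DistGraph
    and `seed_nodes` a tensor of node IDs (or a single-key dict).
    """
    # Imported lazily so this sketch can be defined without DGL installed.
    import dgl

    frontier = dgl.distributed.sample_neighbors(
        g, seed_nodes, fanout, edge_dir='in', prob=None, replace=False
    )
    # Original IDs of the sampled edges, stored as the dgl.EID feature.
    original_eids = frontier.edata[dgl.EID]
    return frontier, original_eids
```

Because node and edge features are not preserved in the returned graph, the original edge IDs are what you would use to look features up in the distributed graph afterwards.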