dgl.distributed.graph_services.sample_neighbors
dgl.distributed.graph_services.sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False)
Sample from the neighbors of the given nodes from a distributed graph.
For each node, a number of inbound (or outbound when edge_dir == 'out') edges will be randomly chosen. The returned graph will contain all the nodes in the original graph, but only the sampled edges.
Node/edge features are not preserved. The original IDs of the sampled edges are stored as the dgl.EID feature in the returned graph.
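The sampling behavior described above can be pictured with a small pure-Python sketch. It is not DGL's distributed implementation: the edge-list representation and the helper name sample_in_edges are assumptions made only for illustration. It shows that the result keeps just the sampled edges and records each one's original edge ID, which is the role dgl.EID plays in the returned graph.

```python
import random

def sample_in_edges(edges, seed_nodes, fanout, rng):
    """Pick up to `fanout` inbound edges for each seed node.

    `edges` is a list of (src, dst) pairs; an edge's position in the list
    stands in for its original edge ID. Hypothetical helper, not a DGL API.
    """
    sampled = []
    for node in seed_nodes:
        # Inbound edges are those whose destination is the seed node.
        in_edge_ids = [eid for eid, (_, dst) in enumerate(edges) if dst == node]
        if fanout == -1 or fanout >= len(in_edge_ids):
            chosen = in_edge_ids  # keep every inbound edge
        else:
            chosen = rng.sample(in_edge_ids, fanout)  # without replacement
        sampled.extend(chosen)
    return sorted(sampled)

# Toy graph: node 3 has three inbound edges, node 4 has two.
edges = [(0, 3), (1, 3), (2, 3), (3, 4), (0, 4)]
rng = random.Random(0)

kept_eids = sample_in_edges(edges, seed_nodes=[3, 4], fanout=2, rng=rng)
kept_edges = [edges[eid] for eid in kept_eids]  # original IDs index the full edge list
```

Here kept_eids plays the part of the dgl.EID feature: each entry points back to an edge of the original graph, while the node set itself is left unchanged.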
This version provides experimental support for heterogeneous graphs. When the input graph is heterogeneous, the sampled subgraph is still stored in the homogeneous graph format. That is, all nodes and edges are assigned unique IDs (in contrast, a node or an edge in DGLGraph is typically identified by a type name and a node/edge ID). We refer to IDs of this kind as homogeneous IDs. Users can use dgl.distributed.GraphPartitionBook.map_to_per_ntype() and dgl.distributed.GraphPartitionBook.map_to_per_etype() to recover the node/edge types and the node/edge IDs of that type.
For heterogeneous graphs, nodes can be a dictionary whose keys are node types and whose values are type-specific node IDs; nodes can also be a tensor of homogeneous IDs.
- Parameters
g (DistGraph) – The distributed graph.
nodes (tensor or dict) – Node IDs to sample neighbors from. If it’s a dict, it should contain only one key-value pair to make this API consistent with dgl.sampling.sample_neighbors.
fanout (int) –
The number of edges to be sampled for each node.
If -1 is given, all of the neighbors will be selected.
edge_dir (str, optional) –
Determines whether to sample inbound or outbound edges.
Can take either 'in' for inbound edges or 'out' for outbound edges.
prob (str, optional) –
Feature name used as the (unnormalized) probabilities associated with each neighboring edge of a node. The feature must have only one element for each edge.
The features must be non-negative floats, and the sum of the features of inbound/outbound edges for every node must be positive (though they don’t have to sum up to one). Otherwise, the result will be undefined.
replace (bool, optional) –
If True, sample with replacement.
When sampling with replacement, the sampled subgraph could have parallel edges.
For sampling without replacement, if fanout > the number of neighbors, all the neighbors are sampled. If fanout == -1, all neighbors are collected.
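How fanout, prob, and replace interact for a single node can be sketched as follows. This is a simplified model built on Python's standard library, not DGL's sampler; the function name pick_neighbors and the list-based inputs are assumptions. The weights are unnormalized, matching the description of prob above.

```python
import random

def pick_neighbors(neighbor_eids, weights, fanout, replace, rng):
    """Choose edge IDs for one node from its candidate neighbor edges.

    Hypothetical helper for illustration; `weights` are unnormalized,
    non-negative, and assumed to have a positive sum.
    """
    if fanout == -1:
        return list(neighbor_eids)  # collect every neighbor
    if replace:
        # With replacement: `fanout` draws, so the same edge may repeat.
        return rng.choices(neighbor_eids, weights=weights, k=fanout)
    if fanout >= len(neighbor_eids):
        return list(neighbor_eids)  # fewer neighbors than requested
    # Without replacement: draw one edge at a time and remove it from the pool.
    pool, pool_weights, picked = list(neighbor_eids), list(weights), []
    for _ in range(fanout):
        idx = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        picked.append(pool.pop(idx))
        pool_weights.pop(idx)
    return picked

rng = random.Random(1)
eids = [10, 11, 12, 13]
weights = [0.0, 2.0, 1.0, 1.0]  # edge 10 has zero weight and is never drawn

without_repl = pick_neighbors(eids, weights, fanout=2, replace=False, rng=rng)
with_repl = pick_neighbors(eids, weights, fanout=6, replace=True, rng=rng)
```

In this sketch, with_repl necessarily contains repeated edge IDs (six draws from three drawable edges), which corresponds to the parallel edges mentioned for replace=True.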
- Returns
A sampled subgraph containing only the sampled neighboring edges. It is on CPU.
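For heterogeneous input, the IDs in the returned subgraph are homogeneous IDs. The sketch below only illustrates the idea of converting between a homogeneous ID and a (type, per-type ID) pair. It assumes a toy layout in which each node type occupies one contiguous ID range; the names NTYPES, TYPE_STARTS, to_per_type, and to_homogeneous are hypothetical, and the layout of a real partitioned DistGraph may differ. In real code, rely on GraphPartitionBook.map_to_per_ntype() and GraphPartitionBook.map_to_per_etype() instead.

```python
import bisect

# Toy layout (an assumption): 'user' owns IDs 0-99, 'item' owns IDs from 100 on.
NTYPES = ["user", "item"]
TYPE_STARTS = [0, 100]

def to_per_type(homo_id):
    """Map a homogeneous ID to (node type, ID within that type)."""
    # Find the last range whose start is <= homo_id.
    idx = bisect.bisect_right(TYPE_STARTS, homo_id) - 1
    return NTYPES[idx], homo_id - TYPE_STARTS[idx]

def to_homogeneous(ntype, type_id):
    """Inverse mapping: (node type, per-type ID) back to a homogeneous ID."""
    return TYPE_STARTS[NTYPES.index(ntype)] + type_id

first_item = to_per_type(100)
round_trip = to_homogeneous(*to_per_type(42))
```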
- Return type