dgl.sampling.sample_neighbors

dgl.sampling.sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False, copy_ndata=True, copy_edata=True, _dist_training=False)[source]

Sample neighboring edges of the given nodes and return the induced subgraph.

For each node, a number of inbound (or outbound when edge_dir == 'out') edges will be randomly chosen. The graph returned will then contain all the nodes in the original graph, but only the sampled edges.

Node/edge features are not preserved. The original IDs of the sampled edges are stored as the dgl.EID feature in the returned graph.

Parameters
  • g (DGLGraph) – The graph. Must be on CPU.

  • nodes (tensor or dict) –

    Node IDs to sample neighbors from.

    This argument can take a single ID tensor or a dictionary of node types and ID tensors. If a single tensor is given, the graph must only have one type of nodes.

  • fanout (int or dict[etype, int]) –

    The number of edges to be sampled for each node on each edge type.

    This argument can take a single int or a dictionary of edge types and ints. If a single int is given, DGL will sample this number of edges for each node for every edge type.

    If -1 is given for a single edge type, all the neighboring edges with that edge type will be selected.

  • edge_dir (str, optional) –

    Determines whether to sample inbound or outbound edges.

    Can take either in for inbound edges or out for outbound edges.

  • prob (str, optional) –

    Feature name used as the (unnormalized) probabilities associated with each neighboring edge of a node. The feature must have only one element for each edge.

    The features must be non-negative floats, and the sum of the features of inbound/outbound edges for every node must be positive (though they don’t have to sum up to one). Otherwise, the result will be undefined.

  • replace (bool, optional) – If True, sample with replacement.

  • copy_ndata (bool, optional) –

    If True, the node features of the new graph are copied from the original graph. If False, the new graph will not have any node features.

    (Default: True)

  • copy_edata (bool, optional) –

    If True, the edge features of the new graph are copied from the original graph. If False, the new graph will not have any edge features.

    (Default: True)

  • _dist_training (bool, optional) –

    Internal argument. Do not use.

    (Default: False)

Returns

A sampled subgraph containing only the sampled neighboring edges. It is on CPU.

Return type

DGLGraph

Notes

If copy_ndata or copy_edata is True, same tensors are used as the node or edge features of the original graph and the new graph. As a result, users should avoid performing in-place operations on the node features of the new graph to avoid feature corruption.

Examples

Assume that you have the following graph

>>> g = dgl.graph(([0, 0, 1, 1, 2, 2], [1, 2, 0, 1, 2, 0]))

And the weights

>>> g.edata['prob'] = torch.FloatTensor([0., 1., 0., 1., 0., 1.])

To sample one inbound edge for node 0 and node 1:

>>> sg = dgl.sampling.sample_neighbors(g, [0, 1], 1)
>>> sg.edges(order='eid')
(tensor([1, 0]), tensor([0, 1]))
>>> sg.edata[dgl.EID]
tensor([2, 0])

To sample one inbound edge for node 0 and node 1 with probability in edge feature prob:

>>> sg = dgl.sampling.sample_neighbors(g, [0, 1], 1, prob='prob')
>>> sg.edges(order='eid')
(tensor([2, 1]), tensor([0, 1]))

With fanout greater than the number of actual neighbors and without replacement, DGL will take all neighbors instead:

>>> sg = dgl.sampling.sample_neighbors(g, [0, 1], 3)
>>> sg.edges(order='eid')
(tensor([1, 2, 0, 1]), tensor([0, 0, 1, 1]))