dgl.sampling.random_walk¶

dgl.sampling.
random_walk
(g, nodes, *, metapath=None, length=None, prob=None, restart_prob=None)[source]¶ Generate random walk traces from an array of starting nodes based on the given metapath.
For a single starting node,
num_traces
traces would be generated. A trace wouldStart from the given node and set
t
to 0.Pick and traverse along edge type
metapath[t]
from the current node.If no edge can be found, halt. Otherwise, increment
t
and go to step 2.
The returned traces all have length
len(metapath) + 1
, where the first node is the starting node itself.If a random walk stops in advance, DGL pads the trace with 1 to have the same length.
 Parameters
g (DGLGraph) – The graph. Must be on CPU.
nodes (Tensor) –
Node ID tensor from which the random walk traces starts.
The tensor must be on CPU, and must have the same dtype as the ID type of the graph.
metapath (list[str or tuple of str], optional) –
Metapath, specified as a list of edge types.
Mutually exclusive with
length
.If omitted, DGL assumes that
g
only has one node & edge type. In this case, the argumentlength
specifies the length of random walk traces.length (int, optional) –
Length of random walks.
Mutually exclusive with
metapath
.Only used when
metapath
is None.prob (str, optional) –
The name of the edge feature tensor on the graph storing the (unnormalized) probabilities associated with each edge for choosing the next node.
The feature tensor must be nonnegative and the sum of the probabilities must be positive for the outbound edges of all nodes (although they don’t have to sum up to one). The result will be undefined otherwise.
If omitted, DGL assumes that the neighbors are picked uniformly.
restart_prob (float or Tensor, optional) –
Probability to terminate the current trace before each transition.
If a tensor is given,
restart_prob
should have the same length asmetapath
orlength
.
 Returns
traces (Tensor) – A 2dimensional node ID tensor with shape
(num_seeds, len(metapath) + 1)
or(num_seeds, length + 1)
ifmetapath
is None.types (Tensor) – A 1dimensional node type ID tensor with shape
(len(metapath) + 1)
or(length + 1)
. The type IDs match the ones in the original graphg
.
Notes
The returned tensors are on CPU.
Examples
The following creates a homogeneous graph: >>> g1 = dgl.graph(([0, 1, 1, 2, 3], [1, 2, 3, 0, 0]))
Normal random walk:
>>> dgl.sampling.random_walk(g1, [0, 1, 2, 0], length=4) (tensor([[0, 1, 2, 0, 1], [1, 3, 0, 1, 3], [2, 0, 1, 3, 0], [0, 1, 2, 0, 1]]), tensor([0, 0, 0, 0, 0]))
The first tensor indicates the random walk path for each seed node. The jth element in the second tensor indicates the node type ID of the jth node in every path. In this case, it is returning all 0.
Random walk with restart:
>>> dgl.sampling.random_walk_with_restart(g1, [0, 1, 2, 0], length=4, restart_prob=0.5) (tensor([[ 0, 1, 1, 1, 1], [ 1, 3, 0, 1, 1], [ 2, 1, 1, 1, 1], [ 0, 1, 1, 1, 1]]), tensor([0, 0, 0, 0, 0]))
Nonuniform random walk:
>>> g1.edata['p'] = torch.FloatTensor([1, 0, 1, 1, 1]) # disallow going from 1 to 2 >>> dgl.sampling.random_walk(g1, [0, 1, 2, 0], length=4, prob='p') (tensor([[0, 1, 3, 0, 1], [1, 3, 0, 1, 3], [2, 0, 1, 3, 0], [0, 1, 3, 0, 1]]), tensor([0, 0, 0, 0, 0]))
Metapathbased random walk:
>>> g2 = dgl.heterograph({ ... ('user', 'follow', 'user'): ([0, 1, 1, 2, 3], [1, 2, 3, 0, 0]), ... ('user', 'view', 'item'): ([0, 0, 1, 2, 3, 3], [0, 1, 1, 2, 2, 1]), ... ('item', 'viewedby', 'user'): ([0, 1, 1, 2, 2, 1], [0, 0, 1, 2, 3, 3]) >>> dgl.sampling.random_walk( ... g2, [0, 1, 2, 0], metapath=['follow', 'view', 'viewedby'] * 2) (tensor([[0, 1, 1, 1, 2, 2, 3], [1, 3, 1, 1, 2, 2, 2], [2, 0, 1, 1, 3, 1, 1], [0, 1, 1, 0, 1, 1, 3]]), tensor([0, 0, 1, 0, 0, 1, 0]))
Metapathbased random walk, with restarts only on items (i.e. after traversing a “view” relationship):
>>> dgl.sampling.random_walk( ... g2, [0, 1, 2, 0], metapath=['follow', 'view', 'viewedby'] * 2, ... restart_prob=torch.FloatTensor([0, 0.5, 0, 0, 0.5, 0])) (tensor([[ 0, 1, 1, 1, 1, 1, 1], [ 1, 3, 1, 0, 1, 1, 0], [ 2, 0, 1, 1, 3, 2, 2], [ 0, 1, 1, 3, 0, 0, 0]]), tensor([0, 0, 1, 0, 0, 1, 0]))