dgl.linkx_homophily

dgl.linkx_homophily(graph, y)

Homophily measure from the paper "Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods".

Mathematically it is defined as follows:

\[\frac{1}{C-1} \sum_{k=1}^{C} \max \left(0, \frac{\sum_{v\in C_k}|\{u\in \mathcal{N}(v): y_v = y_u \}|}{\sum_{v\in C_k}|\mathcal{N}(v)|} - \frac{|C_k|}{|\mathcal{V}|} \right),\]

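where \(C\) is the number of node classes, \(C_k\) is the set of nodes that belong to class \(k\), \(\mathcal{N}(v)\) is the set of predecessors of node \(v\), \(y_v\) is the class of node \(v\), and \(\mathcal{V}\) is the set of nodes. For each class, the first fraction is the share of incoming edges of that class's nodes that originate from the same class, and the second fraction is the share of all nodes that belong to the class.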
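The formula above can be sketched in plain Python to make the per-class bookkeeping explicit. This is an illustrative sketch only, not DGL's implementation: the helper name `linkx_homophily_reference` is hypothetical, labels are assumed to be integers in the range 0 to C-1 with at least two classes, and a class whose nodes have no incoming edges is assumed to contribute 0.

```python
from collections import defaultdict


def linkx_homophily_reference(src, dst, labels):
    """Evaluate the formula above on an edge list (hypothetical helper, not DGL code).

    src, dst: sequences of node IDs; edge i goes from src[i] to dst[i],
              so src[i] is a predecessor of dst[i].
    labels:   sequence of integer class labels, one per node.
    """
    num_nodes = len(labels)
    num_classes = max(labels) + 1  # assumption: labels are 0 .. C-1, with C >= 2

    size = defaultdict(int)   # |C_k|: number of nodes in class k
    total = defaultdict(int)  # incoming edges of nodes in class k
    same = defaultdict(int)   # incoming edges whose source shares class k

    for label in labels:
        size[label] += 1
    for u, v in zip(src, dst):
        total[labels[v]] += 1
        if labels[u] == labels[v]:
            same[labels[v]] += 1

    score = 0.0
    for k in range(num_classes):
        if total[k] == 0:
            continue  # assumption: a class with no incoming edges contributes 0
        score += max(0.0, same[k] / total[k] - size[k] / num_nodes)
    return score / (num_classes - 1)


# Edges 0->1, 1->2, 2->0, 3->4 with node 4 as the only member of class 1.
value = linkx_homophily_reference([0, 1, 2, 3], [1, 2, 0, 4], [0, 0, 0, 0, 1])
print(value)
```

Working from an edge list keeps the sketch free of dependencies; with a DGLGraph, the same source and destination IDs could be read from `graph.edges()`.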

Parameters:

graph (DGLGraph) – The graph.
y (torch.Tensor) – The node labels, a tensor of shape (|V|).

Returns:

The homophily value.

Return type:

float

Examples

>>> import dgl
>>> import torch

>>> graph = dgl.graph(([0, 1, 2, 3], [1, 2, 0, 4]))
>>> y = torch.tensor([0, 0, 0, 0, 1])
>>> dgl.linkx_homophily(graph, y)
0.19999998807907104
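
The returned value can be checked by hand against the definition. Here \(C = 2\) and \(|\mathcal{V}| = 5\). Class 0 contains nodes 0 to 3, which have three incoming edges in total (0→1, 1→2, 2→0), all from class 0. Class 1 contains only node 4, whose single incoming edge (3→4) comes from class 0. Substituting these counts:

\[\frac{1}{2-1} \left[ \max\left(0, \frac{3}{3} - \frac{4}{5}\right) + \max\left(0, \frac{0}{1} - \frac{1}{5}\right) \right] = 0.2 + 0 = 0.2\]

The printed 0.19999998807907104 differs from 0.2 only by rounding; it is the value one would expect if the subtraction \(1 - 0.8\) were carried out in single precision, though that reading is an inference from the output rather than something stated here.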