class dgl.data.WisconsinDataset(raw_dir=None, force_reload=False, verbose=True, transform=None)

Bases: dgl.data.geom_gcn.GeomGCNDataset

Wisconsin subset of WebKB, later modified by "Geom-GCN: Geometric Graph Convolutional Networks".

Nodes represent web pages, and edges represent hyperlinks between them. Node features are bag-of-words representations of the web pages. The web pages are manually classified into five categories: student, project, course, staff, and faculty.


Statistics:

  • Nodes: 251

  • Edges: 515

  • Number of Classes: 5

  • 10 train/val/test splits

    • Train: 120

    • Val: 80

    • Test: 51
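As a quick check on the figures above, the three split sizes add up to the node count, which is consistent with each split partitioning the full node set. A minimal sketch in plain Python, using only the numbers listed here (the variable names are illustrative):

```python
# Split sizes as listed in the statistics above.
num_nodes = 251
split_sizes = {"train": 120, "val": 80, "test": 51}

# The three sizes sum to the node count.
total = sum(split_sizes.values())
print(total == num_nodes)

# Fraction of nodes in each part, rounded to three decimals.
fractions = {name: round(size / num_nodes, 3) for name, size in split_sizes.items()}
print(fractions)
```

The resulting proportions are roughly 48% train, 32% validation, and 20% test.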

Parameters:

  • raw_dir (str, optional) – Raw file directory to store the processed data. Default: ~/.dgl/

  • force_reload (bool, optional) – Whether to re-download the data source. Default: False

  • verbose (bool, optional) – Whether to print progress information. Default: True

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access. Default: None


num_classes (int) – Number of node classes




The graph does not come with edges for both directions: each edge is stored in one direction only, and reverse edges are not added automatically.
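If a model expects an undirected graph, the missing reverse edges have to be added by the user. DGL offers `dgl.add_reverse_edges` and `dgl.to_bidirected` for this purpose. The sketch below only illustrates the idea on a plain edge list, so that it runs without DGL or the downloaded data; the helper name `symmetrize_edges` and the toy edges are hypothetical.

```python
def symmetrize_edges(edges):
    """Return the edge list with every missing reverse edge added.

    `edges` is an iterable of (src, dst) pairs. Duplicates are dropped and
    the result is sorted so the output is deterministic.
    """
    edge_set = set(edges)
    edge_set |= {(dst, src) for src, dst in edge_set}
    return sorted(edge_set)


# Toy directed graph (hypothetical, not taken from the Wisconsin data).
directed = [(0, 1), (1, 2), (2, 1)]
undirected = symmetrize_edges(directed)
print(undirected)  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

On the actual dataset, the corresponding call would likely be something like `dgl.to_bidirected(g, copy_ndata=True)`, which should carry over the node features, labels, and masks; this is worth checking against the DGL version in use.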


>>> from dgl.data import WisconsinDataset
>>> dataset = WisconsinDataset()
>>> g = dataset[0]
>>> num_classes = dataset.num_classes
>>> # get node features
>>> feat = g.ndata["feat"]
>>> # get data split
>>> train_mask = g.ndata["train_mask"]
>>> val_mask = g.ndata["val_mask"]
>>> test_mask = g.ndata["test_mask"]
>>> # get labels
>>> label = g.ndata["label"]
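Because the dataset ships 10 splits, the mask tensors above presumably hold one column per split rather than a single 1-D mask, in which case one split would be selected with something like `train_mask[:, i]`. This layout is an assumption and is worth confirming by inspecting `train_mask.shape`. The sketch below mimics the column selection on nested Python lists so that it runs without DGL; `select_split` and the toy mask are hypothetical.

```python
def select_split(mask, split_idx):
    """Pick one split column from a 2-D mask stored as rows of booleans.

    Equivalent in spirit to `mask[:, split_idx]` on a 2-D tensor.
    """
    return [row[split_idx] for row in mask]


# Toy mask: 4 nodes, 3 splits (the real dataset has 251 nodes and 10 splits).
toy_train_mask = [
    [True, False, True],
    [False, True, True],
    [True, True, False],
    [False, False, False],
]

split_0 = select_split(toy_train_mask, 0)
train_nodes = [node for node, flag in enumerate(split_0) if flag]
print(split_0)      # [True, False, True, False]
print(train_nodes)  # [0, 2]
```

Reporting results averaged over all 10 splits, rather than a single one, is a common way to use such pre-defined splits.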

__getitem__(idx)

Gets the data object at index.


__len__()

The number of examples in the dataset.