COCOSuperpixelsDataset

class dgl.data.COCOSuperpixelsDataset(raw_dir=None, split='train', construct_format='edge_wt_region_boundary', slic_compactness=30, force_reload=None, verbose=None, transform=None)[source]

Bases: dgl.data.dgl_dataset.DGLDataset

COCO superpixel dataset for the node classification task.

DGL dataset of COCO-SP in the LRGB benchmark, which contains image superpixels and a semantic segmentation label for each superpixel node.

Based on the COCO 2017 dataset. Original source https://cocodataset.org

Reference: https://arxiv.org/abs/2206.08164

Statistics:

  • Train examples: 113,286

  • Valid examples: 5,000

  • Test examples: 5,000

  • Average number of nodes: 476.88

  • Average number of edges: 2,710.48

  • Number of node classes: 81
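
As a quick reading aid for the figures above, the following sketch derives the total dataset size, the share of each split, and the average number of edges per node. It uses only the numbers listed in this section; the variable names are illustrative.

```python
# Figures copied from the statistics list above.
splits = {"train": 113_286, "val": 5_000, "test": 5_000}
avg_nodes = 476.88
avg_edges = 2_710.48

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}
edges_per_node = avg_edges / avg_nodes

print(f"total examples: {total}")
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
print(f"average edges per node: {edges_per_node:.2f}")
```

The training split makes up roughly 92% of the 123,286 graphs, and an average graph has about 5.7 edges per node.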

Parameters
  • raw_dir (str) – Directory to store all the downloaded raw datasets. Default: “~/.dgl/”.

  • split (str) – Should be chosen from [“train”, “val”, “test”]. Default: “train”.

  • construct_format (str, optional) –

    Option to select the graph construction format. Should be chosen from the following formats:

    • “edge_wt_only_coord”: the graphs are 8-nn graphs with the edge weights computed based only on the spatial coordinates of superpixel nodes.

    • “edge_wt_coord_feat”: the graphs are 8-nn graphs with the edge weights computed based on a combination of the spatial coordinates and feature values of superpixel nodes.

    • “edge_wt_region_boundary”: the graphs are region boundary graphs, where two regions (i.e. superpixel nodes) have an edge between them if they share a boundary in the original image.

    Default: “edge_wt_region_boundary”.

  • slic_compactness (int, optional) – Compactness setting of the SLIC algorithm that was used to generate the superpixels. Should be chosen from [10, 30]. Default: 30.

  • force_reload (bool) – Whether to reload the dataset. Default: False.

  • verbose (bool) – Whether to print out progress information. Default: False.

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
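
To make the “edge_wt_region_boundary” option more concrete, here is a minimal sketch of a region boundary graph built from a small superpixel label grid. It is not the preprocessing code behind the dataset: the function name region_boundary_edges is hypothetical, and treating two regions as sharing a boundary when any of their pixels are 4-adjacent is an assumption made for illustration.

```python
def region_boundary_edges(labels):
    """Return sorted (u, v) pairs, u < v, for regions that touch in a label grid.

    `labels` is a list of equal-length rows; labels[r][c] is the superpixel id
    of pixel (r, c). Assumption: two regions touch if some pixel pair is
    4-adjacent (horizontally or vertically neighbouring).
    """
    edges = set()
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            here = labels[r][c]
            # Looking only right and down visits each adjacent pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labels[rr][cc] != here:
                    other = labels[rr][cc]
                    edges.add((min(here, other), max(here, other)))
    return sorted(edges)


# A 3x4 toy "image" segmented into four superpixels.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
]
edges = region_boundary_edges(toy)
print(edges)  # [(0, 1), (0, 2), (1, 3), (2, 3)]
```

Regions 0 and 3 (and likewise 1 and 2) meet only at a corner, so under the 4-adjacency assumption they receive no edge.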

Examples

>>> from dgl.data import COCOSuperpixelsDataset
>>> train_dataset = COCOSuperpixelsDataset(split="train")
>>> len(train_dataset)
113286
>>> train_dataset.num_classes
81
>>> graph = train_dataset[0]
>>> graph
Graph(num_nodes=488, num_edges=2766,
    ndata_schemes={'feat': Scheme(shape=(14,), dtype=torch.float32),
                    'label': Scheme(shape=(), dtype=torch.uint8)}
    edata_schemes={'feat': Scheme(shape=(2,), dtype=torch.float32)})
>>> # a 1-D tensor can be used as the index when transform is None
>>> # see details in __getitem__ function
>>> import torch
>>> idx = torch.tensor([0, 1, 2])
>>> train_dataset_subset = train_dataset[idx]
>>> train_dataset_subset[0]
Graph(num_nodes=488, num_edges=2766,
    ndata_schemes={'feat': Scheme(shape=(14,), dtype=torch.float32),
                    'label': Scheme(shape=(), dtype=torch.uint8)}
    edata_schemes={'feat': Scheme(shape=(2,), dtype=torch.float32)})
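
Since each node carries one class label, a per-class node count is often a useful first check before training. The sketch below is a plain-Python helper that works on any iterable of per-graph label sequences; the name label_histogram is hypothetical, and it assumes labels are integer class ids in the range [0, num_classes).

```python
from collections import Counter


def label_histogram(label_lists, num_classes):
    """Count how many nodes carry each class id across many graphs.

    `label_lists` is any iterable of per-graph label sequences (plain ints).
    Returns a list of length `num_classes`; entry k is the node count of class k.
    """
    counts = Counter()
    for labels in label_lists:
        counts.update(labels)
    # Assumption: valid labels lie in [0, num_classes); anything else is an error.
    unexpected = [k for k in counts if not 0 <= k < num_classes]
    if unexpected:
        raise ValueError(f"labels outside [0, {num_classes}): {sorted(unexpected)}")
    return [counts.get(k, 0) for k in range(num_classes)]


# Toy input: two "graphs" with three and two nodes, and three classes.
hist = label_histogram([[0, 2, 2], [2, 0]], num_classes=3)
print(hist)  # [2, 0, 3]
```

With the dataset from the example above, the input could be (g.ndata['label'].tolist() for g in train_dataset) with num_classes=train_dataset.num_classes. A full pass over all 113,286 training graphs may take a while.
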
__getitem__(idx)[source]

Get the idx-th sample.

Parameters

idx (int or tensor) – The sample index. A 1-D tensor is allowed as idx when transform is None.

Returns

  • dgl.DGLGraph – graph structure, node features, node labels and edge features.

    • ndata['feat']: node features

    • ndata['label']: node labels

    • edata['feat']: edge features

  • or, when idx is a 1-D tensor:

  • dgl.data.utils.Subset – Subset of the dataset at the specified indices.
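
The two return types can be pictured with a toy stand-in that mirrors the contract described here. This is not DGL's implementation: ToyDataset and ToySubset are hypothetical classes, a Python list of indices stands in for a 1-D tensor, and raising an error when a transform is set is this sketch's own choice.

```python
from collections.abc import Sequence


class ToySubset:
    """Stand-in for a subset view: maps local positions to dataset indices."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]

    def __len__(self):
        return len(self.indices)


class ToyDataset:
    """Toy dataset whose __getitem__ follows the documented contract."""

    def __init__(self, samples, transform=None):
        self.samples = list(samples)
        self.transform = transform

    def __getitem__(self, idx):
        # A sequence of indices (standing in for a 1-D tensor) yields a subset.
        if isinstance(idx, Sequence) and not isinstance(idx, (str, bytes)):
            if self.transform is not None:
                # Sketch-only choice: reject the combination outright.
                raise TypeError("sequence indices require transform=None")
            return ToySubset(self, idx)
        sample = self.samples[idx]
        # The transform, if any, is applied on every access.
        return sample if self.transform is None else self.transform(sample)

    def __len__(self):
        return len(self.samples)


ds = ToyDataset(["g0", "g1", "g2", "g3"])
single = ds[1]       # one sample
subset = ds[[0, 2]]  # a subset view over two samples
print(single, len(subset), subset[1])  # g1 2 g2
```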

__len__()[source]

The number of examples in the dataset.