COCOSuperpixelsDataset

class dgl.data.COCOSuperpixelsDataset(raw_dir=None, split='train', construct_format='edge_wt_region_boundary', slic_compactness=30, force_reload=None, verbose=None, transform=None)

Bases: dgl.data.dgl_dataset.DGLDataset

COCO superpixel dataset for the node classification task.

DGL dataset of COCO-SP in the LRGB benchmark, which contains image superpixels and a semantic segmentation label for each superpixel node.

Based on the COCO 2017 dataset. Original source: https://cocodataset.org

Reference: https://arxiv.org/abs/2206.08164.pdf

Statistics:

Train examples: 113,286
Valid examples: 5,000
Test examples: 5,000
Average number of nodes: 476.88
Average number of edges: 2,710.48
Number of node classes: 81
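The figures above can be combined into a few derived quantities. The snippet below is a small arithmetic sketch using only the numbers listed in the statistics. The interpretation of "edges per node" assumes each stored edge is counted once as a directed edge, which the statistics do not state explicitly.

```python
# Figures copied from the statistics listed above.
train, valid, test = 113_286, 5_000, 5_000
avg_nodes, avg_edges = 476.88, 2_710.48

total = train + valid + test
train_share = train / total

# Edges per node, treating each stored edge as one directed edge
# (an assumption; the statistics do not say how edges are counted).
edges_per_node = avg_edges / avg_nodes

print(f"total examples: {total}")
print(f"train share:    {train_share:.1%}")
print(f"edges per node: {edges_per_node:.2f}")
```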

Parameters

raw_dir (str) – Directory to store all the downloaded raw datasets. Default: “~/.dgl/”.
split (str) – Should be chosen from [“train”, “val”, “test”]. Default: “train”.
construct_format (str, optional) –
Option to select the graph construction format. Should be chosen from the following formats:
- “edge_wt_only_coord”: the graphs are 8-nn graphs with edge weights computed from only the spatial coordinates of the superpixel nodes.
- “edge_wt_coord_feat”: the graphs are 8-nn graphs with edge weights computed from a combination of the spatial coordinates and feature values of the superpixel nodes.
- “edge_wt_region_boundary”: the graphs are region boundary graphs, where two regions (i.e. superpixel nodes) have an edge between them if they share a boundary in the original image.
Default: “edge_wt_region_boundary”.
slic_compactness (int, optional) – Compactness of the SLIC algorithm that was used to generate the superpixels. Should be chosen from [10, 30]. Default: 30.
force_reload (bool) – Whether to reload the dataset. Default: False.
verbose (bool) – Whether to print out progress information. Default: False.
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
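To make the construct_format options more concrete, the following is a minimal pure-Python sketch of the two graph topologies they describe: a region boundary graph and a k-nearest-neighbour graph. It is an illustration only, not the pipeline used to build the dataset. The function names region_boundary_edges and knn_edges, the toy label grid, the centroid coordinates, and the use of 4-connectivity are all assumptions of this sketch, and edge weights are deliberately left out because their formula is not given here.

```python
from itertools import product
from math import dist


def region_boundary_edges(segments):
    """Return undirected edges between regions that touch in a label grid.

    `segments` is a 2-D list where segments[r][c] is the superpixel id of
    pixel (r, c). Two regions are adjacent if a pair of horizontally or
    vertically neighbouring pixels carries their two ids (4-connectivity,
    an assumption of this sketch).
    """
    edges = set()
    rows, cols = len(segments), len(segments[0])
    for r, c in product(range(rows), range(cols)):
        here = segments[r][c]
        # Look only right and down so each pixel pair is visited once.
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < rows and cc < cols and segments[rr][cc] != here:
                other = segments[rr][cc]
                edges.add((min(here, other), max(here, other)))
    return sorted(edges)


def knn_edges(centroids, k=8):
    """Return directed edges (src, dst) from each node to its k nearest nodes."""
    edges = []
    for src, point in enumerate(centroids):
        others = [j for j in range(len(centroids)) if j != src]
        # Sort the other nodes by Euclidean distance to the current node.
        others.sort(key=lambda j: dist(point, centroids[j]))
        edges.extend((src, dst) for dst in others[:k])
    return edges


# A 4x4 toy "image" split into four superpixels.
segments = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
boundary = region_boundary_edges(segments)

# Hypothetical superpixel centroids (x, y) for the same four regions.
centroids = [(0.5, 0.5), (2.5, 0.5), (0.5, 2.5), (2.5, 2.5)]
nearest = knn_edges(centroids, k=2)
```

In the toy grid, diagonally opposite regions only meet at a corner, so under 4-connectivity they are not joined by a boundary edge. The k-nn variant always gives every node exactly k outgoing edges, regardless of whether the regions touch.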

Examples

>>> from dgl.data import COCOSuperpixelsDataset

>>> train_dataset = COCOSuperpixelsDataset(split="train")
>>> len(train_dataset)
113286
>>> train_dataset.num_classes
81
>>> graph = train_dataset[0]
>>> graph
Graph(num_nodes=488, num_edges=2766,
    ndata_schemes={'feat': Scheme(shape=(14,), dtype=torch.float32),
                    'label': Scheme(shape=(), dtype=torch.uint8)}
    edata_schemes={'feat': Scheme(shape=(2,), dtype=torch.float32)})

>>> # a 1-D tensor can be used as the index when transform is None
>>> # see the __getitem__ method for details
>>> import torch
>>> idx = torch.tensor([0, 1, 2])
>>> train_dataset_subset = train_dataset[idx]
>>> train_dataset_subset[0]
Graph(num_nodes=488, num_edges=2766,
    ndata_schemes={'feat': Scheme(shape=(14,), dtype=torch.float32),
                    'label': Scheme(shape=(), dtype=torch.uint8)}
    edata_schemes={'feat': Scheme(shape=(2,), dtype=torch.float32)})

__getitem__(idx)

Get the idx-th sample.

Parameters

idx (int or tensor) – The sample index. A 1-D tensor is allowed as idx when transform is None.

Returns

dgl.DGLGraph – graph structure, node features, node labels and edge features.
- ndata['feat']: node features
- ndata['label']: node labels
- edata['feat']: edge features
or
dgl.data.utils.Subset – Subset of the dataset at specified indices
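The indexing rules described above can be sketched with a toy dataset in plain Python. This is not DGL code: ToyDataset and ToySubset are hypothetical stand-ins, a Python list plays the role of the 1-D tensor index, strings play the role of graphs, and raising ValueError when a multi-index is combined with a transform is an assumption of this sketch rather than documented behaviour.

```python
class ToySubset:
    """Minimal stand-in for a dataset subset: a view over chosen indices."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]

    def __len__(self):
        return len(self.indices)


class ToyDataset:
    """Toy dataset mirroring the documented indexing rules (not DGL code)."""

    def __init__(self, samples, transform=None):
        self.samples = list(samples)
        self.transform = transform

    def __getitem__(self, idx):
        if isinstance(idx, (list, tuple)):  # stands in for a 1-D tensor
            if self.transform is not None:
                # Assumption: reject multi-index access when a transform is set.
                raise ValueError("multi-index access requires transform=None")
            return ToySubset(self, idx)
        sample = self.samples[idx]
        # The stored sample is left untouched; the transform runs per access.
        return sample if self.transform is None else self.transform(sample)

    def __len__(self):
        return len(self.samples)


plain = ToyDataset(["g0", "g1", "g2", "g3"])
subset = plain[[0, 2]]        # subset view over samples 0 and 2
first_of_subset = subset[0]   # resolves to plain[0]

upper = ToyDataset(["g0", "g1"], transform=str.upper)
transformed = upper[1]        # transform applied on access
```

Because the transform is applied inside __getitem__, the stored samples stay unchanged and every access pays the cost of the transform again.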

__len__()

The number of examples in the dataset.