utils.get_download_dir() Get the absolute path to the download directory.
[, path, overwrite, …]) Download a given URL.
utils.check_sha1(filename, sha1_hash) Check whether the sha1 hash of the file content matches the expected hash.
utils.extract_archive(file, target_dir) Extract archive file.
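As a rough illustration of what a SHA-1 check like the one listed above involves, the sketch below hashes a file in chunks and compares the hex digest with the expected value. The helper name `sha1_matches` and the `chunk_size` parameter are hypothetical and chosen for this example; this is not the library's implementation.

```python
import hashlib


def sha1_matches(filename, sha1_hash, chunk_size=1 << 20):
    """Return True if the SHA-1 of the file's bytes equals sha1_hash (hex string).

    Hypothetical helper for illustration only; not the library's own code.
    """
    digest = hashlib.sha1()
    with open(filename, "rb") as f:
        # Read in fixed-size chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    # hexdigest() is lowercase, so normalize the expected hash before comparing.
    return digest.hexdigest() == sha1_hash.lower()
```

A check of this kind is typically run right after a download, so that a truncated or corrupted file is fetched again instead of being extracted.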

Dataset Classes

Stanford sentiment treebank dataset

For more information about the dataset, see Sentiment Analysis.

class …(mode='train', vocab_file=None)

Stanford Sentiment Treebank dataset.

Each sample is the constituency tree of a sentence. The leaf nodes represent words; each word is an int value stored in the x feature field. Each non-leaf node has the special value PAD_WORD in the x field. Every node also has a sentiment annotation from 5 classes (very negative, negative, neutral, positive, and very positive). The sentiment label is an int value stored in the y feature field.
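To make the node layout concrete, here is a minimal plain-Python sketch of one such tree. It does not use the library's graph type: the dictionary layout, the value `-1` for `PAD_WORD`, the word ids, and the mapping of labels to the integers 0 to 4 are all assumptions made for illustration.

```python
PAD_WORD = -1  # assumed placeholder value; the real constant may differ

# Toy tree for a two-word sentence: node 0 is the root, nodes 1 and 2 are leaves.
# "x" holds word ids (PAD_WORD on non-leaf nodes); "y" holds a sentiment label
# per node, assumed here to run from 0 (very negative) to 4 (very positive).
tree = {
    "x": [PAD_WORD, 7, 12],     # word ids 7 and 12 are arbitrary
    "y": [3, 2, 1],
    "edges": [(1, 0), (2, 0)],  # (child, parent) pairs
}


def leaf_word_ids(tree):
    """Return the word ids of the leaf nodes, in node order."""
    return [x for x in tree["x"] if x != PAD_WORD]


def root_label(tree):
    """Return the sentiment label of the node that is nobody's child."""
    children = {child for child, _ in tree["edges"]}
    roots = [n for n in range(len(tree["x"])) if n not in children]
    return tree["y"][roots[0]]
```

Filtering on `PAD_WORD` separates words from internal nodes, while the `y` field is defined on every node, so both phrase-level and sentence-level labels are available.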


This dataset class is compatible with PyTorch’s Dataset class.


All samples are loaded and preprocessed in memory first.

Parameters:
  • mode (str, optional) – One of 'train', 'val', or 'test'; specifies which data file to use.
  • vocab_file (str, optional) – Optional vocabulary file.

Get the tree with index idx.

Parameters: idx (int) – Tree index.
Return type: dgl.DGLGraph

Get the number of trees in the dataset.

Returns: Number of trees.
Return type: int
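Indexed access and a length are the two operations a map-style PyTorch dataset is expected to provide, via `__getitem__` and `__len__`. The sketch below assumes the two accessors documented above correspond to those methods (their names are not shown here) and uses a hypothetical stand-in class, `ToyTreeDataset`, holding placeholder strings in place of real trees.

```python
class ToyTreeDataset:
    """Hypothetical stand-in exposing index access and a length."""

    def __init__(self, trees):
        # Keep everything in memory, mirroring the eager loading described above.
        self._trees = list(trees)

    def __getitem__(self, idx):
        # Get the tree with index idx.
        return self._trees[idx]

    def __len__(self):
        # Get the number of trees in the dataset.
        return len(self._trees)


dataset = ToyTreeDataset(["tree-0", "tree-1", "tree-2"])
first = dataset[0]
count = len(dataset)
all_trees = [dataset[i] for i in range(len(dataset))]
```

Any object with these two methods can be iterated by index in this way, which is what lets generic batching code work without knowing how the samples are stored.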