DGLDataset
class dgl.data.DGLDataset(name, url=None, raw_dir=None, save_dir=None, hash_key=(), force_reload=False, verbose=False, transform=None)
Bases: object
The basic DGL dataset for creating graph datasets. This class defines a basic template for DGL datasets. The following steps are executed automatically:
1. Check whether there is a dataset cache on disk (already processed and stored on disk) by invoking has_cache(). If true, go to step 5.
2. Call download() to download the data if url is not None.
3. Call process() to process the data.
4. Call save() to save the processed dataset on disk and go to step 6.
5. Call load() to load the processed dataset from disk.
6. Done.
Users can override these functions with their own data processing logic.
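The sequence of steps above can be sketched in plain Python. This is a minimal illustration, not DGL's source: `SketchDataset`, the `calls` list, the `data.pkl` file name, and the use of pickle are all hypothetical stand-ins, and the sketch assumes that `force_reload=True` simply skips the cache check.

```python
import os
import pickle
import tempfile


class SketchDataset:
    """Hypothetical stand-in that mimics the documented load sequence (not DGL code)."""

    def __init__(self, name, url=None, raw_dir=None, save_dir=None, force_reload=False):
        self.name = name
        self.url = url
        # Documented defaults: raw_dir is ~/.dgl/, save_dir is the same as raw_dir.
        self.raw_dir = raw_dir or os.path.join(os.path.expanduser("~"), ".dgl")
        self.save_dir = save_dir or self.raw_dir
        self.force_reload = force_reload
        self.calls = []   # records which hooks ran, for illustration only
        self.data = None
        self._load()

    @property
    def save_path(self):
        return os.path.join(self.save_dir, self.name)

    def _cache_file(self):
        # Assumed cache layout: a single pickle file inside save_path.
        return os.path.join(self.save_path, "data.pkl")

    def has_cache(self):
        return os.path.exists(self._cache_file())

    def download(self):
        self.calls.append("download")  # a real dataset would fetch self.url here

    def process(self):
        self.calls.append("process")
        self.data = [i * i for i in range(5)]  # placeholder for real processing

    def save(self):
        self.calls.append("save")
        os.makedirs(self.save_path, exist_ok=True)
        with open(self._cache_file(), "wb") as f:
            pickle.dump(self.data, f)

    def load(self):
        self.calls.append("load")
        with open(self._cache_file(), "rb") as f:
            self.data = pickle.load(f)

    def _load(self):
        # Step 1: use the cache if it exists (assumption: force_reload bypasses it).
        if not self.force_reload and self.has_cache():
            self.load()          # step 5
            return
        if self.url is not None:
            self.download()      # step 2
        self.process()           # step 3
        self.save()              # step 4


with tempfile.TemporaryDirectory() as tmp:
    first = SketchDataset("toy", raw_dir=tmp)    # no cache yet: process, then save
    second = SketchDataset("toy", raw_dir=tmp)   # cache found: load only
    print(first.calls)
    print(second.calls)
```

The first construction finds no cache and runs `process()` and `save()`; the second finds the cached file and only runs `load()`. Because `url` is None in both cases, `download()` is never called.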
- Parameters
name (str) – Name of the dataset
url (str) – URL to download the raw dataset from. Default: None
raw_dir (str) – Directory that will store the downloaded data, or the directory that already stores the input data. Default: ~/.dgl/
save_dir (str) – Directory to save the processed dataset. Default: same as raw_dir
hash_key (tuple) – A tuple of values as the input for the hash function. Users can distinguish instances (and their caches on the disk) from the same dataset class by comparing the hash values. Default: (), the corresponding hash value is 'f9065fa7'.
force_reload (bool) – Whether to reload the dataset. Default: False
verbose (bool) – Whether to print out progress information
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
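To illustrate what "transformed before every access" could look like, here is a small sketch in plain Python. It makes several assumptions: `TransformOnAccess` and `add_flag` are hypothetical names, plain dicts stand in for DGLGraph objects, and the transform is applied inside `__getitem__`. It is not DGL's implementation.

```python
class TransformOnAccess:
    """Hypothetical container that applies a transform on each item access."""

    def __init__(self, items, transform=None):
        self._items = list(items)
        self._transform = transform

    def __getitem__(self, idx):
        item = self._items[idx]
        if self._transform is None:
            return item
        # The stored item is left untouched; the caller gets the transformed copy.
        return self._transform(item)

    def __len__(self):
        return len(self._items)


def add_flag(graph):
    """Example transform: return a modified copy rather than mutating the input."""
    out = dict(graph)
    out["flag"] = True
    return out


ds = TransformOnAccess([{"num_nodes": 3}], transform=add_flag)
sample = ds[0]   # transform runs here, at access time
```

Applying the transform at access time rather than at processing time means the cached dataset stays unmodified, so the same cache can be reused with different transforms.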
raw_path
Path to the downloaded raw dataset folder. An alias for os.path.join(self.raw_dir, self.name).
- Type
save_path
Path to the processed dataset folder. An alias for os.path.join(self.save_dir, self.name).
- Type
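As a quick illustration of how the two path attributes are composed, the following sketch reproduces the documented `os.path.join` expressions with the documented defaults. The dataset name `my_dataset` is a hypothetical example, and the variables here are ordinary locals rather than attributes of a real DGLDataset instance.

```python
import os

# Documented defaults: raw_dir is ~/.dgl/ and save_dir is the same as raw_dir.
raw_dir = os.path.join(os.path.expanduser("~"), ".dgl")
save_dir = raw_dir
name = "my_dataset"  # hypothetical dataset name

raw_path = os.path.join(raw_dir, name)     # mirrors os.path.join(self.raw_dir, self.name)
save_path = os.path.join(save_dir, name)   # mirrors os.path.join(self.save_dir, self.name)

print(raw_path)
print(save_path)
```

With the defaults, both paths point to the same folder; they differ only when save_dir is set to something other than raw_dir.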