MiniBatch
- class dgl.graphbolt.MiniBatch(labels: Tensor | Dict[str, Tensor] | None = None, seeds: Tensor | Dict[str, Tensor] | None = None, indexes: Tensor | Dict[str, Tensor] | None = None, sampled_subgraphs: List[SampledSubgraph] | None = None, input_nodes: Tensor | Dict[str, Tensor] | None = None, node_features: Dict[str, Tensor] | Dict[Tuple[str, str], Tensor] | None = None, edge_features: List[Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]] | None = None, compacted_seeds: Tensor | Dict[str, Tensor] | None = None)
Bases: object
A composite data class that holds the data structures used in GraphBolt.
It is designed to facilitate the exchange of data among different components involved in processing data. The purpose of this class is to unify the representation of input and output data across different stages, ensuring consistency and ease of use throughout the loading process.
- node_ids() -> Tensor | Dict[str, Tensor]
A representation of input nodes in the outermost layer. Contains all nodes in the sampled_subgraphs.
  - If input_nodes is a tensor: the graph is homogeneous.
  - If input_nodes is a dictionary: the keys should be node types and the values should be the corresponding heterogeneous node IDs.
- set_edge_features(edge_features: List[Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]]) -> None
Set edge features.
- set_node_features(node_features: Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]) -> None
Set node features.
- to_pyg_data()
Construct a PyG Data object from the MiniBatch. This function supports only node classification tasks on a homogeneous graph, and the number of features cannot be more than one.
- property blocks
Extracts DGL blocks from the MiniBatch to construct the graph structure and ID mappings.
- compacted_seeds: Tensor | Dict[str, Tensor] = None
Representation of compacted seeds corresponding to ‘seeds’, where all node ids inside are compacted.
- edge_features: List[Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]] = None
Edge features associated with the ‘sampled_subgraphs’.
  - If keys are single strings: the graph is homogeneous, and the keys are feature names.
  - If keys are tuples: the graph is heterogeneous, and the keys are tuples of ‘(edge_type, feature_name)’. Note that the edge type is a single string of the format ‘str:str:str’.
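The key conventions above can be sketched in plain Python. This is only an illustration, not part of GraphBolt: `split_edge_type` and `describe_edge_feature_key` are hypothetical helpers, and reading the three parts of ‘str:str:str’ as source type, relation and destination type is an assumption.

```python
from typing import Dict, Tuple, Union


def split_edge_type(edge_type: str) -> Tuple[str, str, str]:
    """Split an edge type string of the form 'str:str:str' into three parts.

    Hypothetical helper; the (source, relation, destination) reading of the
    three parts is an assumption, not something this page states.
    """
    parts = edge_type.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'str:str:str', got {edge_type!r}")
    return parts[0], parts[1], parts[2]


def describe_edge_feature_key(key: Union[str, Tuple[str, str]]) -> Dict[str, str]:
    """Describe one key of an edge feature dictionary."""
    if isinstance(key, str):
        # A plain string key is a feature name on a homogeneous graph.
        return {"graph": "homogeneous", "feature": key}
    # A tuple key is (edge_type, feature_name) on a heterogeneous graph.
    edge_type, feature_name = key
    src, relation, dst = split_edge_type(edge_type)
    return {
        "graph": "heterogeneous",
        "src": src,
        "relation": relation,
        "dst": dst,
        "feature": feature_name,
    }
```

For example, `describe_edge_feature_key(("user:follows:user", "weight"))` reports a heterogeneous key whose feature name is `weight`, while `describe_edge_feature_key("weight")` reports a homogeneous one.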
- indexes: Tensor | Dict[str, Tensor] = None
Indexes associated with seeds in the graph, indicating the query to which each seed belongs.
  - If indexes is a tensor: the graph is homogeneous. The values should be the queries corresponding to the given ‘seeds’.
  - If indexes is a dictionary: the graph is heterogeneous. The keys should be node or edge types and the values should be the queries corresponding to the given ‘seeds’. For each key, indexes are consecutive integers starting from zero.
- input_nodes: Tensor | Dict[str, Tensor] = None
A representation of input nodes in the outermost layer. Contains all nodes in the ‘sampled_subgraphs’.
  - If input_nodes is a tensor: the graph is homogeneous.
  - If input_nodes is a dictionary: the keys should be node types and the values should be the corresponding heterogeneous node IDs.
- labels: Tensor | Dict[str, Tensor] = None
Labels associated with seeds in the graph.
  - If labels is a tensor: the graph is homogeneous. The values should be the labels corresponding to the given ‘seeds’.
  - If labels is a dictionary: the keys should be node or edge types and the values should be the labels corresponding to the given ‘seeds’.
- node_features: Dict[str, Tensor] | Dict[Tuple[str, str], Tensor] = None
A representation of node features.
  - If keys are single strings: the graph is homogeneous, and the keys are feature names.
  - If keys are tuples: the graph is heterogeneous, and the keys are tuples of ‘(node_type, feature_name)’.
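As a rough sketch of how these keys might be consumed, the snippet below regroups a node feature dictionary by node type. `group_by_node_type` is a hypothetical helper, not a GraphBolt function; plain Python lists stand in for tensors, and filing homogeneous features under the key `None` is an arbitrary choice made for this illustration.

```python
from typing import Any, Dict, Optional, Tuple, Union

FeatureKey = Union[str, Tuple[str, str]]


def group_by_node_type(
    node_features: Dict[FeatureKey, Any]
) -> Dict[Optional[str], Dict[str, Any]]:
    """Regroup features as {node_type: {feature_name: value}}.

    Hypothetical helper. String keys (homogeneous graph) carry no node
    type, so they are filed under None in this sketch.
    """
    grouped: Dict[Optional[str], Dict[str, Any]] = {}
    for key, value in node_features.items():
        if isinstance(key, str):
            node_type, feature_name = None, key
        else:
            # Heterogeneous graph: the key is (node_type, feature_name).
            node_type, feature_name = key
        grouped.setdefault(node_type, {})[feature_name] = value
    return grouped


# Lists stand in for tensors here.
hetero = {("user", "feat"): [0.1, 0.2], ("item", "feat"): [0.3], ("user", "age"): [30, 41]}
by_type = group_by_node_type(hetero)
```

Here `by_type["user"]` holds both the `feat` and `age` entries for user nodes, and `by_type["item"]` holds only `feat`.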
- sampled_subgraphs: List[SampledSubgraph] = None
A list of ‘SampledSubgraph’s, each one corresponding to one layer, representing a subset of a larger graph structure.
- seeds: Tensor | Dict[str, Tensor] = None
Representation of seed items utilized in node classification tasks, link prediction tasks and hyperlink tasks.
  - If seeds is a tensor: the seeds originate from a homogeneous graph. It can be either a 1-dimensional or 2-dimensional tensor:
    - 1-dimensional tensor: Each element directly represents a seed node within the graph.
    - 2-dimensional tensor: Each row designates a seed item, which can encompass various entities such as edges, hyperlinks, or other graph components depending on the specific context.
  - If seeds is a dictionary: the seeds originate from a heterogeneous graph. The keys should be edge or node types, and each value should be a tensor, which can be either a 1-dimensional or 2-dimensional tensor:
    - 1-dimensional tensor: Each element directly represents a seed node of the given type within the graph.
    - 2-dimensional tensor: Each row designates a seed item of the given type, which can encompass various entities such as edges, hyperlinks, or other graph components depending on the specific context.
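The four cases above can be summarized with a small dispatch on container type and dimensionality. The sketch below is an illustration only: `seed_layout` is a hypothetical helper, and nested Python lists stand in for 1-dimensional and 2-dimensional tensors (with real tensors, the nesting depth would be the tensor's number of dimensions).

```python
from typing import Any, Dict, Union


def _nesting_depth(value: Any) -> int:
    """Return how many list levels wrap the innermost element."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        if not value:
            break
        value = value[0]
    return depth


def _describe(value: Any) -> str:
    depth = _nesting_depth(value)
    if depth == 1:
        # 1-dimensional: each element is a seed node.
        return "seed nodes"
    if depth == 2:
        # 2-dimensional: each row is a seed item (edge, hyperlink, ...).
        return "seed items"
    raise ValueError(f"seeds must be 1- or 2-dimensional, got depth {depth}")


def seed_layout(seeds: Union[list, Dict[str, list]]) -> Union[str, Dict[str, str]]:
    """Classify seeds for a homogeneous (list) or heterogeneous (dict) graph."""
    if isinstance(seeds, dict):
        # Heterogeneous graph: one entry per node or edge type.
        return {type_name: _describe(value) for type_name, value in seeds.items()}
    return _describe(seeds)
```

For instance, `seed_layout([0, 5, 7])` classifies the input as seed nodes, while `seed_layout({"user:follows:user": [[0, 1], [2, 3]]})` classifies the rows under that edge type as seed items.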