gigl.common.data.load_torch_tensors
Classes
| SerializedGraphMetadata | Stores information for all entities. If homogeneous, all types are of type SerializedTFRecordInfo. Otherwise, they are dictionaries with the corresponding mapping. |
Functions
| load_torch_tensors_from_tf_record | Loads all torch tensors from a SerializedGraphMetadata object for all entity [node, edge, positive_label, negative_label] and edge / node types. |
Module Contents
- class gigl.common.data.load_torch_tensors.SerializedGraphMetadata
- Stores information for all entities. If the graph is homogeneous, every field is a single SerializedTFRecordInfo. Otherwise, each field is a dictionary that maps node or edge types to the corresponding SerializedTFRecordInfo.
 - edge_entity_info: gigl.common.data.dataloaders.SerializedTFRecordInfo | dict[gigl.src.common.types.graph_data.EdgeType, gigl.common.data.dataloaders.SerializedTFRecordInfo]
 - negative_label_entity_info: gigl.common.data.dataloaders.SerializedTFRecordInfo | dict[gigl.src.common.types.graph_data.EdgeType, gigl.common.data.dataloaders.SerializedTFRecordInfo] | None = None
 - node_entity_info: gigl.common.data.dataloaders.SerializedTFRecordInfo | dict[gigl.src.common.types.graph_data.NodeType, gigl.common.data.dataloaders.SerializedTFRecordInfo]
 - positive_label_entity_info: gigl.common.data.dataloaders.SerializedTFRecordInfo | dict[gigl.src.common.types.graph_data.EdgeType, gigl.common.data.dataloaders.SerializedTFRecordInfo] | None = None
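The homogeneous and heterogeneous shapes of these fields can be illustrated with a small sketch. It does not import gigl: `TFRecordInfoStub`, `GraphMetadataStub`, `is_heterogeneous`, the key types, and the paths are all hypothetical stand-ins chosen for illustration, and the real SerializedTFRecordInfo, NodeType, and EdgeType classes may look different.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class TFRecordInfoStub:
    """Hypothetical stand-in for SerializedTFRecordInfo (real fields not shown)."""

    uri_prefix: str


# Hypothetical key types: a string per node type and a
# (source, relation, destination) tuple per edge type.
NodeKey = str
EdgeKey = Tuple[str, str, str]

NodeInfo = Union[TFRecordInfoStub, Dict[NodeKey, TFRecordInfoStub]]
EdgeInfo = Union[TFRecordInfoStub, Dict[EdgeKey, TFRecordInfoStub]]


@dataclass(frozen=True)
class GraphMetadataStub:
    """Mirrors the four documented fields, using the stub types above."""

    node_entity_info: NodeInfo
    edge_entity_info: EdgeInfo
    positive_label_entity_info: Optional[EdgeInfo] = None
    negative_label_entity_info: Optional[EdgeInfo] = None


def is_heterogeneous(metadata: GraphMetadataStub) -> bool:
    """A dictionary-valued field signals the heterogeneous layout."""
    return isinstance(metadata.node_entity_info, dict)


# Homogeneous: each field holds a single info object.
homogeneous = GraphMetadataStub(
    node_entity_info=TFRecordInfoStub("path/to/nodes/"),
    edge_entity_info=TFRecordInfoStub("path/to/edges/"),
)

# Heterogeneous: each field maps a node or edge type to its info object.
heterogeneous = GraphMetadataStub(
    node_entity_info={
        "user": TFRecordInfoStub("path/to/nodes/user/"),
        "item": TFRecordInfoStub("path/to/nodes/item/"),
    },
    edge_entity_info={
        ("user", "buys", "item"): TFRecordInfoStub("path/to/edges/buys/"),
    },
)
```

The label fields are left at their default of None here, matching the optional defaults in the attribute list.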
 
- gigl.common.data.load_torch_tensors.load_torch_tensors_from_tf_record(tf_record_dataloader, serialized_graph_metadata, should_load_tensors_in_parallel, rank=0, node_tf_dataset_options=TFDatasetOptions(), edge_tf_dataset_options=TFDatasetOptions())
- Loads all torch tensors from a SerializedGraphMetadata object for every entity type [node, edge, positive_label, negative_label] and for every node and edge type.
- Running the loading processes in parallel slows each individual process, but may still yield a net speedup across all entity types. Because of this tradeoff between parallel and sequential tensor loading, loading is not parallelized across node and edge types. The should_load_tensors_in_parallel flag lets callers choose a loading strategy suited to their input data.
- Parameters:
- tf_record_dataloader (TFRecordDataLoader) – TFRecordDataLoader used to load tensors from serialized TFRecords 
- serialized_graph_metadata (SerializedGraphMetadata) – Serialized graph metadata containing the serialized information needed to load TFRecords across node and edge types 
- should_load_tensors_in_parallel (bool) – Whether tensors should be loaded from serialized information in parallel or in sequence across the [node, edge, pos_label, neg_label] entity types. 
- rank (int) – Rank on current machine 
- node_tf_dataset_options (TFDatasetOptions) – The options to use for nodes when building the dataset. 
- edge_tf_dataset_options (TFDatasetOptions) – The options to use for edges when building the dataset. 
 
- Returns:
- loaded_graph_tensors: the unpartitioned graph tensors 
- Return type:
- LoadedGraphTensors 
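The parallel-versus-sequential choice described for should_load_tensors_in_parallel can be sketched in isolation. This is a minimal illustration and not the library's implementation: `load_entities` and the loader callables are hypothetical, and threads are used as a simple stand-in for whatever concurrency mechanism the library uses internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def load_entities(
    loaders: Dict[str, Callable[[], List[int]]],
    in_parallel: bool,
) -> Dict[str, List[int]]:
    """Run one loader per entity type, either one after another or concurrently.

    `loaders` maps an entity name (for example "node" or "edge") to a
    zero-argument callable that returns that entity's data.
    """
    if not in_parallel:
        # Sequential: each loader has the machine to itself.
        return {name: load() for name, load in loaders.items()}

    # Parallel: loaders share resources, so each one may run slower,
    # but the total wall-clock time can still drop.
    with ThreadPoolExecutor(max_workers=max(1, len(loaders))) as pool:
        futures = {name: pool.submit(load) for name, load in loaders.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical loaders standing in for reading TFRecords per entity type.
example_loaders = {
    "node": lambda: [0, 1, 2],
    "edge": lambda: [10, 11],
    "positive_label": lambda: [20],
    "negative_label": lambda: [30],
}

sequential_result = load_entities(example_loaders, in_parallel=False)
parallel_result = load_entities(example_loaders, in_parallel=True)
```

Both modes return the same mapping; only the scheduling differs, which is why the better setting depends on the size and shape of the input data.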
 
