gigl.experimental.knowledge_graph_embedding.lib.data.edge_batch
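Classes
| EdgeBatch | A class for representing a batch of edges in a heterogeneous graph. |
Functions
| build_tensors_from_edge_dicts | Converts a list of HeterogeneousGraphEdgeDict into tensors. |
| collate_edge_batch_from_heterogeneous_graph_edge_dict | This is a collate function for the EdgeBatch. |
| relationwise_batch_random_negative_sampling | Performs random negative sampling for each edge type. |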
Module Contents
- class gigl.experimental.knowledge_graph_embedding.lib.data.edge_batch.EdgeBatch
- Bases: gigl.experimental.knowledge_graph_embedding.common.torchrec.batch.DataclassBatch
- A class for representing a batch of edges in a heterogeneous graph. An EdgeBatch can be derived from input edge tensors. It contains the logic to build a torchrec KeyedJaggedTensor (used for sharded embedding lookups) and the other metadata tensors required to train KGE models.
- static build_data_loader(dataset, sampling_config, dataloader_config, graph_metadata, condensed_node_type_to_vocab_size_map, pin_memory, should_loop=True)
- Parameters:
- dataset (torch.utils.data.IterableDataset) 
- sampling_config (gigl.experimental.knowledge_graph_embedding.lib.config.sampling.SamplingConfig) 
- dataloader_config (gigl.experimental.knowledge_graph_embedding.lib.config.dataloader.DataloaderConfig) 
- graph_metadata (gigl.src.common.types.pb_wrappers.graph_metadata.GraphMetadataPbWrapper) 
- condensed_node_type_to_vocab_size_map (dict[gigl.src.common.types.graph_data.CondensedNodeType, int]) 
- pin_memory (bool) 
- should_loop (bool) 
 
 
 - static from_edge_tensors(edges, condensed_edge_types, edge_labels, condensed_node_type_to_node_type_map, condensed_edge_type_to_condensed_node_type_map)
- Creates an EdgeBatch from edge tensors. The resulting EdgeBatch has length 2 * len(edges), because each edge contributes a src-dst pair of entries.
- Parameters:
- edges (torch.Tensor) – A tensor of edges. 
- condensed_edge_types (torch.Tensor) – A tensor of condensed edge types. 
- edge_labels (torch.Tensor) – A tensor of edge labels. 
- condensed_node_type_to_node_type_map (dict[CondensedNodeType, NodeType]) – A mapping from condensed node types to node types. 
- condensed_edge_type_to_condensed_node_type_map (dict[CondensedEdgeType, tuple[CondensedNodeType, CondensedNodeType]]) – A mapping from condensed edge types to condensed node types. 
 
- Return type:
- EdgeBatch
 - to_edge_tensors(condensed_edge_type_to_condensed_node_type_map)
- Reconstructs the edge tensors from the EdgeBatch. This is used for debugging and sanity checking the EdgeBatch.
- Parameters:
- condensed_edge_type_to_condensed_node_type_map (dict[gigl.src.common.types.graph_data.CondensedEdgeType, tuple[gigl.src.common.types.graph_data.CondensedNodeType, gigl.src.common.types.graph_data.CondensedNodeType]]) 
- Return type:
- tuple[torch.Tensor, torch.Tensor, torch.Tensor] 
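The src-dst expansion performed by from_edge_tensors, and the reconstruction performed by to_edge_tensors, can be illustrated with a minimal sketch. Plain Python lists stand in for tensors, and the interleaved [src, dst, src, dst, ...] layout is an assumption made for illustration; the actual EdgeBatch stores node IDs in a torchrec KeyedJaggedTensor, whose internal layout is not described here. The helper names expand_edges_sketch and collapse_edges_sketch are hypothetical.

```python
def expand_edges_sketch(edges, condensed_edge_types, edge_type_to_node_types):
    """Expand N edges into 2 * N node entries (one src and one dst per edge).

    Hypothetical helper; the interleaved layout is an assumption.
    """
    node_ids, node_types = [], []
    for (src, dst), edge_type in zip(edges, condensed_edge_types):
        # The edge type determines the node type of each endpoint.
        src_type, dst_type = edge_type_to_node_types[edge_type]
        node_ids.extend([src, dst])
        node_types.extend([src_type, dst_type])
    return node_ids, node_types


def collapse_edges_sketch(node_ids):
    """Rebuild [src, dst] pairs from the interleaved node entries."""
    if len(node_ids) % 2 != 0:
        raise ValueError("expected an even number of node entries")
    return [list(node_ids[i:i + 2]) for i in range(0, len(node_ids), 2)]


edges = [[0, 5], [3, 1]]
condensed_edge_types = [0, 1]
# Edge type 0 connects node type 0 to node type 1; edge type 1 connects 1 to 1.
edge_type_to_node_types = {0: (0, 1), 1: (1, 1)}

node_ids, node_types = expand_edges_sketch(
    edges, condensed_edge_types, edge_type_to_node_types
)
round_trip = collapse_edges_sketch(node_ids)
```

A round trip like this is the kind of sanity check that to_edge_tensors is described as supporting: the reconstructed edges should match the inputs.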
 
 
- gigl.experimental.knowledge_graph_embedding.lib.data.edge_batch.build_tensors_from_edge_dicts(inputs)
- Converts a list of HeterogeneousGraphEdgeDict into tensors.
- Parameters:
- inputs (list[HeterogeneousGraphEdgeDict]) – A list of edge dictionaries. 
- Returns:
- A tuple containing:
- edges (torch.Tensor): A tensor of shape [num_edges, 2] containing the source and destination node IDs. 
- condensed_edge_types (torch.Tensor): A tensor of shape [num_edges] containing the condensed edge types. 
- labels (torch.Tensor): A tensor of shape [num_edges] containing labels (all set to 1). 
 
 
- Return type:
- tuple[torch.Tensor, torch.Tensor, torch.Tensor] 
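The three outputs described above can be sketched in plain Python, with lists standing in for tensors. The dictionary keys used here ("src", "dst", "condensed_edge_type") are assumptions for illustration only; the actual fields of HeterogeneousGraphEdgeDict are not documented in this section, and build_lists_from_edge_dicts_sketch is a hypothetical name.

```python
def build_lists_from_edge_dicts_sketch(inputs):
    """Convert edge dicts into (edges, condensed_edge_types, labels).

    Hypothetical stand-in: the key names are assumed, not taken from GiGL.
    """
    # Shape [num_edges, 2]: one [src, dst] pair per input edge.
    edges = [[d["src"], d["dst"]] for d in inputs]
    # Shape [num_edges]: the condensed edge type of each edge.
    condensed_edge_types = [d["condensed_edge_type"] for d in inputs]
    # Shape [num_edges]: every input edge is a positive, so every label is 1.
    labels = [1] * len(inputs)
    return edges, condensed_edge_types, labels


inputs = [
    {"src": 0, "dst": 5, "condensed_edge_type": 0},
    {"src": 3, "dst": 1, "condensed_edge_type": 1},
]
edges, condensed_edge_types, labels = build_lists_from_edge_dicts_sketch(inputs)
```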
 
- gigl.experimental.knowledge_graph_embedding.lib.data.edge_batch.collate_edge_batch_from_heterogeneous_graph_edge_dict(inputs, condensed_edge_type_to_condensed_node_type_map, condensed_node_type_to_vocab_size_map, condensed_node_type_to_node_type_map, num_random_negatives_per_edge=0)
- This is a collate function for the EdgeBatch. It takes a list of heterogeneous graph edge dictionaries (read from the upstream dataset), converts them to tensors for “positive” edges, samples “negative” edges if applicable, and constructs an EdgeBatch (containing a TorchRec KeyedJaggedTensor and metadata).
- Parameters:
- inputs (list[HeterogeneousGraphEdgeDict]) – The input data. 
- condensed_edge_type_to_condensed_node_type_map (dict[CondensedEdgeType, tuple[CondensedNodeType, CondensedNodeType]]) – A mapping from condensed edge types to condensed node types. 
- condensed_node_type_to_vocab_size_map (dict[CondensedNodeType, int]) – A mapping from condensed node types to vocab sizes. 
- condensed_node_type_to_node_type_map (dict[CondensedNodeType, NodeType]) – A mapping from condensed node types to node types. 
- num_random_negatives_per_edge (int) – The number of negative samples to generate for each positive edge. 
 
- Returns:
- The collated EdgeBatch. 
- Return type:
- EdgeBatch
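Because this collate function takes several arguments besides inputs, it cannot be handed directly to an interface that calls the collate function with only the list of samples, as torch.utils.data.DataLoader does with collate_fn. One common way to adapt it is functools.partial. The sketch below uses a toy stand-in (toy_collate, hypothetical) instead of the real function; whether build_data_loader binds arguments this way is an assumption, as is the rule that the negative count equals the number of positives times num_random_negatives_per_edge.

```python
from functools import partial


def toy_collate(inputs, num_random_negatives_per_edge=0):
    """Stand-in collate function; NOT the GiGL implementation."""
    num_positives = len(inputs)
    # Assumption: the number of negatives scales with the number of positives.
    num_negatives = num_positives * num_random_negatives_per_edge
    # Positive edges are labeled 1 and negative edges are labeled 0.
    labels = [1] * num_positives + [0] * num_negatives
    return {
        "num_positives": num_positives,
        "num_negatives": num_negatives,
        "labels": labels,
    }


# Bind the configuration once; the result takes only the list of samples.
collate_fn = partial(toy_collate, num_random_negatives_per_edge=2)

batch = collate_fn([{"src": 0, "dst": 5}, {"src": 3, "dst": 1}])
```

The bound collate_fn has the single-argument shape that a data loader expects, while the configuration stays in one place.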
- gigl.experimental.knowledge_graph_embedding.lib.data.edge_batch.relationwise_batch_random_negative_sampling(condensed_edge_type_to_condensed_node_type_map, condensed_node_type_to_vocab_size_map, num_negatives_per_condensed_edge_type=1)
- Performs random negative sampling for each edge type.
- This function generates num_negatives_per_condensed_edge_type negative edges for each edge type, with src and dst selected at random from the vocabularies of the corresponding node types, as defined by the edge type and the provided type-to-vocabulary maps.
- These edges can be consumed in model training as negative samples that are shared across edges.
- Parameters:
- condensed_edge_type_to_condensed_node_type_map (dict[int, tuple[int, int]]) – A mapping from each edge type to a tuple of (source_node_type, destination_node_type) [R]. 
- condensed_node_type_to_vocab_size_map (dict[int, int]) – A mapping from each node type to the size of its vocabulary. 
- num_negatives_per_condensed_edge_type (int) – The number of negative edges to sample per edge type [K]. 
 
- Returns:
- A tuple containing:
- negative_edges (Tensor): A tensor of shape [R * K] containing negative edges. 
- negative_edge_types (Tensor): A tensor of shape [R * K] containing the edge type for each negative edge. 
- negative_labels (Tensor): A tensor of zeros with shape [R * K], suitable for use in contrastive or classification losses. 
- Return type:
- tuple[torch.Tensor, torch.Tensor, torch.Tensor] 
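The sampling procedure described above can be sketched in plain Python, with lists standing in for tensors. This is an illustration under stated assumptions, not the library's implementation: the function name, the seed parameter, the use of Python's random module, and the representation of each negative edge as a [src, dst] pair are all choices made for this sketch.

```python
import random


def sample_relationwise_negatives_sketch(
    edge_type_to_node_types,
    node_type_to_vocab_size,
    num_negatives_per_edge_type=1,
    seed=None,
):
    """Sample K random negatives for each of R edge types (R * K in total)."""
    rng = random.Random(seed)  # seed is a hypothetical convenience parameter
    negative_edges, negative_edge_types = [], []
    for edge_type, (src_type, dst_type) in sorted(edge_type_to_node_types.items()):
        for _ in range(num_negatives_per_edge_type):
            # Draw each endpoint uniformly from its node type's vocabulary.
            src = rng.randrange(node_type_to_vocab_size[src_type])
            dst = rng.randrange(node_type_to_vocab_size[dst_type])
            negative_edges.append([src, dst])
            negative_edge_types.append(edge_type)
    # Negative edges are labeled 0.
    negative_labels = [0] * len(negative_edges)
    return negative_edges, negative_edge_types, negative_labels


# R = 2 edge types, K = 3 negatives per edge type.
edge_type_to_node_types = {0: (0, 1), 1: (1, 1)}
node_type_to_vocab_size = {0: 100, 1: 50}
neg_edges, neg_types, neg_labels = sample_relationwise_negatives_sketch(
    edge_type_to_node_types, node_type_to_vocab_size, 3, seed=0
)
```

Because src and dst are drawn independently of any positive edge, a sampled pair may coincide with a true edge; this sketch does not filter such cases.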
 
