gigl.experimental.knowledge_graph_embedding.lib.config.graph

Classes

EnumeratedGraphData

Configuration for enumerated graph data after preprocessing.

GraphConfig

Main graph configuration containing metadata and data references.

RawGraphData

Configuration for raw graph data references from BigQuery sources.

Module Contents

class gigl.experimental.knowledge_graph_embedding.lib.config.graph.EnumeratedGraphData

Configuration for enumerated graph data after preprocessing.

Enumerated graph data is preprocessed node and edge data in which node and edge identifiers have been mapped to integer IDs, making them suitable for embedding-table lookups during model training.
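To make the idea of enumeration concrete, here is a minimal in-memory sketch. It is not GiGL's enumerator (which operates on BigQuery tables); the helper names `enumerate_nodes` and `enumerate_edges` and the sample IDs are hypothetical and serve only to illustrate mapping raw IDs to integer IDs.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def enumerate_nodes(raw_ids: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Assign consecutive integer IDs to raw node IDs in first-seen order."""
    mapping: Dict[Hashable, int] = {}
    for raw_id in raw_ids:
        if raw_id not in mapping:
            mapping[raw_id] = len(mapping)
    return mapping


def enumerate_edges(
    raw_edges: Iterable[Tuple[Hashable, Hashable]],
    mapping: Dict[Hashable, int],
) -> List[Tuple[int, int]]:
    """Rewrite (src, dst) pairs of raw IDs as pairs of integer IDs."""
    return [(mapping[src], mapping[dst]) for src, dst in raw_edges]


# Hypothetical raw edges keyed by string IDs.
raw_edges = [("user_a", "item_x"), ("user_b", "item_x"), ("user_a", "item_y")]
node_ids = enumerate_nodes(node for edge in raw_edges for node in edge)
int_edges = enumerate_edges(raw_edges, node_ids)
```

The integer IDs can then index directly into the rows of an embedding table, which is what makes the enumerated form convenient for training.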

node_data

List of metadata for enumerated node types, containing mapping information from raw node IDs to integer IDs.

Type:

List[EnumeratorNodeTypeMetadata]

edge_data

List of metadata for enumerated edge types, containing mapping information from edges expressed in raw node IDs to the corresponding edges expressed in integer IDs.

Type:

List[EnumeratorEdgeTypeMetadata]

generate_hydra_config_yaml()

Generate a Hydra-compatible YAML configuration string for enumerated graph data.

Converts the enumerated node and edge data into a YAML format that can be used by Hydra for configuration management. Dynamically inserts ‘_target_’ fields based on object types, handling dataclasses and namedtuples.

Returns:

A YAML-formatted string containing the Hydra configuration for enumerated graph data with proper ‘_target_’ fields for instantiation.

Return type:

str
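The dynamic insertion of `_target_` fields can be sketched as follows. This is an illustrative approximation using only the standard library, not the method's actual implementation; the helper `with_targets` and the `EdgeKey` and `EdgeMeta` types are hypothetical stand-ins for the real metadata classes.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, NamedTuple


def _qualified_name(obj: Any) -> str:
    """Return the dotted import path of obj's class."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def with_targets(obj: Any) -> Any:
    """Recursively convert obj to plain containers, adding '_target_' keys."""
    if is_dataclass(obj) and not isinstance(obj, type):
        out = {"_target_": _qualified_name(obj)}
        for f in fields(obj):
            out[f.name] = with_targets(getattr(obj, f.name))
        return out
    if isinstance(obj, tuple) and hasattr(obj, "_fields"):  # namedtuple
        out = {"_target_": _qualified_name(obj)}
        for name in obj._fields:
            out[name] = with_targets(getattr(obj, name))
        return out
    if isinstance(obj, (list, tuple)):
        return [with_targets(item) for item in obj]
    if isinstance(obj, dict):
        return {key: with_targets(value) for key, value in obj.items()}
    return obj  # primitives pass through unchanged


# Hypothetical stand-ins for the real metadata types.
class EdgeKey(NamedTuple):
    src: str
    relation: str
    dst: str


@dataclass
class EdgeMeta:
    key: EdgeKey
    table: str


example = with_targets(EdgeMeta(EdgeKey("user", "clicks", "item"), "proj.ds.edges"))
```

A nested dictionary of this shape can be serialized to YAML, and Hydra's `instantiate` uses each `_target_` entry to decide which class to construct.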

to_dictconfig()

Convert enumerated graph data to an OmegaConf DictConfig object.

Creates a Hydra-compatible configuration object from the enumerated node and edge data. This is useful for programmatic configuration management without writing to files. Dynamically inserts ‘_target_’ fields based on object types.

Returns:

An OmegaConf DictConfig object containing the enumerated graph data configuration with proper ‘_target_’ fields for Hydra instantiation.

Return type:

DictConfig

edge_data: List[gigl.src.data_preprocessor.lib.enumerate.utils.EnumeratorEdgeTypeMetadata]
node_data: List[gigl.src.data_preprocessor.lib.enumerate.utils.EnumeratorNodeTypeMetadata]
class gigl.experimental.knowledge_graph_embedding.lib.config.graph.GraphConfig

Main graph configuration containing metadata and data references.

This configuration encapsulates all information about the knowledge graph structure, including schema metadata and references to both raw and processed data sources.

metadata

Wrapper around the graph metadata protocol buffer, containing schema information (node types, edge types, feature schemas).

Type:

GraphMetadataPbWrapper

raw_graph_data

Optional reference to raw BigQuery data sources. Used during initial data ingestion and preprocessing. None if not applicable.

Type:

Optional[RawGraphData]

enumerated_graph_data

Optional reference to preprocessed enumerated data. Used during model training once node and edge identifiers have been mapped to integer IDs. None if not applicable.

Type:

Optional[EnumeratedGraphData]

enumerated_graph_data: EnumeratedGraphData | None = None
metadata: gigl.src.common.types.pb_wrappers.graph_metadata.GraphMetadataPbWrapper
raw_graph_data: RawGraphData | None = None
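Because both data references default to None, code that consumes a GraphConfig generally needs to check which stage of data is present. The sketch below mirrors only the field layout documented above, using hypothetical stand-in dataclasses (`RawGraphDataSketch`, `EnumeratedGraphDataSketch`, `GraphConfigSketch`) with simplified field types; it does not import or reproduce the real GiGL classes.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class RawGraphDataSketch:
    # Stand-in: the real fields hold BigQuery data reference objects.
    node_data: List[Any] = field(default_factory=list)
    edge_data: List[Any] = field(default_factory=list)


@dataclass
class EnumeratedGraphDataSketch:
    # Stand-in: the real fields hold enumerator metadata objects.
    node_data: List[Any] = field(default_factory=list)
    edge_data: List[Any] = field(default_factory=list)


@dataclass
class GraphConfigSketch:
    metadata: Any  # stand-in for the graph metadata wrapper
    raw_graph_data: Optional[RawGraphDataSketch] = None
    enumerated_graph_data: Optional[EnumeratedGraphDataSketch] = None


def available_stages(config: GraphConfigSketch) -> List[str]:
    """List which data references are populated on the config."""
    stages: List[str] = []
    if config.raw_graph_data is not None:
        stages.append("raw")
    if config.enumerated_graph_data is not None:
        stages.append("enumerated")
    return stages


ingest_only = GraphConfigSketch(metadata={}, raw_graph_data=RawGraphDataSketch())
both = GraphConfigSketch(
    metadata={},
    raw_graph_data=RawGraphDataSketch(),
    enumerated_graph_data=EnumeratedGraphDataSketch(),
)
```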
class gigl.experimental.knowledge_graph_embedding.lib.config.graph.RawGraphData

Configuration for raw graph data references from BigQuery sources.

Raw graph data is the original, unprocessed node and edge data stored in BigQuery tables, before it is enumerated and preprocessed for model training.

node_data

List of BigQuery data references for node data tables. Each reference specifies the location and schema of node information.

Type:

List[BigqueryNodeDataReference]

edge_data

List of BigQuery data references for edge data tables. Each reference specifies the location and schema of edge (relationship) information.

Type:

List[BigqueryEdgeDataReference]

edge_data: List[gigl.src.data_preprocessor.lib.ingest.bigquery.BigqueryEdgeDataReference]
node_data: List[gigl.src.data_preprocessor.lib.ingest.bigquery.BigqueryNodeDataReference]