gigl.src.inference.v1.lib.utils#
Attributes#
Classes#
| A transform object used to modify one or more PCollections. | 
Functions#
| Gets the beam pipeline for running the inference dataflow job | 
Module Contents#
- class gigl.src.inference.v1.lib.utils.UnenumerateAssets(tagged_output_key)[source]#
- Bases: - apache_beam.PTransform- A transform object used to modify one or more PCollections. - Subclasses must define an expand() method that will be used when the transform is applied to some arguments. Typical usage pattern will be: - input | CustomTransform(…) - The expand() method of the CustomTransform object passed in will be called with input as an argument. - Parameters:
- tagged_output_key (str) 
 - expand(pcolls)[source]#
- Performs unenumeration on two PCollections through a join between the two collections. The first PCollection should contain the DEFAULT_NODE_ID_FIELD and either DEFAULT_PREDICTION_FIELD or DEFAULT_EMBEDDING_FIELD columns. The second PCollection should contain the DEFAULT_ENUMERATED_NODE_ID_FIELD and DEFAULT_ORIGINAL_NODE_ID_FIELD columns. The two pcollections will be joined by the values in the DEFAULT_NODE_ID_FIELD and DEFAULT_ENUMERATED_NODE_ID_FIELD columns. - Parameters:
- pcolls (Tuple[apache_beam.pvalue.PCollection, apache_beam.pvalue.PCollection]) 
- Return type:
- apache_beam.pvalue.PCollection 
 
 
- gigl.src.inference.v1.lib.utils.get_inferencer_pipeline_component_for_single_node_type(gbml_config_pb_wrapper, inference_blueprint, applied_task_identifier, custom_worker_image_uri, node_type, uri_prefix_list, temp_predictions_gcs_path, temp_embeddings_gcs_path)[source]#
- Gets the beam pipeline for running the inference dataflow job :param gbml_config_pb_wrapper: GBML config wrapper for this inference run :type gbml_config_pb_wrapper: GbmlConfigPbWrapper :param inference_blueprint: Blueprint for running and saving inference for GBML pipelines :type inference_blueprint: BaseInferenceBlueprint :param applied_task_identifier: Identifier for the GiGL job :type applied_task_identifier: AppliedTaskIdentifier :param custom_worker_image_uri: Uri to custom worker image :type custom_worker_image_uri: Optional[str] :param node_type: Node type being inferred :type node_type: NodeType :param uri_prefix_list: List of prefixes for running inference for given node type :type uri_prefix_list: list[Uri] :param temp_predictions_gcs_path: Gcs uri for writing temp predictions :type temp_predictions_gcs_path: Optional[GcsUri] :param temp_embeddings_gcs_path: Gcs uri for writing temp embeddings :type temp_embeddings_gcs_path: Optional[GcsUri] - Returns:
- Dataflow pipeline for running inference 
- Return type:
- pipeline (beam.Pipeline) 
- Parameters:
- gbml_config_pb_wrapper (gigl.src.common.types.pb_wrappers.gbml_config.GbmlConfigPbWrapper) 
- inference_blueprint (gigl.src.inference.v1.lib.base_inference_blueprint.BaseInferenceBlueprint) 
- applied_task_identifier (gigl.src.common.types.AppliedTaskIdentifier) 
- custom_worker_image_uri (Optional[str]) 
- node_type (gigl.src.common.types.graph_data.NodeType) 
- uri_prefix_list (list[gigl.common.Uri]) 
- temp_predictions_gcs_path (Optional[gigl.common.GcsUri]) 
- temp_embeddings_gcs_path (Optional[gigl.common.GcsUri]) 
 
 
