gigl.src.common.constants.gcs

Functions

`get_applied_task_perm_gcs_path`(applied_task_identifier)	Returns the GCS path for the perm assets bucket for a given gigl job.
`get_applied_task_temp_gcs_path`(applied_task_identifier)	Returns the GCS path for the temp_assets bucket for a given gigl job.
`get_applied_task_temp_regional_gcs_path`(...)	Returns the GCS path for the temp regional assets for a given gigl job.
`get_config_populator_assets_perm_gcs_path`(...)	Returns the GCS path for the config populator perm assets for a given gigl job (Used to write Frozen GBML Config).
`get_data_preprocessor_assets_perm_gcs_path`(...)	Returns the GCS path for the data preprocessor perm assets for a given gigl job.
`get_data_preprocessor_assets_temp_gcs_path`(...)	Returns the GCS path for temporary data preprocessor assets for a given gigl job.
`get_data_preprocessor_staging_gcs_path`(...)	Returns the GCS path for the staging directory of the data preprocessor assets for a given gigl job.
`get_dataflow_staging_gcs_path`(applied_task_identifier, ...)	Returns the GCS path for the staging directory used for Dataflow Jobs.
`get_dataflow_temp_gcs_path`(applied_task_identifier, ...)	Returns the GCS path for the "tmp" directory used for Dataflow Jobs.
`get_frozen_gbml_config_proto_gcs_path`(...)	Returns the GCS path for the frozen GBML config proto file.
`get_inferencer_asset_dir_gcs_path`(applied_task_identifier)	Returns the GCS path for perm assets written by the Inferencer (e.g. embeddings, predictions, etc.)
`get_inferencer_embeddings_gcs_prefix`(...)	Returns the GCS directory for embeddings output by the Inferencer.
`get_inferencer_predictions_gcs_prefix`(...)	Returns the GCS directory for predictions output by the Inferencer.
`get_post_processor_asset_dir_gcs_path`(...)	Returns the GCS path for perm assets written by the Post Processor (e.g. eval metrics, etc.)
`get_post_processor_metrics_gcs_path`(...)	Returns the GCS path for the eval metrics output by the Post Processor (post_processor_metrics.json)
`get_preprocessed_metadata_proto_gcs_path`(...)	Returns the GCS path for the generated PreprocessedMetadata yaml file.
`get_split_dataset_main_samples_gcs_file_prefix`(...)	Returns the GCS file prefix for the main samples output by Split Generator.
`get_split_dataset_output_gcs_file_prefix`(...)	Returns the GCS file prefix for the samples output by Split Generator.
`get_split_dataset_random_negatives_gcs_file_prefix`(...)	Returns the GCS file prefix for the random negative samples output by Split Generator.
`get_split_generator_assets_temp_gcs_path`(...)	Returns the temporary GCS path for Split Generator assets.
`get_subgraph_sampler_flattened_graph_metadata_output_gcs_path`(...)	Returns the GCS path for the flattened graph metadata yaml output by Subgraph Sampler.
`get_subgraph_sampler_node_anchor_based_link_prediction_random_negatives_samples_prefix`(...)	Returns the GCS file prefix for random negative samples output by Subgraph Sampler for node anchor based link prediction.
`get_subgraph_sampler_node_anchor_based_link_prediction_samples_prefix`(...)	Returns the GCS file prefix for samples output by Subgraph Sampler for node anchor based link prediction.
`get_subgraph_sampler_node_anchor_based_link_prediction_task_dir`(...)	Returns the GCS path which Subgraph Sampler uses to store temp assets for node anchor based link prediction.
`get_subgraph_sampler_node_neighborhood_samples_dir`(...)	Returns the GCS path which Subgraph Sampler uses to store node neighborhood samples.
`get_subgraph_sampler_node_neighborhood_samples_path_prefix`(...)	Returns the GCS file prefix for node neighborhood samples output by Subgraph Sampler.
`get_subgraph_sampler_root_dir`(applied_task_identifier)	Returns the GCS path which Subgraph Sampler uses to store temp assets.
`get_subgraph_sampler_supervised_link_based_task_dir`(...)	Returns the GCS path which Subgraph Sampler uses to store temp assets for supervised link based tasks.
`get_subgraph_sampler_supervised_link_based_task_labeled_samples_prefix`(...)	Returns the GCS file prefix for labeled samples output by Subgraph Sampler for supervised link based tasks.
`get_subgraph_sampler_supervised_link_based_task_unlabeled_samples_prefix`(...)	Returns the GCS file prefix for unlabeled samples output by Subgraph Sampler for supervised link based tasks.
`get_subgraph_sampler_supervised_node_classification_labeled_samples_prefix`(...)	Returns the GCS file prefix for labeled samples output by Subgraph Sampler for supervised node classification.
`get_subgraph_sampler_supervised_node_classification_task_dir`(...)	Returns the GCS path which Subgraph Sampler uses to store temp assets for supervised node classification.
`get_subgraph_sampler_supervised_node_classification_unlabeled_samples_prefix`(...)	Returns the GCS file prefix for unlabeled samples output by Subgraph Sampler for supervised node classification.
`get_tensorboard_logs_gcs_path`(applied_task_identifier)	Returns the GCS path that is used to store tensorboard logs.
`get_tf_transform_directory_path`(...[, custom_identifier])	Returns the GCS path used by Data Preprocessor for TensorFlow Transform (TFT) assets.
`get_tf_transform_raw_data_schema_file_path`(...[, ...])	Returns the GCS path for the raw data schema file used in TensorFlow Transform.
`get_tf_transform_stats_directory_path`(...[, ...])	Returns the GCS path for the "stats" directory used by Data Preprocessor for TensorFlow Transform (TFT) assets.
`get_tf_transform_stats_file_path`(...[, custom_identifier])	Returns the GCS path for the TensorFlow transform stats file.
`get_tf_transform_temp_directory_path`(...[, ...])	Returns the GCS path for the "tft_temp_dir" used by Data Preprocessor for TensorFlow Transform (TFT) temp assets.
`get_tf_transform_visualized_facets_file_path`(...[, ...])	Returns the GCS path for the visualized facets overview HTML file.
`get_tf_transformed_features_schema_path`(...[, ...])	Returns the GCS path for the schema.pbtxt file of the transformed features (Data Preprocessor)
`get_tf_transformed_features_transform_fn_assets_directory_path`(...)	Returns the directory path for the assets of the transformed features' transform_fn (Data Preprocessor).
`get_trained_model_eval_metrics_gcs_path`(...)	Returns the GCS path for the eval metrics output by the Trainer (eval_metrics.json)
`get_trained_model_gcs_path`(applied_task_identifier)	Returns the GCS path for the trained model output by the Trainer (model.pt)
`get_trained_model_metadata_proto_gcs_path`(...)	Returns the GCS path for the trained model metadata yaml file output by the Trainer.
`get_trained_models_dir_gcs_path`(applied_task_identifier)	Returns the GCS path for the trained models directory.
`get_trained_scripted_model_gcs_path`(...)	Returns the GCS path for the scripted model output by the Trainer (scripted_model.pt)
`get_trainer_asset_dir_gcs_path`(applied_task_identifier)	Returns the GCS path for perm assets written by the Trainer (e.g. trained models, eval metrics, etc.)
`get_transformed_features_directory_path`(...[, ...])	Returns the GCS path for the directory where transformed features are written by Data Preprocessor.
`get_transformed_features_file_prefix`(...[, ...])	Returns the GCS file prefix for transformed features.

Module Contents

gigl.src.common.constants.gcs.get_applied_task_perm_gcs_path(applied_task_identifier)

Returns the GCS path for the perm assets bucket for a given gigl job.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the perm assets bucket.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_applied_task_temp_gcs_path(applied_task_identifier)

Returns the GCS path for the temp_assets bucket for a given gigl job.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the temp assets bucket.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_applied_task_temp_regional_gcs_path(applied_task_identifier)

Returns the GCS path for the temp regional assets for a given gigl job.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the temp regional assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_config_populator_assets_perm_gcs_path(applied_task_identifier)

Returns the GCS path for the config populator perm assets for a given gigl job (Used to write Frozen GBML Config).

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the config populator perm assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_data_preprocessor_assets_perm_gcs_path(applied_task_identifier)

Returns the GCS path for the data preprocessor perm assets for a given gigl job.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the data preprocessor perm assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_data_preprocessor_assets_temp_gcs_path(applied_task_identifier)

Returns the GCS path for temporary data preprocessor assets for a given gigl job.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for temporary data preprocessor assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_data_preprocessor_staging_gcs_path(applied_task_identifier)

Returns the GCS path for the staging directory of the data preprocessor assets for a given gigl job.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the staging directory of the data preprocessor assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_dataflow_staging_gcs_path(applied_task_identifier, job_name)

Returns the GCS path for the staging directory used for Dataflow Jobs.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
job_name (str) – The name of the Dataflow job.

Returns:

The GCS path for the staging directory used for Dataflow Jobs.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_dataflow_temp_gcs_path(applied_task_identifier, job_name)

Returns the GCS path for the “tmp” directory used for Dataflow Jobs.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
job_name (str) – The name of the Dataflow job.

Returns:

The GCS path for the “tmp” directory used for Dataflow Jobs.

Return type:

GcsUri
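
The two Dataflow helpers above take the same pair of arguments and differ only in which directory they return. As a rough illustration of that shape, the sketch below builds both paths from an applied task identifier and a job name. Everything in it is hypothetical: the bucket `gs://example-temp-bucket`, the `staging`/`tmp` folder layout, and the `sketch_*` function names are assumptions made for illustration, not the layout gigl actually produces, and plain strings stand in for `AppliedTaskIdentifier` and `GcsUri`.

```python
from posixpath import join

# Hypothetical bucket; the real bucket is not documented on this page.
TEMP_BUCKET = "gs://example-temp-bucket"


def sketch_dataflow_dir(applied_task_identifier: str, job_name: str, leaf: str) -> str:
    """Build <bucket>/<task>/<job_name>/<leaf> (an assumed layout)."""
    return join(TEMP_BUCKET, applied_task_identifier, job_name, leaf)


def sketch_dataflow_staging_path(applied_task_identifier: str, job_name: str) -> str:
    """Stand-in for a per-job Dataflow staging directory."""
    return sketch_dataflow_dir(applied_task_identifier, job_name, "staging")


def sketch_dataflow_temp_path(applied_task_identifier: str, job_name: str) -> str:
    """Stand-in for a per-job Dataflow "tmp" directory."""
    return sketch_dataflow_dir(applied_task_identifier, job_name, "tmp")


staging = sketch_dataflow_staging_path("my_task", "preprocess-nodes")
tmp = sketch_dataflow_temp_path("my_task", "preprocess-nodes")
print(staging)
print(tmp)
```

Keying both directories on the task and the job name keeps concurrent Dataflow jobs of the same gigl job from sharing staging or temp files, which is presumably why both helpers require `job_name`.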

gigl.src.common.constants.gcs.get_frozen_gbml_config_proto_gcs_path(applied_task_identifier)

Returns the GCS path for the frozen GBML config proto file. See: proto/snapchat/research/gbml/gbml_config.proto for more details.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the frozen GBML config proto file.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_inferencer_asset_dir_gcs_path(applied_task_identifier)

Returns the GCS path for perm assets written by the Inferencer (e.g. embeddings, predictions, etc.)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for perm assets written by the Inferencer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_inferencer_embeddings_gcs_prefix(applied_task_identifier)

Returns the GCS directory for embeddings output by the Inferencer.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS directory for embeddings output by the Inferencer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_inferencer_predictions_gcs_prefix(applied_task_identifier)

Returns the GCS directory for predictions output by the Inferencer.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS directory for predictions output by the Inferencer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_post_processor_asset_dir_gcs_path(applied_task_identifier)

Returns the GCS path for perm assets written by the Post Processor (e.g. eval metrics, etc.)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for perm assets written by the Post Processor.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_post_processor_metrics_gcs_path(applied_task_identifier)

Returns the GCS path for the eval metrics output by the Post Processor (post_processor_metrics.json)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the eval metrics output by the Post Processor.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_preprocessed_metadata_proto_gcs_path(applied_task_identifier)

Returns the GCS path for the generated PreprocessedMetadata yaml file.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the generated PreprocessedMetadata yaml file.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_split_dataset_main_samples_gcs_file_prefix(applied_task_identifier, dataset_split)

Returns the GCS file prefix for the main samples output by Split Generator.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
dataset_split (DatasetSplit) – The dataset split.

Returns:

The GCS file prefix for the main samples output by Split Generator.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_split_dataset_output_gcs_file_prefix(applied_task_identifier, dataset_split)

Returns the GCS file prefix for the samples output by Split Generator.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
dataset_split (DatasetSplit) – The dataset split.

Returns:

The GCS file prefix for the samples output by Split Generator.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_split_dataset_random_negatives_gcs_file_prefix(applied_task_identifier, node_type, dataset_split)

Returns the GCS file prefix for the random negative samples output by Split Generator.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
node_type (NodeType) – The node type.
dataset_split (DatasetSplit) – The dataset split.

Returns:

The GCS file prefix for the random negative samples output by Split Generator.

Return type:

GcsUri
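
This helper takes `node_type` before `dataset_split`, so passing both by keyword is a simple way to avoid swapping them. The sketch below illustrates the call shape only. `SketchDatasetSplit`, its member names, the bucket, and the path layout are all hypothetical stand-ins, and plain strings are used in place of `AppliedTaskIdentifier`, `NodeType`, and `GcsUri`.

```python
from enum import Enum


class SketchDatasetSplit(Enum):
    """Hypothetical stand-in for DatasetSplit; the member names are assumed."""

    TRAIN = "train"
    VAL = "val"
    TEST = "test"


def sketch_random_negatives_prefix(
    applied_task_identifier: str,
    node_type: str,
    dataset_split: SketchDatasetSplit,
) -> str:
    """Mirror the documented argument order: task, node type, then split."""
    # Assumed layout for illustration; the real prefix is whatever gigl returns.
    return (
        f"gs://example-temp-bucket/{applied_task_identifier}"
        f"/split/{dataset_split.value}/random_negatives/{node_type}/samples-"
    )


# Keyword arguments make the call independent of positional order.
prefix = sketch_random_negatives_prefix(
    applied_task_identifier="my_task",
    dataset_split=SketchDatasetSplit.TRAIN,
    node_type="user",
)
print(prefix)
```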

gigl.src.common.constants.gcs.get_split_generator_assets_temp_gcs_path(applied_task_identifier)

Returns the temporary GCS path for Split Generator assets.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The temporary GCS path for Split Generator assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_flattened_graph_metadata_output_gcs_path(applied_task_identifier)

Returns the GCS path for the flattened graph metadata yaml output by Subgraph Sampler. See: proto/snapchat/research/gbml/flattened_graph_metadata.proto for more details.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the flattened graph metadata yaml output by Subgraph Sampler.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_node_anchor_based_link_prediction_random_negatives_samples_prefix(applied_task_identifier, node_type)

Returns the GCS file prefix for random negative samples output by Subgraph Sampler for node anchor based link prediction.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
node_type (NodeType) – The node type.

Returns:

The GCS file prefix for random negative samples output by Subgraph Sampler for node anchor based link prediction.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_node_anchor_based_link_prediction_samples_prefix(applied_task_identifier)

Returns the GCS file prefix for samples output by Subgraph Sampler for node anchor based link prediction.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS file prefix for samples output by Subgraph Sampler for node anchor based link prediction.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_node_anchor_based_link_prediction_task_dir(applied_task_identifier)

Returns the GCS path which Subgraph Sampler uses to store temp assets for node anchor based link prediction.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path which Subgraph Sampler uses to store temp assets for node anchor based link prediction.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_node_neighborhood_samples_dir(applied_task_identifier)

Returns the GCS path which Subgraph Sampler uses to store node neighborhood samples.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path which Subgraph Sampler uses to store node neighborhood samples.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_node_neighborhood_samples_path_prefix(applied_task_identifier)

Returns the GCS file prefix for node neighborhood samples output by Subgraph Sampler.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS file prefix for node neighborhood samples output by Subgraph Sampler.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_root_dir(applied_task_identifier)

Returns the GCS path which Subgraph Sampler uses to store temp assets.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path which Subgraph Sampler uses to store temp assets.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_supervised_link_based_task_dir(applied_task_identifier)

Returns the GCS path which Subgraph Sampler uses to store temp assets for supervised link based tasks.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path which Subgraph Sampler uses to store temp assets for supervised link based tasks.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_supervised_link_based_task_labeled_samples_prefix(applied_task_identifier)

Returns the GCS file prefix for labeled samples output by Subgraph Sampler for supervised link based tasks.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS file prefix for labeled samples output by Subgraph Sampler for supervised link based tasks.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_supervised_link_based_task_unlabeled_samples_prefix(applied_task_identifier)

Returns the GCS file prefix for unlabeled samples output by Subgraph Sampler for supervised link based tasks.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS file prefix for unlabeled samples output by Subgraph Sampler for supervised link based tasks.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_supervised_node_classification_labeled_samples_prefix(applied_task_identifier)

Returns the GCS file prefix for labeled samples output by Subgraph Sampler for supervised node classification.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS file prefix for labeled samples output by Subgraph Sampler for supervised node classification.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_supervised_node_classification_task_dir(applied_task_identifier)

Returns the GCS path which Subgraph Sampler uses to store temp assets for supervised node classification.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path which Subgraph Sampler uses to store temp assets for supervised node classification.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_subgraph_sampler_supervised_node_classification_unlabeled_samples_prefix(applied_task_identifier)

Returns the GCS file prefix for unlabeled samples output by Subgraph Sampler for supervised node classification.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS file prefix for unlabeled samples output by Subgraph Sampler for supervised node classification.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_tensorboard_logs_gcs_path(applied_task_identifier)

Returns the GCS path that is used to store tensorboard logs.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path that is used to store tensorboard logs.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_tf_transform_directory_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path used by Data Preprocessor for TensorFlow Transform (TFT) assets.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of feature.
entity_type (Union[NodeType, EdgeType]) – The type of entity (node or edge).
custom_identifier (Optional[str]) – Custom identifier for the directory path. Defaults to “”.

Returns:

The GCS path for the directory used by Data Preprocessor for TensorFlow Transform (TFT) assets.

Return type:

GcsUri
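
All of the TensorFlow Transform helpers share this four-argument signature, with `custom_identifier` defaulting to an empty string. The sketch below shows one way such an optional identifier could affect a path, under the assumption that an empty identifier adds no extra segment. That behaviour, the bucket, the `tft` folder name, and the `sketch_tf_transform_dir` function are all hypothetical, and strings stand in for `FeatureTypes`, `NodeType`/`EdgeType`, and `GcsUri`.

```python
from typing import List

# Hypothetical bucket; the real bucket is not documented on this page.
BUCKET = "gs://example-perm-bucket"


def sketch_tf_transform_dir(
    applied_task_identifier: str,
    feature_type: str,
    entity_type: str,
    custom_identifier: str = "",
) -> str:
    """Build a per-feature-type, per-entity directory (assumed layout)."""
    parts: List[str] = [BUCKET, applied_task_identifier, "tft", feature_type, entity_type]
    # Assumption: an empty identifier contributes no path segment.
    if custom_identifier:
        parts.append(custom_identifier)
    return "/".join(parts)


default_dir = sketch_tf_transform_dir("my_task", "node", "user")
custom_dir = sketch_tf_transform_dir("my_task", "node", "user", custom_identifier="run2")
print(default_dir)
print(custom_dir)
```

Whatever the real layout is, the same four argument values need to be passed to every helper in this family for the returned paths to refer to the same set of assets.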

gigl.src.common.constants.gcs.get_tf_transform_raw_data_schema_file_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the raw data schema file used in TensorFlow Transform.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of feature.
entity_type (Union[NodeType, EdgeType]) – The type of entity (node or edge).
custom_identifier (Optional[str]) – Custom identifier for the file path. Defaults to “”.

Returns:

The GCS path for the raw data schema file used in TensorFlow Transform.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_tf_transform_stats_directory_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the “stats” directory used by Data Preprocessor for TensorFlow Transform (TFT) assets.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of feature.
entity_type (Union[NodeType, EdgeType]) – The type of entity (node or edge).
custom_identifier (Optional[str]) – Custom identifier for the directory path. Defaults to “”.

Returns:

The GCS path for the “stats” directory used by Data Preprocessor for TensorFlow Transform (TFT) assets.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_tf_transform_stats_file_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the TensorFlow transform stats file.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The feature type.
entity_type (Union[NodeType, EdgeType]) – The entity type.
custom_identifier (Optional[str]) – The custom identifier. Defaults to “”.

Returns:

The GCS path for the TensorFlow transform stats file.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_tf_transform_temp_directory_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the “tft_temp_dir” used by Data Preprocessor for TensorFlow Transform (TFT) temp assets.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of feature.
entity_type (Union[NodeType, EdgeType]) – The type of entity.
custom_identifier (Optional[str]) – Custom identifier for the directory path. Defaults to “”.

Returns:

The GCS path for the “tft_temp_dir” used by Data Preprocessor for TensorFlow Transform (TFT) temp assets.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_tf_transform_visualized_facets_file_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the visualized facets overview HTML file.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of feature.
entity_type (Union[NodeType, EdgeType]) – The type of entity (node or edge).
custom_identifier (Optional[str]) – Custom identifier for the file path. Defaults to “”.

Returns:

The GCS path for the visualized facets overview HTML file.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_tf_transformed_features_schema_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the schema.pbtxt file of the transformed features (Data Preprocessor)

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of the features.
entity_type (Union[NodeType, EdgeType]) – The type of the entity (node or edge).
custom_identifier (Optional[str]) – Custom identifier for the GCS path. Defaults to “”.

Returns:

The GCS path for the schema.pbtxt file of the transformed features.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_tf_transformed_features_transform_fn_assets_directory_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the directory path for the assets of the transformed features’ transform_fn (Data Preprocessor).

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of the feature.
entity_type (Union[NodeType, EdgeType]) – The type of the entity.
custom_identifier (Optional[str]) – Custom identifier for the directory path. Defaults to “”.

Returns:

The GCS path for the directory of the assets of the transformed features’ transform_fn.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_trained_model_eval_metrics_gcs_path(applied_task_identifier)

Returns the GCS path for the eval metrics output by the Trainer (eval_metrics.json)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the eval metrics output by the Trainer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_trained_model_gcs_path(applied_task_identifier)

Returns the GCS path for the trained model output by the Trainer (model.pt)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the trained model output by the Trainer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_trained_model_metadata_proto_gcs_path(applied_task_identifier)

Returns the GCS path for the trained model metadata yaml file output by the Trainer. See: proto/snapchat/research/gbml/trained_model_metadata.proto for more details.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the trained model metadata yaml file output by the Trainer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_trained_models_dir_gcs_path(applied_task_identifier)

Returns the GCS path for the trained models directory.

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the trained models directory.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_trained_scripted_model_gcs_path(applied_task_identifier)

Returns the GCS path for the scripted model output by the Trainer (scripted_model.pt)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for the scripted model output by the Trainer.
Return type:: GcsUri

gigl.src.common.constants.gcs.get_trainer_asset_dir_gcs_path(applied_task_identifier)

Returns the GCS path for perm assets written by the Trainer (e.g. trained models, eval metrics, etc.)

Parameters:: applied_task_identifier (AppliedTaskIdentifier) – The job name.
Returns:: The GCS path for perm assets written by the Trainer.
Return type:: GcsUri
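
The Trainer helpers above name three output files (model.pt, scripted_model.pt, eval_metrics.json) and describe the Trainer asset directory as holding trained models and eval metrics. The sketch below shows one possible arrangement of those files under a single per-job root. Only the three file names come from this page. The bucket, the `trainer` and `models` folder names, the placement of each file, and the `sketch_*` functions are assumptions for illustration, with strings standing in for `AppliedTaskIdentifier` and `GcsUri`.

```python
from posixpath import join
from typing import Dict

# Hypothetical bucket; the real bucket is not documented on this page.
PERM_BUCKET = "gs://example-perm-bucket"


def sketch_trainer_asset_dir(applied_task_identifier: str) -> str:
    """Stand-in for the per-job directory of Trainer perm assets."""
    return join(PERM_BUCKET, applied_task_identifier, "trainer")


def sketch_trainer_outputs(applied_task_identifier: str) -> Dict[str, str]:
    """Derive every Trainer output path from one root (assumed layout)."""
    root = sketch_trainer_asset_dir(applied_task_identifier)
    models_dir = join(root, "models")
    return {
        "model": join(models_dir, "model.pt"),
        "scripted_model": join(models_dir, "scripted_model.pt"),
        "eval_metrics": join(root, "eval_metrics.json"),
    }


outputs = sketch_trainer_outputs("my_task")
for name, path in outputs.items():
    print(name, path)
```

Deriving each file path from one root function means a change to the root moves every dependent path together, which is the main benefit of centralising path construction in a module like this one.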

gigl.src.common.constants.gcs.get_transformed_features_directory_path(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS path for the directory where transformed features are written by Data Preprocessor.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of feature.
entity_type (Union[NodeType, EdgeType]) – The type of entity (node or edge).
custom_identifier (Optional[str]) – Custom identifier for the directory path. Defaults to “”.

Returns:

The GCS path for the directory of the transformed features.

Return type:

GcsUri

gigl.src.common.constants.gcs.get_transformed_features_file_prefix(applied_task_identifier, feature_type, entity_type, custom_identifier='')

Returns the GCS file prefix for transformed features.

Parameters:

applied_task_identifier (AppliedTaskIdentifier) – The job name.
feature_type (FeatureTypes) – The type of the feature.
entity_type (Union[NodeType, EdgeType]) – The type of the entity.
custom_identifier (Optional[str]) – Custom identifier for the file prefix. Defaults to “”.

Returns:

The GCS file prefix for transformed features.

Return type:

GcsUri