gigl.orchestration.kubeflow.runner#
This script runs a Kubeflow pipeline on Vertex AI (VAI). You can RUN a pipeline, COMPILE a pipeline, or RUN a precompiled pipeline without compiling it, i.e. when you already have a compiled pipeline stored somewhere.
- RUNNING A PIPELINE:
python -m gigl.orchestration.kubeflow.runner --action=run ...args
The following arguments are required:
- --task_config_uri: GCS URI to the template or frozen task config.
- --resource_config_uri: GCS URI to the resource config.
- --container_image_cuda: GiGL source code image compiled for use with CUDA. See containers/Dockerfile.src
- --container_image_cpu: GiGL source code image compiled for use with CPU. See containers/Dockerfile.src
- --container_image_dataflow: GiGL source code image compiled for use with Dataflow. See containers/Dockerfile.dataflow.src
- The following arguments are optional:
- --job_name: The name to give to the KFP job. Default is "gigl_run_at_<current_time>".
- --start_at: The component to start the pipeline at. Default is config_populator. See gigl.src.common.constants.components.GiGLComponents
- --stop_after: The component to stop the pipeline at. Default is None.
- --pipeline_tag: Optional tag which, if provided, will be used to tag the pipeline description.
- --compiled_pipeline_path: The path where the compiled pipeline will be stored.
- --wait: Wait for the pipeline run to finish.
- --additional_job_args: Additional job arguments for the pipeline components, by component.
The value must be of the form "<gigl_component>.<arg_name>=<value>", where <gigl_component> is one of the string representations of a component specified in gigl.src.common.constants.components.GiGLComponents. This argument can be repeated. Example:
--additional_job_args=subgraph_sampler.additional_spark35_jar_file_uris='gs://path/to/jar' --additional_job_args=split_generator.some_other_arg='value'
This passes additional_spark35_jar_file_uris="gs://path/to/jar" to subgraph_sampler at compile time and some_other_arg="value" to split_generator at compile time.
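The "<gigl_component>.<arg_name>=<value>" convention above can be illustrated with a small parser. This is a minimal sketch under stated assumptions: the helper name `parse_additional_job_args` is hypothetical and not part of the runner's actual API; it only demonstrates how the repeated flag values group into a per-component mapping.

```python
from collections import defaultdict


def parse_additional_job_args(raw_args):
    """Hypothetical helper: group repeated --additional_job_args values by component.

    Each value is expected to look like "<gigl_component>.<arg_name>=<value>",
    e.g. "subgraph_sampler.additional_spark35_jar_file_uris=gs://path/to/jar".
    """
    args_by_component = defaultdict(dict)
    for raw in raw_args:
        # Split off the value at the first "=", then split the target into
        # component and argument name at the first ".".
        target, sep, value = raw.partition("=")
        component, dot, arg_name = target.partition(".")
        if not (sep and dot and component and arg_name):
            raise ValueError(f"Malformed additional job arg: {raw!r}")
        args_by_component[component][arg_name] = value
    return dict(args_by_component)
```

With the example flags from above, this would yield one dict of extra arguments for subgraph_sampler and one for split_generator.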
- RUNNING A PRECOMPILED PIPELINE:
Alternatively, use run_no_compile if you already have a precompiled pipeline: python -m gigl.orchestration.kubeflow.runner --action=run_no_compile ...args
The following arguments are required:
- --task_config_uri
- --resource_config_uri
- --compiled_pipeline_path: The path to a pre-compiled pipeline; can be a GCS URI (gs://...) or a local path.
- The following arguments are optional:
- --job_name
- --start_at
- --stop_after
- --pipeline_tag
- --wait
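An invocation might look like the following sketch. All the GCS URIs below are hypothetical placeholders, not real resources, so the command is echoed rather than executed:

```shell
# Hypothetical URIs -- placeholders for illustration only.
TASK_CONFIG_URI="gs://my-bucket/configs/frozen_task_config.yaml"
RESOURCE_CONFIG_URI="gs://my-bucket/configs/resource_config.yaml"
COMPILED_PIPELINE_PATH="gs://my-bucket/pipelines/compiled_pipeline.yaml"

# Assemble the run_no_compile command; --wait blocks until the run finishes.
CMD="python -m gigl.orchestration.kubeflow.runner \
  --action=run_no_compile \
  --task_config_uri=${TASK_CONFIG_URI} \
  --resource_config_uri=${RESOURCE_CONFIG_URI} \
  --compiled_pipeline_path=${COMPILED_PIPELINE_PATH} \
  --wait"

# Print the command instead of running it, since the URIs are placeholders.
echo "${CMD}"
```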
- COMPILING A PIPELINE:
A strict subset of running a pipeline: python -m gigl.orchestration.kubeflow.runner --action=compile ...args
The following arguments are required:
- --container_image_cuda
- --container_image_cpu
- --container_image_dataflow
- The following arguments are optional:
- --compiled_pipeline_path: The path where the compiled pipeline will be stored.
- --pipeline_tag: Optional tag which, if provided, will be used to tag the pipeline description.
- --additional_job_args: Additional job arguments for the pipeline components, by component.
The value must be of the form "<gigl_component>.<arg_name>=<value>", where <gigl_component> is one of the string representations of a component specified in gigl.src.common.constants.components.GiGLComponents. This argument can be repeated. Example:
--additional_job_args=subgraph_sampler.additional_spark35_jar_file_uris='gs://path/to/jar' --additional_job_args=split_generator.some_other_arg='value'
This passes additional_spark35_jar_file_uris="gs://path/to/jar" to subgraph_sampler at compile time and some_other_arg="value" to split_generator at compile time.
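A compile-only invocation might look like the sketch below. The container image URIs are hypothetical placeholders, so the command is echoed rather than executed:

```shell
# Hypothetical image URIs -- placeholders for illustration only.
CUDA_IMAGE="gcr.io/my-project/gigl-src-cuda:latest"
CPU_IMAGE="gcr.io/my-project/gigl-src-cpu:latest"
DATAFLOW_IMAGE="gcr.io/my-project/gigl-dataflow:latest"

# Assemble the compile command; the pipeline is written to a local file here,
# but --compiled_pipeline_path could also be a GCS URI.
CMD="python -m gigl.orchestration.kubeflow.runner \
  --action=compile \
  --container_image_cuda=${CUDA_IMAGE} \
  --container_image_cpu=${CPU_IMAGE} \
  --container_image_dataflow=${DATAFLOW_IMAGE} \
  --compiled_pipeline_path=compiled_pipeline.yaml"

# Print the command instead of running it, since the image URIs are placeholders.
echo "${CMD}"
```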
Attributes#
Classes#
Generic enumeration.