gigl.orchestration.kubeflow.runner#
This script runs a Kubeflow pipeline on Vertex AI (VAI). You can RUN a pipeline, COMPILE a pipeline, or RUN a precompiled pipeline without compiling it, i.e. when you already have a compiled pipeline stored somewhere.
- RUNNING A PIPELINE:
python -m gigl.orchestration.kubeflow.runner --action=run ...args
The following arguments are required:
- --task_config_uri: GCS URI to the template or frozen task config.
- --resource_config_uri: GCS URI to the resource config.
- --container_image_cuda: GiGL source code image compiled for use with CUDA. See containers/Dockerfile.src
- --container_image_cpu: GiGL source code image compiled for use with CPU. See containers/Dockerfile.src
- --container_image_dataflow: GiGL source code image compiled for use with Dataflow. See containers/Dockerfile.dataflow.src
- The following arguments are optional:
- --job_name: The name to give to the KFP job. Default is "gigl_run_at_<current_time>".
- --start_at: The component to start the pipeline at. Default is config_populator. See gigl.src.common.constants.components.GiGLComponents
- --stop_after: The component to stop the pipeline at. Default is None.
- --pipeline_tag: Optional tag which, if provided, will be used to tag the pipeline description.
- --compiled_pipeline_path: The path where the compiled pipeline will be stored.
- --wait: Wait for the pipeline run to finish.
- --additional_job_args: Additional job arguments for the pipeline components, by component.
The value must be of the form "<gigl_component>.<arg_name>=<value>", where <gigl_component> is one of the string representations of a component specified in gigl.src.common.constants.components.GiGLComponents. This argument can be repeated. Example:
--additional_job_args=subgraph_sampler.additional_spark35_jar_file_uris='gs://path/to/jar' --additional_job_args=split_generator.some_other_arg='value'
This passes additional_spark35_jar_file_uris="gs://path/to/jar" to subgraph_sampler at compile time and some_other_arg="value" to split_generator at compile time.
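The "<gigl_component>.<arg_name>=<value>" convention above can be illustrated with a small parser. This is a minimal sketch under stated assumptions: the helper name `parse_additional_job_args` is hypothetical and not part of the runner's actual API; it only demonstrates how the repeated flag values group into a per-component mapping.

```python
from collections import defaultdict


def parse_additional_job_args(raw_args):
    """Hypothetical helper: group repeated --additional_job_args values by component.

    Each value is expected to look like "<gigl_component>.<arg_name>=<value>",
    e.g. "subgraph_sampler.additional_spark35_jar_file_uris=gs://path/to/jar".
    """
    args_by_component = defaultdict(dict)
    for raw in raw_args:
        # Split off the value at the first "=", then split the target into
        # component and argument name at the first ".".
        target, sep, value = raw.partition("=")
        component, dot, arg_name = target.partition(".")
        if not (sep and dot and component and arg_name):
            raise ValueError(f"Malformed additional job arg: {raw!r}")
        args_by_component[component][arg_name] = value
    return dict(args_by_component)
```

With the example flags from above, this would yield one dict of extra arguments for subgraph_sampler and one for split_generator.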
- RUNNING A PRECOMPILED PIPELINE:
Alternatively, use run_no_compile if you already have a precompiled pipeline: python -m gigl.orchestration.kubeflow.runner --action=run_no_compile ...args
The following arguments are required:
- --task_config_uri
- --resource_config_uri
- --compiled_pipeline_path: The path to a pre-compiled pipeline; can be a GCS URI (gs://...) or a local path.
- The following arguments are optional:
- --job_name
- --start_at
- --stop_after
- --pipeline_tag
- --wait
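An invocation might look like the following sketch. All the GCS URIs below are hypothetical placeholders, not real resources, so the command is echoed rather than executed:

```shell
# Hypothetical URIs -- placeholders for illustration only.
TASK_CONFIG_URI="gs://my-bucket/configs/frozen_task_config.yaml"
RESOURCE_CONFIG_URI="gs://my-bucket/configs/resource_config.yaml"
COMPILED_PIPELINE_PATH="gs://my-bucket/pipelines/compiled_pipeline.yaml"

# Assemble the run_no_compile command; --wait blocks until the run finishes.
CMD="python -m gigl.orchestration.kubeflow.runner \
  --action=run_no_compile \
  --task_config_uri=${TASK_CONFIG_URI} \
  --resource_config_uri=${RESOURCE_CONFIG_URI} \
  --compiled_pipeline_path=${COMPILED_PIPELINE_PATH} \
  --wait"

# Print the command instead of running it, since the URIs are placeholders.
echo "${CMD}"
```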
- COMPILING A PIPELINE:
A strict subset of running a pipeline: python -m gigl.orchestration.kubeflow.runner --action=compile ...args
The following arguments are required:
- --container_image_cuda
- --container_image_cpu
- --container_image_dataflow
- The following arguments are optional:
- --compiled_pipeline_path: The path where the compiled pipeline will be stored.
- --pipeline_tag: Optional tag which, if provided, will be used to tag the pipeline description.
- --additional_job_args: Additional job arguments for the pipeline components, by component.
The value must be of the form "<gigl_component>.<arg_name>=<value>", where <gigl_component> is one of the string representations of a component specified in gigl.src.common.constants.components.GiGLComponents. This argument can be repeated. Example:
--additional_job_args=subgraph_sampler.additional_spark35_jar_file_uris='gs://path/to/jar' --additional_job_args=split_generator.some_other_arg='value'
This passes additional_spark35_jar_file_uris="gs://path/to/jar" to subgraph_sampler at compile time and some_other_arg="value" to split_generator at compile time.
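A compile-only invocation might look like the sketch below. The container image URIs are hypothetical placeholders, so the command is echoed rather than executed:

```shell
# Hypothetical image URIs -- placeholders for illustration only.
CUDA_IMAGE="gcr.io/my-project/gigl-src-cuda:latest"
CPU_IMAGE="gcr.io/my-project/gigl-src-cpu:latest"
DATAFLOW_IMAGE="gcr.io/my-project/gigl-dataflow:latest"

# Assemble the compile command; the pipeline is written to a local file here,
# but --compiled_pipeline_path could also be a GCS URI.
CMD="python -m gigl.orchestration.kubeflow.runner \
  --action=compile \
  --container_image_cuda=${CUDA_IMAGE} \
  --container_image_cpu=${CPU_IMAGE} \
  --container_image_dataflow=${DATAFLOW_IMAGE} \
  --compiled_pipeline_path=compiled_pipeline.yaml"

# Print the command instead of running it, since the image URIs are placeholders.
echo "${CMD}"
```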
Attributes#
Classes#
Generic enumeration.