Orchestration#
GiGL is designed to support easy end-to-end orchestration of your GNN tasks/workflows with minimal setup required. This page outlines three ways to orchestrate GiGL after you have set up your configs (see: quick start if you have not done so).
Local Runner#
The local runner provides a simple interface to kick off an end-to-end GiGL pipeline.
```python
@dataclass
class PipelineConfig:
    """
    Configuration for the GiGL pipeline.

    Args:
        applied_task_identifier (AppliedTaskIdentifier): Your job name.
        task_config_uri (Uri): URI to your template task config.
        resource_config_uri (Uri): URI to your resource config.
        custom_cuda_docker_uri (Optional[str]): For custom training spec and GPU training on Vertex AI.
        custom_cpu_docker_uri (Optional[str]): For custom training spec and CPU training on Vertex AI.
        dataflow_docker_uri (Optional[str]): For custom data preprocessor spec that will run in Dataflow.
    """

    applied_task_identifier: AppliedTaskIdentifier
    task_config_uri: Uri
    resource_config_uri: Uri
    custom_cuda_docker_uri: Optional[str] = None
    custom_cpu_docker_uri: Optional[str] = None
    dataflow_docker_uri: Optional[str] = GIGL_DATAFLOW_IMAGE
```
Example:

```python
from gigl.orchestration.local.runner import PipelineConfig
from gigl.src.common.types import AppliedTaskIdentifier
from gigl.common import UriFactory, Uri

pipeline_config = PipelineConfig(
    applied_task_identifier=AppliedTaskIdentifier("demo-gigl-job"),
    task_config_uri=UriFactory.create_uri("gs://project/my_task_config.yaml"),
    resource_config_uri=UriFactory.create_uri("gs://project/my_resource_config.yaml"),
)
```
Initialize and Run#
Now you can use the GiGL runner to kick off a pipeline.

Example:

```python
runner = Runner.run(pipeline_config=pipeline_config)
```
Optional: the runner also supports running individual components as needed.

Example:

```python
Runner.run_data_preprocessor(pipeline_config=pipeline_config)
```
VertexAI (Kubeflow) Orchestration#
TODO: (svij) - This section will be updated soon.
GiGL also supports orchestration of your workflows with Kubeflow Pipelines via the KfpOrchestrator class. We make use of Vertex AI to run the Kubeflow Pipelines.
Usage Example#
```python
from gigl.orchestration.kubeflow.runner import KfpOrchestrator
from gigl.common import UriFactory
from gigl.src.common.types import AppliedTaskIdentifier

orchestrator = KfpOrchestrator()

task_config_uri = UriFactory.create_uri("gs://path/to/task_config.yaml")
resource_config_uri = UriFactory.create_uri("gs://path/to/resource_config.yaml")
applied_task_identifier = AppliedTaskIdentifier("kfp_demo")

orchestrator.run(
    applied_task_identifier=applied_task_identifier,
    task_config_uri=task_config_uri,
    resource_config_uri=resource_config_uri,
    start_at="config_populator",
)
```
Importable GiGL#
You may want to integrate GiGL into your existing workflows or build custom orchestration logic. You can do this by importing the components and calling each one's .run() method.
Trainer Component Example:#
```python
from gigl.src.training.trainer import Trainer

trainer = Trainer()
trainer.run(
    applied_task_identifier=job_name,
    task_config_uri="gs://...",
    resource_config_uri="gs://...",
)
```
For component-specific parameters and information, see Components.
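To illustrate the pattern (not GiGL's actual API): because each component exposes a `.run()` entry point with the same config arguments, custom orchestration reduces to calling components in whatever order your workflow needs. Below is a minimal, library-independent sketch of that idea; all names here (`FakeComponent`, `run_sequence`) are hypothetical stand-ins, not GiGL classes.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in for a GiGL component: it exposes a run() method
# taking the same config arguments as the Trainer example above.
@dataclass
class FakeComponent:
    name: str
    log: List[str] = field(default_factory=list)

    def run(
        self,
        applied_task_identifier: str,
        task_config_uri: str,
        resource_config_uri: str,
    ) -> None:
        # A real component would do its work here; we just record the call.
        self.log.append(f"{self.name}:{applied_task_identifier}")

def run_sequence(components, **config) -> None:
    """Run components in order, passing the same config to each."""
    for component in components:
        component.run(**config)

preprocessor = FakeComponent("data_preprocessor")
trainer = FakeComponent("trainer")
run_sequence(
    [preprocessor, trainer],
    applied_task_identifier="demo-job",
    task_config_uri="gs://project/task_config.yaml",
    resource_config_uri="gs://project/resource_config.yaml",
)
```

In a real workflow you would swap the stand-ins for imported GiGL components and insert any custom logic (validation, notifications, retries) between the `.run()` calls.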