Orchestration#

GiGL is designed to support easy end-to-end orchestration of your GNN tasks/workflows with minimal setup. This page outlines three ways to orchestrate GiGL once you have set up your configs (see: quick start if you have not done so).

Local Runner#

The local runner provides a simple interface for kicking off an end-to-end GiGL pipeline.

  1. Create a gigl.orchestration.local.runner.PipelineConfig.

@dataclass
class PipelineConfig:
    """
    Configuration for the GiGL pipeline.

    Args:
        applied_task_identifier (AppliedTaskIdentifier): your job name
        task_config_uri (Uri): URI to your template task config
        resource_config_uri (Uri): URI to your resource config
        custom_cuda_docker_uri (Optional[str]): For custom training spec and GPU training on VertexAI
        custom_cpu_docker_uri (Optional[str]): For custom training spec and CPU training on VertexAI
        dataflow_docker_uri (Optional[str]): For custom datapreprocessor spec that will run in dataflow
    """

    applied_task_identifier: AppliedTaskIdentifier
    task_config_uri: Uri
    resource_config_uri: Uri
    custom_cuda_docker_uri: Optional[str] = None
    custom_cpu_docker_uri: Optional[str] = None
    dataflow_docker_uri: Optional[str] = GIGL_DATAFLOW_IMAGE

Example:

from gigl.orchestration.local.runner import PipelineConfig
from gigl.src.common.types import AppliedTaskIdentifier
from gigl.common import UriFactory

pipeline_config = PipelineConfig(
    applied_task_identifier=AppliedTaskIdentifier("demo-gigl-job"),
    task_config_uri=UriFactory.create_uri("gs://project/my_task_config.yaml"),
    resource_config_uri=UriFactory.create_uri("gs://project/my_resource_config.yaml"),
)
  2. Run the pipeline

Now you can kick off a pipeline with the runner.

Example:

from gigl.orchestration.local.runner import Runner

Runner.run(pipeline_config=pipeline_config)

  3. Optional: the runner also supports running individual components as needed

Example:

Runner.run_data_preprocessor(pipeline_config=pipeline_config)

VertexAI (Kubeflow) Orchestration#

TODO: (svij) - This section will be updated soon.

GiGL also supports orchestrating your workflows as Kubeflow Pipelines via the KfpOrchestrator class, which runs them on Vertex AI.

Usage Example#


from gigl.orchestration.kubeflow.runner import KfpOrchestrator
from gigl.common import UriFactory
from gigl.src.common.types import AppliedTaskIdentifier

orchestrator = KfpOrchestrator()

task_config_uri = UriFactory.create_uri("gs://path/to/task_config.yaml")
resource_config_uri = UriFactory.create_uri("gs://path/to/resource_config.yaml")
applied_task_identifier = AppliedTaskIdentifier("kfp_demo")

orchestrator.run(
    applied_task_identifier=applied_task_identifier,
    task_config_uri=task_config_uri,
    resource_config_uri=resource_config_uri,
    start_at="config_populator"
)

Importable GiGL#

You may want to integrate GiGL into your existing workflows or build custom orchestration logic. You can do this by importing the components and calling each one's .run() method.

Trainer Component Example:#

from gigl.src.common.types import AppliedTaskIdentifier
from gigl.src.training.trainer import Trainer

trainer = Trainer()
trainer.run(
    applied_task_identifier=AppliedTaskIdentifier("demo-job"),
    task_config_uri="gs://...",
    resource_config_uri="gs://...",
)
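When rolling your own orchestration, the pattern is simply to call each component's .run() in order, passing the same job identifier and config URIs to each. Below is a minimal, self-contained sketch of that pattern; it uses hypothetical stand-in stub classes rather than the real GiGL components, purely to illustrate the shared interface and the sequencing logic.

```python
# Illustrative sketch only: StubComponent stands in for a real GiGL
# component (e.g. DataPreprocessor, Trainer) that exposes a .run()
# method taking a job identifier and config URIs. It records each call
# instead of launching a workload.
from dataclasses import dataclass, field
from typing import List


@dataclass
class StubComponent:
    name: str
    log: List[str] = field(default_factory=list)

    def run(
        self,
        applied_task_identifier: str,
        task_config_uri: str,
        resource_config_uri: str,
    ) -> None:
        # A real component would launch its workload here.
        self.log.append(f"{self.name}:{applied_task_identifier}")


def run_pipeline(
    components: List[StubComponent],
    job_name: str,
    task_config_uri: str,
    resource_config_uri: str,
) -> None:
    """Custom orchestration: run each component in order."""
    for component in components:
        component.run(
            applied_task_identifier=job_name,
            task_config_uri=task_config_uri,
            resource_config_uri=resource_config_uri,
        )


preprocessor = StubComponent(name="data_preprocessor")
trainer = StubComponent(name="trainer")
run_pipeline(
    [preprocessor, trainer],
    job_name="demo-job",
    task_config_uri="gs://task.yaml",
    resource_config_uri="gs://resource.yaml",
)
```

In a real workflow you would swap the stubs for the imported GiGL components, and could add retries, conditional skips, or logging between steps as your orchestration logic requires.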

For component-specific parameters and information, see Components.