GiGL Architecture

GiGL now has two execution models:

  • The current, recommended path uses in-memory subgraph sampling: the graph is loaded into memory and sampled live during training and inference.

  • The older tabularized path materializes sampled subgraphs ahead of time through Subgraph Sampler and Split Generator. NOTE: The tabularized version of GiGL will be removed in a future release. If you are looking for this pipeline, see Deprecated tabularized docs.

This page focuses on the current in-memory subgraph sampling architecture and points to the legacy docs separately.
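
The difference between the two execution models can be illustrated with a toy sketch. This is not GiGL code: the graph, the function names (`sample_neighbors`, `materialize_subgraphs`, `live_batches`), and the `fanout` parameter are all hypothetical, and the uniform neighbor sampling shown here is only an assumption made for illustration.

```python
import random
from typing import Dict, Iterator, List, Tuple

# Toy adjacency list standing in for a graph held in memory (hypothetical data).
GRAPH: Dict[int, List[int]] = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}


def sample_neighbors(
    graph: Dict[int, List[int]], node: int, fanout: int, rng: random.Random
) -> List[int]:
    """Sample up to `fanout` neighbors of `node` without replacement."""
    neighbors = graph[node]
    return rng.sample(neighbors, min(fanout, len(neighbors)))


def materialize_subgraphs(
    graph: Dict[int, List[int]], fanout: int, rng: random.Random
) -> Dict[int, List[int]]:
    """Tabularized style: sample once for every node ahead of time and store the result."""
    return {node: sample_neighbors(graph, node, fanout, rng) for node in graph}


def live_batches(
    graph: Dict[int, List[int]], seeds: List[int], fanout: int, rng: random.Random
) -> Iterator[Tuple[int, List[int]]]:
    """In-memory style: sample when a seed is requested, so each pass can draw a fresh sample."""
    for seed in seeds:
        yield seed, sample_neighbors(graph, seed, fanout, rng)


rng = random.Random(0)
precomputed = materialize_subgraphs(GRAPH, fanout=2, rng=rng)  # fixed before training
live = list(live_batches(GRAPH, seeds=[0, 2], fanout=2, rng=rng))  # drawn during training
```

In the precomputed variant every consumer reads the same stored samples, while the live variant only needs the graph itself and can resample on each pass.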

Primary Pipeline Flow

The primary GiGL flow is:

Config Populator -> Data Preprocessor -> Trainer (optional) -> Inferencer -> Post Processor

Trainer is optional. Inference-only pipelines skip training and run inference against a graph using a pre-trained model.
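
As a minimal sketch of how the optional Trainer changes the stage sequence, the snippet below models the flow as an ordered list. The names `PRIMARY_FLOW`, `stages_to_run`, and `skip_training` are hypothetical and chosen for illustration; they are not GiGL identifiers or configuration flags.

```python
from typing import List

# Stage names as listed in the primary flow above.
PRIMARY_FLOW: List[str] = [
    "Config Populator",
    "Data Preprocessor",
    "Trainer",
    "Inferencer",
    "Post Processor",
]


def stages_to_run(skip_training: bool = False) -> List[str]:
    """Return the ordered stages, dropping Trainer for inference-only pipelines."""
    if skip_training:
        return [stage for stage in PRIMARY_FLOW if stage != "Trainer"]
    return list(PRIMARY_FLOW)


full_pipeline = stages_to_run()
inference_only = stages_to_run(skip_training=True)
```

An inference-only run keeps the remaining stages in the same order, and the Inferencer then relies on a pre-trained model instead of one produced by the Trainer.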

For the shared runtime behavior behind the current path, see In-Memory Subgraph Sampling.

Components

Config Populator: Freezes the template task config into a runnable GbmlConfig.

Data Preprocessor: Builds graph metadata, transforms features, and enumerates node IDs into compact integer IDs.

Trainer: Launches either legacy (tabularized) training or in-memory subgraph sampling training.

Inferencer: Launches either legacy (tabularized) inference or in-memory subgraph sampling inference.

Post Processor: Restores original node IDs for outputs produced by in-memory subgraph sampling and runs optional user-defined post-processing logic.
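
The ID handling described for Data Preprocessor and Post Processor can be sketched as a simple round trip: original node IDs are mapped to compact integers before training and inference, and the mapping is inverted on the outputs afterwards. The helpers `enumerate_node_ids` and `restore_node_ids` below are hypothetical and only illustrate the idea; the actual GiGL implementation may differ.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def enumerate_node_ids(
    original_ids: Iterable[Hashable],
) -> Tuple[Dict[Hashable, int], List[Hashable]]:
    """Assign each distinct original ID a compact integer in first-seen order."""
    id_to_index: Dict[Hashable, int] = {}
    for original_id in original_ids:
        if original_id not in id_to_index:
            id_to_index[original_id] = len(id_to_index)
    # Dicts preserve insertion order, so position i holds the ID mapped to i.
    index_to_id = list(id_to_index)
    return id_to_index, index_to_id


def restore_node_ids(
    outputs: Dict[int, List[float]], index_to_id: List[Hashable]
) -> Dict[Hashable, List[float]]:
    """Re-key per-node outputs from compact integers back to original IDs."""
    return {index_to_id[index]: value for index, value in outputs.items()}


# Hypothetical raw IDs; duplicates map to the same integer.
id_to_index, index_to_id = enumerate_node_ids(["user_42", "item_7", "user_42", "user_9"])

# Hypothetical embeddings keyed by compact integer ID.
embeddings = {0: [0.1, 0.2], 2: [0.3, 0.4]}
restored = restore_node_ids(embeddings, index_to_id)
```

Compact integer IDs let downstream stages index nodes directly, and keeping the inverse mapping is what makes the final restoration step possible.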

Component Diagram

Below is a high-level system overview. Both training and inference are backed by the same in-memory subgraph sampling engine.

System overview

Source Entry Points

Deprecated Tabularized Architecture

If you are maintaining an older deployment that still relies on precomputed sampled subgraphs, see Deprecated tabularized docs.

That flow is:

Config Populator -> Data Preprocessor -> Subgraph Sampler -> Split Generator -> Trainer -> Inferencer