gigl.distributed.sampler_options
Sampler option types for configuring which sampler class to use in distributed loading.
Provides KHopNeighborSamplerOptions for using GiGL’s built-in DistNeighborSampler,
and PPRSamplerOptions for PPR-based sampling using DistPPRNeighborSampler.
The option types are frozen dataclasses, so they are safe to pickle across RPC boundaries (required for Graph Store mode).
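As a minimal sketch of why frozen dataclasses suit this role, the snippet below round-trips an options object through pickle and shows that attribute assignment is rejected. `ExamplePPROptions` and its default values are hypothetical stand-ins for illustration only; they are not GiGL's classes or defaults.

```python
import dataclasses
import pickle
from dataclasses import dataclass


@dataclass(frozen=True)
class ExamplePPROptions:
    """Hypothetical stand-in for an options class (not part of GiGL)."""

    alpha: float = 0.2  # illustrative default, not taken from GiGL
    eps: float = 1e-5   # illustrative default, not taken from GiGL


options = ExamplePPROptions(alpha=0.15)

# Round-trip through pickle, as would happen when crossing an RPC boundary.
restored = pickle.loads(pickle.dumps(options))

# Frozen dataclasses reject attribute assignment after construction.
try:
    options.alpha = 0.5
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False

# To change a value, build a new instance instead of mutating in place.
updated = dataclasses.replace(options, alpha=0.25)
```

Because instances cannot be modified after construction, a copy sent to a remote worker cannot silently diverge from the sender's copy through in-place edits.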
Attributes
Classes
- KHopNeighborSamplerOptions: Default sampler options using GiGL's DistNeighborSampler.
- PPRSamplerOptions: Sampler options for PPR-based neighbor sampling using DistPPRNeighborSampler.
Functions
- resolve_sampler_options(num_neighbors, sampler_options): Resolve sampler_options from user-provided values.
Module Contents
- class gigl.distributed.sampler_options.KHopNeighborSamplerOptions
Default sampler options using GiGL’s DistNeighborSampler.
- class gigl.distributed.sampler_options.PPRSamplerOptions
Sampler options for PPR-based neighbor sampling using DistPPRNeighborSampler.
Output format: When this sampler is active, each output Data/HeteroData batch contains only PPR edges; no message-passing edges from the original graph are included. For each (seed_type, neighbor_type) pair reachable via PPR walks, the batch will have an edge type (seed_type, "ppr", neighbor_type) with:
  - edge_index: [2, N] int64. Row 0 holds local seed indices, row 1 holds local neighbor indices.
  - edge_attr: [N] float. PPR score for each (seed, neighbor) pair.
For homogeneous graphs these live directly on data.edge_index / data.edge_attr.
- alpha
Restart probability (teleport probability back to seed). Higher values keep samples closer to seeds. Typical values: 0.15-0.25.
- eps
Convergence threshold for the Forward Push algorithm. Smaller values give more accurate PPR scores but require more computation. Typical values: 1e-4 to 1e-6.
- num_neighbors_per_hop
Maximum number of neighbors fetched per node per edge type during PPR traversal. Set this to a large value to approximate fetching all neighbors.
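To illustrate how alpha, eps, and a per-node neighbor cap interact, here is a textbook Forward Push sketch on a small in-memory homogeneous graph. It is not DistPPRNeighborSampler's implementation. The function name `forward_push_ppr`, the adjacency-dict input, the `residual > eps * degree` push criterion, and capping by truncating the neighbor list are all assumptions made for this example; the source does not say how GiGL selects the capped neighbors.

```python
from collections import deque


def forward_push_ppr(adj, seed, alpha=0.15, eps=1e-4, max_neighbors=None):
    """Approximate PPR scores for one seed (illustrative sketch, not GiGL code).

    adj maps node -> list of neighbors. max_neighbors plays the role of a
    per-node neighbor cap (assumed here to be simple truncation).
    """

    def neighbors_of(node):
        nbrs = adj.get(node, [])
        return nbrs if max_neighbors is None else nbrs[:max_neighbors]

    scores = {}             # settled PPR estimates
    residual = {seed: 1.0}  # probability mass not yet pushed
    queue = deque([seed])
    queued = {seed}

    while queue:
        u = queue.popleft()
        queued.discard(u)
        nbrs = neighbors_of(u)
        mass = residual.get(u, 0.0)
        if not nbrs:
            # Dangling node: nowhere to push, so settle all of its mass.
            scores[u] = scores.get(u, 0.0) + mass
            residual[u] = 0.0
            continue
        if mass <= eps * len(nbrs):
            continue  # below the convergence threshold
        # Keep an alpha fraction at u, spread the rest over its neighbors.
        scores[u] = scores.get(u, 0.0) + alpha * mass
        residual[u] = 0.0
        share = (1.0 - alpha) * mass / len(nbrs)
        for v in nbrs:
            residual[v] = residual.get(v, 0.0) + share
            threshold = eps * max(len(neighbors_of(v)), 1)
            if v not in queued and residual[v] > threshold:
                queue.append(v)
                queued.add(v)
    return scores


# Small undirected example graph (hypothetical data).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
low_alpha = forward_push_ppr(graph, seed=0, alpha=0.15, eps=1e-6)
high_alpha = forward_push_ppr(graph, seed=0, alpha=0.5, eps=1e-6)
capped = forward_push_ppr(graph, seed=0, max_neighbors=1)
```

In this sketch a larger alpha leaves more score on the seed itself, a smaller eps leaves less unpushed residual (so the scores sum closer to 1), and a small neighbor cap restricts which nodes the walk can reach at all.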
- gigl.distributed.sampler_options.resolve_sampler_options(num_neighbors, sampler_options)
Resolve sampler_options from user-provided values.
If sampler_options is a PPRSamplerOptions, returns it directly (num_neighbors is unused for PPR). If sampler_options is None, wraps num_neighbors in a KHopNeighborSamplerOptions. If a KHopNeighborSamplerOptions is provided, validates that its num_neighbors matches the explicit value.
- Parameters:
num_neighbors (Union[list[int], dict[graphlearn_torch.typing.EdgeType, list[int]]]) – Fanout per hop (required for KHop; ignored for PPR).
sampler_options (Optional[SamplerOptions]) – Sampler configuration, or None.
- Returns:
The resolved SamplerOptions.
- Raises:
ValueError – If KHopNeighborSamplerOptions.num_neighbors conflicts with the explicit num_neighbors.
- Return type:
SamplerOptions
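The resolution rules described above can be sketched roughly as follows. The classes `KHopOptions` and `PPROptions`, the function `resolve_options`, and all default values are hypothetical stand-ins so that the snippet runs without GiGL installed; the real function may differ in its details and error messages.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical stand-ins for illustration; these are not GiGL's definitions.
EdgeType = tuple[str, str, str]
NumNeighbors = Union[list[int], dict[EdgeType, list[int]]]


@dataclass(frozen=True)
class KHopOptions:
    num_neighbors: NumNeighbors


@dataclass(frozen=True)
class PPROptions:
    alpha: float = 0.2                 # illustrative default
    eps: float = 1e-5                  # illustrative default
    num_neighbors_per_hop: int = 1000  # illustrative default


Options = Union[KHopOptions, PPROptions]


def resolve_options(
    num_neighbors: NumNeighbors, sampler_options: Optional[Options]
) -> Options:
    """Mirror the documented resolution rules (sketch only)."""
    if isinstance(sampler_options, PPROptions):
        # PPR ignores num_neighbors, so the options pass through unchanged.
        return sampler_options
    if sampler_options is None:
        # No options given: fall back to k-hop sampling with this fanout.
        return KHopOptions(num_neighbors=num_neighbors)
    if sampler_options.num_neighbors != num_neighbors:
        raise ValueError(
            "sampler_options.num_neighbors conflicts with the explicit "
            f"num_neighbors: {sampler_options.num_neighbors} != {num_neighbors}"
        )
    return sampler_options


default_khop = resolve_options([10, 5], None)
ppr = PPROptions(alpha=0.15)
resolved_ppr = resolve_options([10, 5], ppr)
```

Passing a k-hop options object whose fanout disagrees with the explicit argument raises ValueError, which surfaces conflicting configuration early instead of silently preferring one value.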