gigl.utils.sampling

Attributes

logger

Classes

ABLPInputNodes

Represents ABLP (Anchor Based Link Prediction) input for a single storage server.

Functions

parse_fanout(fanout_str)

Parses fanout from a string. The fanout string should be equivalent to a str(list[int]) or a str(dict[tuple[str, str, str], list[int]]).

Module Contents

class gigl.utils.sampling.ABLPInputNodes

Represents ABLP (Anchor Based Link Prediction) input for a single storage server.

This dataclass encapsulates all the information needed for ABLP sampling from a single storage node in the graph store distributed setup.

Fields:

anchor_nodes: 1D tensor of anchor node IDs to sample around.

labels: Dict mapping a supervision EdgeType to a tuple of (positive_labels, negative_labels). The positive labels tensor is a 2D tensor [N, M] of positive label node IDs, where N is the number of anchor nodes and M is the number of positive labels per anchor. The negative labels tensor is an optional 2D tensor [N, K] of negative label node IDs (None if no negative labels are available). The EdgeType is the supervision (message-passing) edge type (e.g. ("user", "to", "item")).

anchor_node_type: The node type of the anchor nodes (e.g. "user"). Should be set for heterogeneous graphs. Should be set to DEFAULT_HOMOGENEOUS_NODE_TYPE for labeled homogeneous graphs.

Example

For a user->item link prediction task with 3 anchor users:

ABLPInputNodes(
    anchor_nodes=tensor([0, 1, 2]),
    labels={
        ("user", "to", "item"): (
            tensor([[10, 11], [12, 13], [14, 15]]),
            tensor([[20], [21], [22]]),
        )
    },
    anchor_node_type="user",
)

For a labeled homogeneous graph (no negative labels):

ABLPInputNodes(
    anchor_nodes=tensor([0, 1, 2]),
    labels={
        ("__n__", "to", "__n__"): (
            tensor([[10, 11], [12, 13], [14, 15]]),
            None,
        )
    },
)

anchor_node_type: gigl.src.common.types.graph_data.NodeType
anchor_nodes: torch.Tensor
labels: dict[gigl.src.common.types.graph_data.EdgeType, tuple[torch.Tensor, torch.Tensor | None]]
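
The shape contract described in the fields above (one row of positive labels per anchor, and optionally one row of negative labels per anchor) can be checked with a short sketch. This is an illustration only: check_ablp_shapes is a hypothetical helper that is not part of gigl, and it uses nested Python lists as stand-ins for 2D tensors so that it runs without torch.

```python
from typing import Optional

# Stand-ins for the real types (assumptions for this sketch only):
# a "matrix" is a list of rows, and an edge type is a plain 3-tuple.
Matrix = list[list[int]]
EdgeType = tuple[str, str, str]


def _row_width(matrix: Matrix, expected_rows: int) -> int:
    """Return the common row length after checking the row count."""
    if len(matrix) != expected_rows:
        raise ValueError(
            f"expected {expected_rows} rows (one per anchor), got {len(matrix)}"
        )
    row_lengths = {len(row) for row in matrix}
    if len(row_lengths) > 1:
        raise ValueError("all rows must have the same number of labels")
    return row_lengths.pop() if row_lengths else 0


def check_ablp_shapes(
    anchor_nodes: list[int],
    labels: dict[EdgeType, tuple[Matrix, Optional[Matrix]]],
) -> dict[EdgeType, tuple[int, Optional[int]]]:
    """Hypothetical helper: return (M, K) per edge type, K=None without negatives."""
    n = len(anchor_nodes)
    widths: dict[EdgeType, tuple[int, Optional[int]]] = {}
    for edge_type, (positive, negative) in labels.items():
        m = _row_width(positive, n)
        k = None if negative is None else _row_width(negative, n)
        widths[edge_type] = (m, k)
    return widths


shapes = check_ablp_shapes(
    [0, 1, 2],
    {
        ("user", "to", "item"): (
            [[10, 11], [12, 13], [14, 15]],
            [[20], [21], [22]],
        )
    },
)
print(shapes)
```

With the user->item example from the docstring, N is 3, M is 2, and K is 1. A labels entry whose row count differs from the number of anchors raises ValueError in this sketch.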
gigl.utils.sampling.parse_fanout(fanout_str)

Parses fanout from a string. The fanout string should be equivalent to a str(list[int]) or a str(dict[tuple[str, str, str], list[int]]), where each item in the tuple corresponds to the source node type, relation, and destination node type, respectively.

For example, to parse a list[int], one could provide a fanout_str such as

'[10, 15, 20]'

To parse a dict[EdgeType, list[int]], one could provide a fanout_str such as

'{("user", "to", "user"): [10, 10], ("user", "to", "item"): [20, 20]}'

Parameters:

fanout_str (str) – Provided string to be parsed into fanout

Returns:

Either a list of fanouts per hop or a dictionary of edge types to their respective fanouts per hop

Return type:

Union[list[int], dict[EdgeType, list[int]]]
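
To make the two accepted string forms concrete, here is a minimal sketch of how such a string could be parsed with the standard library's ast.literal_eval. It is not the gigl implementation: parse_fanout_sketch is a hypothetical name, and it returns plain (str, str, str) tuples as keys, whereas the real function is documented to return EdgeType keys.

```python
import ast
from typing import Union

# Plain-tuple stand-in for an edge type (an assumption of this sketch).
EdgeTuple = tuple[str, str, str]
Fanout = Union[list[int], dict[EdgeTuple, list[int]]]


def _is_hop_list(value: object) -> bool:
    """True if value is a list of ints (bools excluded), one entry per hop."""
    return isinstance(value, list) and all(
        isinstance(v, int) and not isinstance(v, bool) for v in value
    )


def _is_edge_tuple(key: object) -> bool:
    """True if key looks like (source node type, relation, destination node type)."""
    return (
        isinstance(key, tuple)
        and len(key) == 3
        and all(isinstance(part, str) for part in key)
    )


def parse_fanout_sketch(fanout_str: str) -> Fanout:
    """Hypothetical sketch: parse a str(list[int]) or str(dict[tuple, list[int]])."""
    # literal_eval only evaluates Python literals, so arbitrary code is rejected.
    parsed = ast.literal_eval(fanout_str)
    if _is_hop_list(parsed):
        return parsed
    if isinstance(parsed, dict) and all(
        _is_edge_tuple(k) and _is_hop_list(v) for k, v in parsed.items()
    ):
        return parsed
    raise ValueError(f"not a valid fanout string: {fanout_str!r}")


homogeneous = parse_fanout_sketch("[10, 15, 20]")
heterogeneous = parse_fanout_sketch(
    '{("user", "to", "user"): [10, 10], ("user", "to", "item"): [20, 20]}'
)
print(homogeneous)
print(heterogeneous)
```

Note that the string must use straight ASCII quotes. Typographic quotes copied from rendered documentation are not valid Python literals and would be rejected by ast.literal_eval.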

gigl.utils.sampling.logger