gigl.distributed.graph_store.storage_main#

Built-in GiGL Graph Store Server.

Derived from alibaba/graphlearn-for-pytorch

TODO(kmonte): Remove this, and only expose utils. We keep this around so we can use the utils in tests/integration/distributed/graph_store/graph_store_integration_test.py.

Attributes#

logger

parser

Functions#

storage_node_process(storage_rank, cluster_info, ...)

Run a storage node process.

Module Contents#

gigl.distributed.graph_store.storage_main.storage_node_process(storage_rank, cluster_info, task_config_uri, sample_edge_direction, splitter=None, tf_record_uri_pattern='.*-of-.*\\.tfrecord(\\.gz)?$', ssl_positive_label_percentage=None, storage_world_backend=None, timeout_seconds=None)[source]#

Run a storage node process.

Should be called once per storage node (machine).

Parameters:
  • storage_rank (int) – The rank of the storage node.

  • cluster_info (GraphStoreInfo) – The cluster information.

  • task_config_uri (Uri) – The task config URI.

  • sample_edge_direction (Literal["in", "out"]) – The sample edge direction.

  • splitter (Optional[Union[DistNodeAnchorLinkSplitter, DistNodeSplitter]]) – The splitter to use. If None, will not split the dataset.

  • tf_record_uri_pattern (str) – The regex pattern used to match TFRecord file URIs.

  • ssl_positive_label_percentage (Optional[float]) – The percentage of edges to select as self-supervised labels. Must be None if supervised edge labels are provided in advance. If 0.1 is provided, 10% of the edges will be selected as self-supervised labels.

  • storage_world_backend (Optional[str]) – The backend for the storage Torch Distributed process group.

  • timeout_seconds (Optional[float]) – The timeout, in seconds, for the storage node process.

Return type:

None
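To make the ssl_positive_label_percentage semantics concrete, here is a minimal, self-contained sketch of the documented behavior (selecting a fraction of edges as self-supervised positive labels). select_ssl_labels is a hypothetical helper written for illustration only; it is not gigl's implementation.

```python
import random


def select_ssl_labels(edges, positive_label_percentage, seed=0):
    """Illustrative only: pick a fraction of edges to serve as
    self-supervised positive labels, mirroring the documented
    semantics of ``ssl_positive_label_percentage``."""
    rng = random.Random(seed)
    num_labels = int(len(edges) * positive_label_percentage)
    return rng.sample(edges, num_labels)


# 100 hypothetical edges; 0.1 means 10% become self-supervised labels.
edges = [(src, dst) for src in range(10) for dst in range(10)]
labels = select_ssl_labels(edges, 0.1)
print(len(labels))  # -> 10
```

As the docstring notes, this parameter must be left as None when supervised edge labels are already provided.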

gigl.distributed.graph_store.storage_main.logger[source]#
gigl.distributed.graph_store.storage_main.parser[source]#
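The docstring states that storage_node_process should be called once per storage node (machine), with each node receiving its own storage_rank. The sketch below mimics that launch pattern locally, using a hypothetical stub (run_storage_node) in place of the real function, which requires a live cluster, task config URI, and so on; in a real deployment each rank would run on its own machine rather than in one pool.

```python
from concurrent.futures import ThreadPoolExecutor


def run_storage_node(storage_rank, num_storage_nodes):
    # Hypothetical stand-in for storage_node_process(storage_rank,
    # cluster_info, task_config_uri, ...); here we only report readiness.
    return f"storage node {storage_rank}/{num_storage_nodes} ready"


num_storage_nodes = 4  # hypothetical cluster size
with ThreadPoolExecutor(max_workers=num_storage_nodes) as pool:
    # One invocation per storage node, ranks 0..num_storage_nodes-1.
    statuses = list(pool.map(
        run_storage_node,
        range(num_storage_nodes),
        [num_storage_nodes] * num_storage_nodes,
    ))
print(statuses[0])  # -> storage node 0/4 ready
```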