gigl.distributed.graph_store.storage_main#
Built-in GiGL Graph Store Server.
Derived from alibaba/graphlearn-for-pytorch.
TODO(kmonte): Remove this, and only expose utils. We keep this around so we can use the utils in tests/integration/distributed/graph_store/graph_store_integration_test.py.
Attributes#
Functions#
| storage_node_process(storage_rank, cluster_info, task_config_uri, sample_edge_direction[, ...]) | Run a storage node process |
Module Contents#
- gigl.distributed.graph_store.storage_main.storage_node_process(storage_rank, cluster_info, task_config_uri, sample_edge_direction, splitter=None, tf_record_uri_pattern='.*-of-.*\\.tfrecord(\\.gz)?$', ssl_positive_label_percentage=None, storage_world_backend=None, timeout_seconds=None)[source]#
Run a storage node process.
Should be called once per storage node (machine).
- Parameters:
storage_rank (int) – The rank of the storage node.
cluster_info (GraphStoreInfo) – The cluster information.
task_config_uri (Uri) – The task config URI.
sample_edge_direction (Literal["in", "out"]) – The sample edge direction.
splitter (Optional[Union[DistNodeAnchorLinkSplitter, DistNodeSplitter]]) – The splitter to use. If None, will not split the dataset.
tf_record_uri_pattern (str) – The regex pattern used to match TFRecord file URIs.
ssl_positive_label_percentage (Optional[float]) – The percentage of edges to select as self-supervised labels. Must be None if supervised edge labels are provided in advance. If 0.1 is provided, 10% of the edges will be selected as self-supervised labels.
storage_world_backend (Optional[str]) – The backend for the storage Torch Distributed process group.
timeout_seconds (Optional[float]) – The timeout, in seconds, for the storage node process.
- Return type:
None
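A minimal usage sketch is shown below. Only storage_node_process and its parameters come from the signature above; the import paths for GraphStoreInfo and UriFactory, the GraphStoreInfo constructor arguments, and the task config URI are assumptions for illustration and may differ in your GiGL version.

```python
# Minimal sketch: launch the storage node process on one machine.
# NOTE: the GraphStoreInfo and UriFactory import paths and the task config URI
# below are assumptions for illustration only.
from gigl.distributed.graph_store.storage_main import storage_node_process
from gigl.distributed.graph_store import GraphStoreInfo  # assumed import path
from gigl.common import UriFactory  # assumed import path

# Describe the cluster topology (storage and compute nodes). Constructor
# arguments are omitted here; fill them in from your deployment.
cluster_info = GraphStoreInfo(...)

storage_node_process(
    storage_rank=0,  # this machine's rank among the storage nodes
    cluster_info=cluster_info,
    task_config_uri=UriFactory.create_uri("gs://my-bucket/task_config.yaml"),  # hypothetical URI
    sample_edge_direction="in",
    splitter=None,                 # do not split the dataset
    storage_world_backend="gloo",  # backend for the storage process group
    timeout_seconds=3600.0,
)
```

In a multi-machine deployment, each storage machine would invoke this once with its own storage_rank, per the note above that it should be called once per storage node.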