gigl.experimental.knowledge_graph_embedding.common.distributed#

Attributes#

logger

Functions#

set_process_env_vars_for_torch_dist(...[, port])

This function sets the environment variables required for distributed training with PyTorch.

Module Contents#

gigl.experimental.knowledge_graph_embedding.common.distributed.set_process_env_vars_for_torch_dist(process_number_on_current_machine, num_processes_on_current_machine, machine_context, port=29500)[source]#

This function sets the environment variables required for distributed training with PyTorch. It assumes a multi-machine setup where each machine runs a number of processes. The number of machines and the rendezvous configuration are determined by the machine_context provided.

Parameters:
  • process_number_on_current_machine (int) – The process number on the current machine.

  • num_processes_on_current_machine (int) – The total number of processes on the current machine.

  • machine_context (DistributedContext) – The context containing information about the distributed setup.

  • port (int) – The port used for the torch.distributed rendezvous. Defaults to 29500.

Returns:

A tuple containing:
  • local_rank (int): The local rank of the process on the current machine.

  • rank (int): The global rank of the process across all machines.

  • local_world_size (int): The number of processes on the current machine.

  • world_size (int): The total number of processes across all machines.

Return type:

Tuple[int, int, int, int]
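
The sketch below illustrates one way the returned tuple might be used to initialize a PyTorch process group; it is not part of the module. The argument values, the gloo backend, and the machine_context placeholder (assumed to be a DistributedContext built elsewhere) are assumptions for illustration only.

    import torch.distributed as dist

    from gigl.experimental.knowledge_graph_embedding.common.distributed import (
        set_process_env_vars_for_torch_dist,
    )

    # machine_context is assumed to be a DistributedContext describing the
    # cluster (built by the caller's launcher); it is a placeholder here.
    local_rank, rank, local_world_size, world_size = set_process_env_vars_for_torch_dist(
        process_number_on_current_machine=0,
        num_processes_on_current_machine=2,
        machine_context=machine_context,
        port=29500,
    )

    # The function has set the environment variables required by
    # torch.distributed, so the process group can be initialized with the
    # default env:// rendezvous.
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)
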

gigl.experimental.knowledge_graph_embedding.common.distributed.logger[source]#