gigl.experimental.knowledge_graph_embedding.lib.config.training#
Classes#

| Class | Description |
|---|---|
| CheckpointingConfig | Configuration for model checkpointing during training. |
| DistributedConfig | Configuration for distributed training across multiple GPUs or processes. |
| EarlyStoppingConfig | Configuration for early stopping based on validation performance. |
| LoggingConfig | Configuration for training progress logging. |
| OptimizerConfig | Configuration for separate optimizers for sparse and dense parameters. |
| OptimizerParamsConfig | Configuration for optimizer hyperparameters. |
| TrainConfig | Main training configuration that orchestrates all training-related settings. |
Module Contents#
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.CheckpointingConfig[source]#
Configuration for model checkpointing during training.
- save_every[source]#
Save a checkpoint every N training steps. Allows recovery from failures and monitoring of training progress. Defaults to 10,000 steps.
- Type:
int
- should_save_async[source]#
Whether to save checkpoints asynchronously to avoid blocking training. Improves training efficiency but may use additional memory. Defaults to True.
- Type:
bool
- load_from_path[source]#
Path to a checkpoint file to resume training from. If None, training starts from scratch. Defaults to None.
- Type:
Optional[str]
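A minimal usage sketch (not from the library's own docs), assuming CheckpointingConfig is a dataclass that accepts its documented fields as keyword arguments; the checkpoint path below is hypothetical:

```python
from gigl.experimental.knowledge_graph_embedding.lib.config.training import (
    CheckpointingConfig,
)

# Checkpoint every 5,000 steps, save synchronously (blocking training),
# and resume from a prior run; the path is purely illustrative.
ckpt_cfg = CheckpointingConfig(
    save_every=5_000,
    should_save_async=False,
    load_from_path="/tmp/kge_checkpoints/step_100000",
)
```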
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.DistributedConfig[source]#
Configuration for distributed training across multiple GPUs or processes.
- num_processes_per_machine[source]#
Number of training processes to spawn per machine. Each process typically uses one GPU. Defaults to torch.cuda.device_count() if CUDA is available, otherwise 1.
- Type:
int
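A sketch of what the documented default amounts to (again assuming keyword construction); torch.cuda.device_count() is the same call the default uses:

```python
import torch

from gigl.experimental.knowledge_graph_embedding.lib.config.training import (
    DistributedConfig,
)

# Mirror the documented default: one training process per visible GPU,
# falling back to a single process on CPU-only machines.
n_procs = torch.cuda.device_count() if torch.cuda.is_available() else 1
dist_cfg = DistributedConfig(num_processes_per_machine=n_procs)
```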
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.EarlyStoppingConfig[source]#
Configuration for early stopping based on validation performance.
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.LoggingConfig[source]#
Configuration for training progress logging.
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.OptimizerConfig[source]#
Configuration for separate optimizers for sparse and dense parameters.
Knowledge graph embedding models typically have both sparse embeddings (updated only for nodes/edges in each batch) and dense parameters (updated every batch). Different learning rates are often beneficial for these parameter types.
- sparse[source]#
Optimizer parameters for sparse node embeddings. Defaults to OptimizerParamsConfig(lr=0.01, weight_decay=0.001).
- Type:
OptimizerParamsConfig
- dense[source]#
Optimizer parameters for dense model parameters (linear layers, etc.). Defaults to OptimizerParamsConfig(lr=0.01, weight_decay=0.001).
- Type:
OptimizerParamsConfig
- sparse: OptimizerParamsConfig[source]#
- dense: OptimizerParamsConfig[source]#
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.OptimizerParamsConfig[source]#
Configuration for optimizer hyperparameters.
- lr[source]#
Learning rate for the optimizer. Controls the step size during gradient descent. Higher values lead to faster convergence but may overshoot the minimum. Defaults to 0.001.
- Type:
float
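To make the sparse/dense split concrete, here is a hedged sketch (assuming both classes accept their documented fields as keyword arguments) that keeps the documented sparse defaults and lowers the dense learning rate; the override value is illustrative:

```python
from gigl.experimental.knowledge_graph_embedding.lib.config.training import (
    OptimizerConfig,
    OptimizerParamsConfig,
)

# Sparse embedding rows receive gradients only when their nodes appear in a
# batch, so they often tolerate a larger step size than the dense layers,
# which are updated on every batch.
opt_cfg = OptimizerConfig(
    sparse=OptimizerParamsConfig(lr=0.01, weight_decay=0.001),  # documented default
    dense=OptimizerParamsConfig(lr=0.001, weight_decay=0.001),  # illustrative override
)
```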
- class gigl.experimental.knowledge_graph_embedding.lib.config.training.TrainConfig[source]#
Main training configuration that orchestrates all training-related settings.
This configuration combines optimization, data loading, distributed training, checkpointing, and monitoring settings for knowledge graph embedding training.
- max_steps[source]#
Maximum number of training steps to perform. If None, training continues until early stopping or manual interruption. Defaults to None.
- Type:
Optional[int]
- early_stopping[source]#
Configuration for early stopping based on validation metrics. Defaults to EarlyStoppingConfig() with no patience limit.
- Type:
EarlyStoppingConfig
- dataloader[source]#
Configuration for data loading (number of workers, memory pinning). Defaults to DataloaderConfig() with standard settings.
- Type:
DataloaderConfig
- sampling[source]#
Configuration for negative sampling strategy during training. Defaults to SamplingConfig() with standard settings.
- Type:
SamplingConfig
- optimizer[source]#
Configuration for separate sparse and dense optimizers. Defaults to OptimizerConfig() with standard settings.
- Type:
OptimizerConfig
- distributed[source]#
Configuration for multi-GPU/multi-process training. Defaults to DistributedConfig() with an auto-detected GPU count.
- Type:
DistributedConfig
- checkpointing[source]#
Configuration for saving and loading model checkpoints. Defaults to CheckpointingConfig() with standard settings.
- Type:
CheckpointingConfig
- logging[source]#
Configuration for training progress logging frequency. Defaults to LoggingConfig(), which logs every step.
- Type:
LoggingConfig
- checkpointing: CheckpointingConfig[source]#
- dataloader: gigl.experimental.knowledge_graph_embedding.lib.config.dataloader.DataloaderConfig[source]#
- distributed: DistributedConfig[source]#
- early_stopping: EarlyStoppingConfig[source]#
- logging: LoggingConfig[source]#
- optimizer: OptimizerConfig[source]#
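Putting the pieces together, a hedged end-to-end sketch (assuming TrainConfig and its sub-configs are dataclasses constructible with keyword arguments; all values are illustrative) that overrides a few fields and leaves the rest at their documented defaults:

```python
from gigl.experimental.knowledge_graph_embedding.lib.config.training import (
    CheckpointingConfig,
    OptimizerConfig,
    OptimizerParamsConfig,
    TrainConfig,
)

# Cap training at 200k steps and checkpoint every 10k; dataloader, sampling,
# distributed, early-stopping, and logging settings all fall back to their
# documented defaults.
train_cfg = TrainConfig(
    max_steps=200_000,
    optimizer=OptimizerConfig(
        sparse=OptimizerParamsConfig(lr=0.01, weight_decay=0.001),
        dense=OptimizerParamsConfig(lr=0.001, weight_decay=0.001),
    ),
    checkpointing=CheckpointingConfig(save_every=10_000),
)
```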