gigl.src.common.models.layers.feature_interaction
Classes
| DCNCross | Derived from the tensorflow_recommenders [implementation](https://www.tensorflow.org/recommenders/api_docs/python/tfrs/layers/dcn/Cross). |
| DCNv2 | Wraps around DCNCross for multi-layer feature crossing. See documentation for DCNCross for more details. |
Module Contents
- class gigl.src.common.models.layers.feature_interaction.DCNCross(in_dim, projection_dim=None, diag_scale=0.0, use_bias=True, **kwargs)
- Bases: torch.nn.Module
- Cross layer from the Deep & Cross Network, used to learn explicit feature interactions. Derived from the tensorflow_recommenders [implementation](https://www.tensorflow.org/recommenders/api_docs/python/tfrs/layers/dcn/Cross).
- The layer creates explicit, bounded-degree feature interactions efficiently. It takes two input tensors. The first input x0 is the base layer that contains the original features (usually the embedding layer); the second input xi is the output of the previous DCNCross layer in the stack, i.e., the input to the i-th DCNCross layer. For the first DCNCross layer in the stack, xi = x0.
- The output is x_{i+1} = x0 .* (W * xi + bias + diag_scale * xi) + xi, where .* denotes elementwise multiplication. W can be a full-rank matrix or a low-rank matrix U*V that reduces the computational cost. diag_scale increases the diagonal of W to improve training stability (especially in the low-rank case).
- References:
- [R. Wang et al.](https://arxiv.org/pdf/2008.13535.pdf) See Eq. (1) for the full-rank version and Eq. (2) for the low-rank version.
- [R. Wang et al.](https://arxiv.org/pdf/1708.05123.pdf)
- Parameters:
- in_dim (int) – The input feature dimension. 
- projection_dim (Optional[int]) – Projection dimension used to reduce the computational cost. Defaults to None, in which case a full (in_dim by in_dim) matrix W is used. If set, a low-rank matrix W = U*V is used, where U is of size in_dim by projection_dim and V is of size projection_dim by in_dim. projection_dim needs to be smaller than in_dim/2 to improve the model efficiency. In practice, we’ve observed that projection_dim = in_dim/4 consistently preserved the accuracy of a full-rank version. 
- diag_scale (float) – A non-negative float used to increase the diagonal of the kernel W by diag_scale, that is, W + diag_scale * I, where I is an identity matrix. 
- use_bias (bool) – Whether to add a bias term for this layer. If set to False, no bias term will be used. 
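
The in_dim/2 threshold on projection_dim follows from counting kernel weights: a full-rank W holds in_dim * in_dim values, while U and V together hold 2 * in_dim * projection_dim. The sketch below only does that arithmetic; `cross_kernel_params` is a hypothetical helper written for illustration, not part of gigl, and it ignores the bias vector.

```python
def cross_kernel_params(in_dim, projection_dim=None):
    """Count kernel weights in one cross layer (bias excluded).

    Hypothetical helper for illustration; not part of the gigl API.
    """
    if projection_dim is None:
        # Full-rank kernel W of shape (in_dim, in_dim).
        return in_dim * in_dim
    # Low-rank kernel: U is (in_dim, projection_dim), V is (projection_dim, in_dim).
    return 2 * in_dim * projection_dim


in_dim = 128
full_rank = cross_kernel_params(in_dim)                # 128 * 128
quarter = cross_kernel_params(in_dim, in_dim // 4)     # 2 * 128 * 32
half = cross_kernel_params(in_dim, in_dim // 2)        # 2 * 128 * 64

print(full_rank, quarter, half)
```

With projection_dim = in_dim/4 the kernel is half the size of the full-rank one; at projection_dim = in_dim/2 the two counts are equal, so nothing is saved.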
 
 - Input shape:
- Two inputs, x0 and xi, each of shape (batch_size, in_dim). 
- Output shape:
- A single (batch_size, in_dim) dimensional output. 
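
As a worked illustration of the update rule x_{i+1} = x0 .* (W * xi + bias + diag_scale * xi) + xi, the sketch below applies one full-rank cross step to a single example using plain Python lists. It is not the library's implementation (which operates on batched torch tensors); `cross_step` and the identity kernel are assumptions chosen to keep the numbers easy to check by hand.

```python
def cross_step(x0, xi, W, bias, diag_scale=0.0):
    """One cross step for a single example (illustrative, not the gigl code).

    Computes x0 .* (W @ xi + bias + diag_scale * xi) + xi element by element.
    """
    d = len(x0)
    out = []
    for r in range(d):
        # Row r of W times xi, plus the bias and the diagonal boost.
        proj = sum(W[r][c] * xi[c] for c in range(d)) + bias[r] + diag_scale * xi[r]
        # Elementwise product with the base features, plus the residual xi.
        out.append(x0[r] * proj + xi[r])
    return out


x0 = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity kernel, an arbitrary choice for this example
bias = [0.0, 0.0]

# First layer in a stack: xi = x0.
x1 = cross_step(x0, x0, W, bias)
# Same step with diag_scale = 0.5, i.e. the kernel W + 0.5 * I.
x1_scaled = cross_step(x0, x0, W, bias, diag_scale=0.5)

print(x1, x1_scaled)
```

With the identity kernel each output element is x0[r] * x0[r] + x0[r], which makes the second-order term introduced by a single cross step visible directly.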
 - forward(x0, x=None)
- Computes the feature cross. If x is provided, the layer computes crosses between x0 and x; if not, it computes crosses between x0 and itself. Returns a tensor of crosses.
- Parameters:
- x0 (torch.Tensor) – The input tensor. 
- x (Optional[torch.Tensor]) – Optional second input tensor. 
 
- Return type:
- torch.Tensor 
 
 
- class gigl.src.common.models.layers.feature_interaction.DCNv2(in_dim, num_layers=1, projection_dim=None, diag_scale=0.0, use_bias=True, **kwargs)
- Bases: torch.nn.Module
- Wraps around DCNCross for multi-layer feature crossing. See documentation for DCNCross for more details.
- Parameters:
- in_dim (int) – The input feature dimension. 
- num_layers (int) – Number of feature crossing layers to use. K layers produce feature interactions of order up to K+1. 
- projection_dim (Optional[int]) – Projection dimension used to reduce the computational cost. Defaults to None, in which case a full (in_dim by in_dim) matrix W is used. If set, a low-rank matrix W = U*V is used, where U is of size in_dim by projection_dim and V is of size projection_dim by in_dim. projection_dim needs to be smaller than in_dim/2 to improve the model efficiency. In practice, we’ve observed that projection_dim = in_dim/4 consistently preserved the accuracy of a full-rank version. 
- diag_scale (float) – A non-negative float used to increase the diagonal of the kernel W by diag_scale, that is, W + diag_scale * I, where I is an identity matrix. 
- use_bias (bool) – Whether to add a bias term for this layer. If set to False, no bias term will be used. 
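
To see why K stacked cross layers can reach order K+1, the sketch below tracks the polynomial in x0 produced by repeated cross steps. It assumes a single scalar feature, kernel weight w = 1, no bias, and diag_scale = 0; `cross_poly` and `stack_cross` are hypothetical helpers for illustration and do not reflect how DCNv2 is implemented.

```python
def cross_poly(coeffs, w=1.0):
    """Apply one scalar cross step to a polynomial in x0.

    coeffs[k] is the coefficient of x0**k. The step is x0 * (w * xi) + xi.
    """
    shifted = [0.0] + [w * c for c in coeffs]  # x0 * (w * xi): every power goes up by one
    padded = coeffs + [0.0]                    # residual xi, padded to the same length
    return [a + b for a, b in zip(shifted, padded)]


def stack_cross(num_layers):
    """Return (degree, coefficients) after num_layers cross steps starting from x0."""
    poly = [0.0, 1.0]  # the polynomial "x0"
    for _ in range(num_layers):
        poly = cross_poly(poly)
    return len(poly) - 1, poly


for k in (1, 2, 3):
    degree, poly = stack_cross(k)
    print(k, degree, poly)
```

Each step multiplies the running output by x0 once, so the highest power grows by one per layer: one layer yields x0 + x0^2, and three layers yield a degree-4 polynomial.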
 
