PyG-SSL Training
=====================

Trainer
-------------------

.. class:: BaseTrainer(method: BaseMethod, data_loader: DataLoader, save_root: str = "./ckpt", device: Union[str, int] = "cpu")

    Base class for trainers of self-supervised learning methods.

    Parameters:

    - **method** (BaseMethod): the entire method, including encoders and other components (e.g. discriminators)
    - **data_loader** (DataLoader): the data loader for the training data
    - **save_root** (str): the root directory in which to save the model checkpoints
    - **device** (Union[str, int]): the device on which to run training

    .. method:: train()

        Train the model.

    .. method:: save(path: Optional[str] = None)

        Save the model.

    .. method:: load(path: Optional[str] = None)

        Load the model parameters.

.. class:: SimpleTrainer(method: BaseMethod, data_loader: DataLoader, lr: float = 0.001, weight_decay: float = 0.0, n_epochs: int = 10000, patience: int = 50, device: Union[str, int] = "cuda:0", save_root: str = "./ckpt", dataset=None)

    A simple trainer for self-supervised learning methods.

    Parameters:

    - **method** (BaseMethod): the entire method, including encoders and other components (e.g. discriminators)
    - **data_loader** (DataLoader): the data loader for the training data
    - **lr** (float): the learning rate for the optimizer
    - **weight_decay** (float): the weight decay for the optimizer
    - **n_epochs** (int): the number of epochs for training
    - **patience** (int): the patience for early stopping
    - **device** (Union[str, int]): the device on which to run training
    - **save_root** (str): the root directory in which to save the model checkpoints
    - **dataset** (Optional[Dataset]): the dataset for the training data

    .. method:: train()

        Train the model.

    .. method:: save(path: Optional[str] = None)

        Save the model.

    .. method:: load(path: Optional[str] = None)

        Load the model parameters.

.. class:: NonContrastTrainer(method: BaseMethod, data_loader: DataLoader, lr: float = 0.001, weight_decay: float = 0.0, n_epochs: int = 10000, patience: int = 50, device: Union[str, int] = "cuda:0", use_ema: bool = False, moving_average_decay: float = 0.9, save_root: str = "./ckpt", dataset=None)

    A trainer for self-supervised learning methods that are not contrastive (BGRL and AFGRL).

    Parameters:

    - **method** (BaseMethod): the entire method, including encoders and other components (e.g. discriminators)
    - **data_loader** (DataLoader): the data loader for the training data
    - **lr** (float): the learning rate for the optimizer
    - **weight_decay** (float): the weight decay for the optimizer
    - **n_epochs** (int): the number of epochs for training
    - **patience** (int): the patience for early stopping
    - **device** (Union[str, int]): the device on which to run training
    - **use_ema** (bool): whether to use an exponential moving average
    - **moving_average_decay** (float): the decay rate for the moving average
    - **save_root** (str): the root directory in which to save the model checkpoints
    - **dataset** (Optional[Dataset]): the dataset for the training data

    .. method:: train()

        Train the model.

    .. method:: save(path: Optional[str] = None)

        Save the model.

    .. method:: load(path: Optional[str] = None)

        Load the model parameters.

.. class:: GCATrainer(method: BaseMethod, data_loader: DataLoader, lr: float = 0.001, weight_decay: float = 0.0, n_epochs: int = 5000, patience: int = 50, drop_scheme: str = 'degree', dataset_name: str = 'WikiCS', device: Union[str, int] = "cuda:1", save_root: str = "./ckpt")

    A trainer for GCA.

    Parameters:

    - **method** (BaseMethod): the entire method, including encoders and other components (e.g. discriminators)
    - **data_loader** (DataLoader): the data loader for the training data
    - **lr** (float): the learning rate for the optimizer
    - **weight_decay** (float): the weight decay for the optimizer
    - **n_epochs** (int): the number of epochs for training
    - **patience** (int): the patience for early stopping
    - **drop_scheme** (str): the drop scheme for the data augmentation
    - **dataset_name** (str): the name of the dataset
    - **device** (Union[str, int]): the device on which to run training
    - **save_root** (str): the root directory in which to save the model checkpoints

    .. method:: train()

        Train the model.

    .. method:: save(path: Optional[str] = None)

        Save the model.

    .. method:: load(path: Optional[str] = None)

        Load the model parameters.

Loss
-------------------

.. class:: LocalGlobalLoss(in_channels, sim_function)

    Estimate the negative mutual information loss for the specified neural similarity function (sim_function) and the inputs x, y, x_ind, y_ind. The pair (x, y) is sampled from the joint distribution, (x, y)~p(x, y); x_ind and y_ind are sampled independently from their marginal distributions, x_ind~p(x) and y_ind~p(y). Reference: Deep Graph Infomax.

    Parameters:

    - **in_channels** (int): input channels to the neural mutual information estimator.
    - **sim_function** (Callable): the neural similarity measuring function. The default is the bilinear similarity function used by DGI.

    .. method:: forward(l_enc, g_enc, batch, measure)

        Compute the negative mutual information loss.

        Args:

        - **l_enc** (torch.Tensor): the local encoder output
        - **g_enc** (torch.Tensor): the global encoder output
        - **batch** (torch.Tensor): the batch of nodes
        - **measure** (str): the measure to compute the loss. The default is "local_global"

.. class:: SCELoss()

    .. method:: forward(x, y, alpha)

        Compute the supervised contrastive loss.

        Args:

        - **x** (torch.Tensor): the input tensor
        - **y** (torch.Tensor): the target tensor
        - **alpha** (float): the temperature parameter

.. class:: SIGLoss()

    .. method:: forward(x, y)

        Compute the supervised infomax loss.

        Args:

        - **x** (torch.Tensor): the input tensor
        - **y** (torch.Tensor): the target tensor

.. class:: NegativeMI(in_channels: Optional[int] = None, sim_func: Optional[torch.nn.Module] = None)

    Negative Mutual Information. Estimate the negative mutual information loss for the specified neural similarity function (sim_func) and the inputs x, y, x_ind, y_ind. The pair (x, y) is sampled from the joint distribution, (x, y)~p(x, y); x_ind and y_ind are sampled independently from their marginal distributions, x_ind~p(x) and y_ind~p(y). Reference: Deep Graph Infomax.

    Parameters:

    - **in_channels** (Optional[int]): input channels to the neural mutual information estimator.
    - **sim_func** (Optional[torch.nn.Module]): the neural similarity measuring function. The default is the bilinear similarity function used by DGI.

Evaluation
-------------------

.. class:: K_Means(k: int = 8, average_method: str = "arithmetic", init = "k-means++", n_init = 10, max_iter: int = 300, tol: float = 1e-4, verbose: int = 0, random_state = None, copy_x: bool = True, algorithm: str = "auto", n_run: int = 50, device: Union[str, int] = "cuda")

    K-Means clustering.

    Parameters:

    - **k** (int): the number of clusters
    - **average_method** (str): the method to compute the average of the clusters
    - **init** (str): the initialization method
    - **n_init** (int): the number of initializations
    - **max_iter** (int): the maximum number of iterations
    - **tol** (float): the tolerance
    - **verbose** (int): the verbosity
    - **random_state** (int): the random state
    - **copy_x** (bool): whether to copy the data
    - **algorithm** (str): the algorithm to use
    - **n_run** (int): the number of runs
    - **device** (Union[str, int]): the device on which to run the evaluation

    .. method:: __call__(embs, dataset)

        Evaluate the embeddings.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **dataset** (Dataset): the dataset

    .. method:: single_run(embs, labels)

        Evaluate the embeddings in a single run.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **labels** (torch.Tensor): the labels

.. class:: LogisticRegression(lr: float = 0.01, weight_decay: float = 0., max_iter: int = 100, n_run: int = 50, device: Union[str, int] = "cuda")

    Logistic Regression.

    Parameters:

    - **lr** (float): the learning rate
    - **weight_decay** (float): the weight decay
    - **max_iter** (int): the maximum number of iterations
    - **n_run** (int): the number of runs
    - **device** (Union[str, int]): the device on which to run the evaluation

    .. method:: __call__(embs, dataset)

        Evaluate the embeddings.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **dataset** (Dataset): the dataset

    .. method:: single_run(embs, labels, train_mask, val_mask, test_mask)

        Evaluate the embeddings in a single run.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **labels** (torch.Tensor): the labels
        - **train_mask** (torch.Tensor): the train mask
        - **val_mask** (torch.Tensor): the validation mask
        - **test_mask** (torch.Tensor): the test mask

.. class:: RandomForestClassifier(search: bool = False, n_estimators: int = 1, criterion: str = 'gini', max_depth: int = None, min_samples_split: int = 2, min_samples_leaf: int = 1, min_weight_fraction_leaf: float = 0.0, max_features: Union[int, float, str, None] = 'auto', max_leaf_nodes: int = None, min_impurity_decrease: float = 0.0, bootstrap: bool = True, obb_score: bool = False, n_jobs: int = 1, random_state: Union[int, None] = None, verbose: int = 0, warm_start: bool = False, class_weight: dict = None, n_run: int = 50, device: Union[str, int] = "cuda")

    The Random Forest Classifier.

    Parameters:

    - **search** (bool): whether to search for the best hyperparameters
    - **n_estimators** (int): the number of trees in the forest
    - **criterion** (str): the function to measure the quality of a split
    - **max_depth** (int): the maximum depth of the tree
    - **min_samples_split** (int): the minimum number of samples required to split an internal node
    - **min_samples_leaf** (int): the minimum number of samples required to be at a leaf node
    - **min_weight_fraction_leaf** (float): the minimum weighted fraction of the sum total of weights
    - **max_features** (Union[int, float, str, None]): the number of features to consider when looking for the best split
    - **max_leaf_nodes** (int): the maximum number of leaf nodes
    - **min_impurity_decrease** (float): the minimum impurity decrease required for a split
    - **bootstrap** (bool): whether bootstrap samples are used when building trees
    - **obb_score** (bool): whether to use out-of-bag samples to estimate the generalization accuracy
    - **n_jobs** (int): the number of jobs to run in parallel
    - **random_state** (Union[int, None]): the random state
    - **verbose** (int): the verbosity
    - **warm_start** (bool): whether to reuse the solution of the previous call to fit
    - **class_weight** (dict): the class weight
    - **n_run** (int): the number of runs
    - **device** (Union[str, int]): the device on which to run the evaluation

    .. method:: __call__(embs, dataset)

        Evaluate the embeddings.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **dataset** (Dataset): the dataset

    .. method:: single_run(embs, labels, train_mask, val_mask, test_mask)

        Evaluate the embeddings in a single run.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **labels** (torch.Tensor): the labels
        - **train_mask** (torch.Tensor): the train mask
        - **val_mask** (torch.Tensor): the validation mask
        - **test_mask** (torch.Tensor): the test mask

.. class:: SVCRegression(C: float = 1.0, search: bool = True, kernel: str = 'rbf', degree: int = 3, gamma: str = 'auto', coef0: float = 0.0, shrinking: bool = True, probability: bool = False, tol: float = 0.001, cache_size: int = 200, class_weight: dict = None, verbose: bool = False, max_iter: int = -1, decision_function_shape: str = 'ovr', random_state: int = None, n_run: int = 50, device: Union[str, int] = "cuda")

    The Support Vector Classifier.

    Parameters:

    - **C** (float): the regularization parameter
    - **search** (bool): whether to search for the best hyperparameters
    - **kernel** (str): the kernel type
    - **degree** (int): the degree of the polynomial kernel function
    - **gamma** (str): the kernel coefficient
    - **coef0** (float): the independent term in the kernel function
    - **shrinking** (bool): whether to use the shrinking heuristic
    - **probability** (bool): whether to enable probability estimates
    - **tol** (float): the tolerance
    - **cache_size** (int): the cache size
    - **class_weight** (dict): the class weight

    .. method:: __call__(embs, dataset)

        Evaluate the embeddings.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **dataset** (Dataset): the dataset

    .. method:: single_run(embs, labels, train_mask, val_mask, test_mask)

        Evaluate the embeddings in a single run.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **labels** (torch.Tensor): the labels
        - **train_mask** (torch.Tensor): the train mask
        - **val_mask** (torch.Tensor): the validation mask
        - **test_mask** (torch.Tensor): the test mask

.. class:: SimSearch(sim_list: list = [5, 10, 20, 50, 100], n_run: int = 50, device: Union[str, int] = "cuda")

    Similarity Search.

    Parameters:

    - **sim_list** (list): the list of similarity values
    - **n_run** (int): the number of runs
    - **device** (Union[str, int]): the device on which to run the evaluation

    .. method:: __call__(embs, dataset)

        Evaluate the embeddings.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **dataset** (Dataset): the dataset

    .. method:: single_run(embs, labels)

        Evaluate the embeddings in a single run.

        Args:

        - **embs** (torch.Tensor): the embeddings
        - **labels** (torch.Tensor): the labels
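
The entry for ``SimSearch`` does not spell out what is computed, so the following is only a sketch of one plausible reading: for each ``k`` in ``sim_list``, rank every other node by cosine similarity to the query node and report the average fraction of the top-``k`` neighbours that share the query's label. That interpretation, the helper names ``cosine`` and ``sim_search_scores``, and the use of plain Python lists in place of ``torch.Tensor`` are assumptions made for illustration; they are not taken from the PyG-SSL implementation.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def sim_search_scores(
    embs: Sequence[Sequence[float]],
    labels: Sequence[int],
    sim_list: Sequence[int],
) -> Dict[int, float]:
    """Hypothetical similarity-search metric (not the library's own code).

    For each k, returns the mean fraction of a node's k most similar
    nodes (by cosine similarity) that carry the same label as the node.
    """
    n = len(embs)
    if n < 2:
        raise ValueError("need at least two embeddings to search neighbours")

    # Rank the neighbours of every node once, then reuse the ranking for all k.
    rankings: List[List[int]] = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: cosine(embs[i], embs[j]), reverse=True)
        rankings.append(others)

    scores: Dict[int, float] = {}
    for k in sim_list:
        k_eff = min(k, n - 1)  # a node has at most n - 1 neighbours
        total = 0.0
        for i in range(n):
            same = sum(labels[j] == labels[i] for j in rankings[i][:k_eff])
            total += same / k_eff
        scores[k] = total / n
    return scores


# Toy usage: two well-separated groups of two nodes each.
toy_embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
print(sim_search_scores(toy_embs, toy_labels, sim_list=[1, 3]))
```

Ranking each node's neighbours once and slicing the ranking for every ``k`` avoids recomputing similarities per cutoff. With real embeddings a vectorised similarity matrix would be the more practical choice.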