finetuning (#1)

Reviewed-on: #1
Extended Phasenet finetuning options
This commit was merged in pull request #1.
This commit is contained in:
2024-07-29 13:41:05 +02:00
parent 5c3ce04868
commit e86f131cc0
11 changed files with 381 additions and 54 deletions
+21 -4
View File
@@ -111,7 +111,7 @@ After adjusting the grant name, the paths to conda env and the paths to data sen
The script performs the following steps:
1. Generates evaluation targets in `datasets/<dataset_name>/targets` directory.
1. Trains multiple versions of GPD, PhaseNet and ... models to find the best hyperparameters, producing the lowest validation loss.
1. Trains multiple versions of GPD, PhaseNet, BasicPhaseAE, and EQTransformer models to find the best hyperparameters, producing the lowest validation loss.
This step utilizes the Weights & Biases platform to perform the hyperparameters search (called sweeping) and track the training process and store the results.
The results are available at
@@ -126,14 +126,31 @@ After adjusting the grant name, the paths to conda env and the paths to data sen
The results are saved in the `scripts/pred/results.csv` file. They are additionally logged in Weights & Biases platform as summary metrics of corresponding runs.
<br/>
The default settings are saved in config.json file. To change the settings, edit the config.json file or pass the new settings as arguments to the script. For example, to change the sweep configuration file for GPD model, run:
The default settings for max number of experiments and paths are saved in config.json file. To change the settings, edit the config.json file or pass the new settings as arguments to the script. For example, to change the sweep configuration file for the GPD model, run:
```python pipeline.py --gpd_config <new config file>```
The new config file should be placed in the `experiments` folder or as specified in the `configs_path` parameter in the config.json file.
Sweep configs are used to define the max number of epochs to run and the hyperparameters search space for the following parameters:
* `batch_size`
* `learning_rate`
Phasenet model has additional available parameters:
* `norm` - normalization method, options ('peak', 'std')
* `pretrained` - pretrained seisbench models used for transfer learning
* `finetuning` - the type of layers to finetune first, options ('all', 'top', 'encoder', 'decoder')
* `lr_reduce_factor` - factor to reduce learning rate after unfreezing layers
GPD model has additional parameters for filtering:
* `highpass` - highpass filter frequency
* `lowpass` - lowpass filter frequency
The sweep configs are saved in the `experiments` folder.
If you have multiple datasets, you can run the pipeline for each dataset separately by specifying the dataset name as an argument:
```python pipeline.py --dataset <dataset_name>```
### Troubleshooting