Demo notebooks and scripts for EPOS AI Platform

This repo contains notebooks and scripts demonstrating how to:

Prepare IGF data for training a SeisBench model that detects the P phase (i.e. transform mseed files into the SeisBench data format); see the notebook.
Explore the available data; see the notebook.
Train various CNN models available in the SeisBench library and compare their P phase detection performance; see the script.
[to update] Validate model performance; see the notebook.
[to update] Use a model to detect the P phase; see the notebook.

Acknowledgments

This code is based on the pick-benchmark, the repository accompanying the paper: Which picker fits my data? A quantitative evaluation of deep learning based seismic pickers

Before running

Please install Poetry, a tool for dependency management and packaging in Python. Poetry is then the only tool used to create the Python environment and install the dependencies.

Usage

Install all dependencies with Poetry:

poetry install

Prepare a .env file with the following content:

WANDB_HOST="https://epos-ai.grid.cyfronet.pl/"
WANDB_API_KEY="your key"
WANDB_USER="your user"
WANDB_PROJECT="training_seisbench_models_on_igf_data"
BENCHMARK_DEFAULT_WORKER=2
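
Before starting a long run it can help to confirm that the .env file is complete. The snippet below is a minimal sketch, not part of this repository: `parse_env` and `missing_keys` are hypothetical helper names, and treating the four WANDB_* variables as mandatory is an assumption.

```python
# Hypothetical helpers for checking a .env file; not part of this repo.
# Assumption: the four WANDB_* variables are the ones the pipeline cannot run without.
REQUIRED_KEYS = ("WANDB_HOST", "WANDB_API_KEY", "WANDB_USER", "WANDB_PROJECT")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and comments, stripping double quotes."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def missing_keys(env):
    """Return the required keys that are absent or empty, in REQUIRED_KEYS order."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        problems = missing_keys(parse_env(handle.read()))
    print("Missing settings:", problems if problems else "none")
```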

Transform the data into the SeisBench format (unofficial):
- Download the original data from the drive
- Run the notebook: utils/Transforming mseeds to SeisBench dataset.ipynb

Initialize the Poetry environment:

poetry shell

Run the pipeline script:

python pipeline.py

The script performs the following steps:
- Generates evaluation targets
- Trains multiple versions of GPD, PhaseNet and ... models to find the hyperparameters that produce the lowest validation loss. This step uses the Weights & Biases platform to run the hyperparameter search (called a sweep), track the training process and store the results. The results are available at
  https://epos-ai.grid.cyfronet.pl/<your user name>/<your project name>
- Uses the best performing model of each type to generate predictions
- Evaluates the performance of each model by comparing the predictions with the evaluation targets
- Saves the results in the scripts/pred directory
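
The model selection step above amounts to keeping, for each model type, the run with the lowest validation loss. The snippet below is only an illustrative sketch of that idea: the function name `best_runs`, the record fields and all the numbers are made up, and the actual script reads its runs from Weights & Biases rather than from a hard-coded list.

```python
def best_runs(runs):
    """Keep the run with the lowest validation loss for each model type."""
    best = {}
    for run in runs:
        current = best.get(run["model"])
        if current is None or run["val_loss"] < current["val_loss"]:
            best[run["model"]] = run
    return best


if __name__ == "__main__":
    # Made-up run records, for illustration only.
    example_runs = [
        {"model": "GPD", "run_id": "a", "val_loss": 0.31},
        {"model": "GPD", "run_id": "b", "val_loss": 0.27},
        {"model": "PhaseNet", "run_id": "c", "val_loss": 0.19},
    ]
    for model, run in best_runs(example_runs).items():
        print(model, run["run_id"], run["val_loss"])
```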

The default settings are stored in the config.json file. To change them, edit config.json or pass the new settings as arguments to the script. For example, to change the sweep configuration file for the GPD model, run:

python pipeline.py --gpd_config <new config file>

The new config file should be placed in the experiments directory, or in the directory specified by the configs_path parameter in config.json.
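
One common way to combine file defaults with command-line overrides is to expose each key of the JSON file as an optional argument. The snippet below is a minimal sketch of that pattern and not the code used by pipeline.py: `load_settings` is a hypothetical name, and the file names in the usage example are placeholders.

```python
import argparse
import json


def load_settings(config_text, argv):
    """Merge JSON defaults with command-line overrides.

    Every top-level key in the JSON text becomes an optional --<key> argument.
    Values given on the command line replace the defaults and arrive as strings.
    """
    settings = json.loads(config_text)
    parser = argparse.ArgumentParser()
    for key in settings:
        parser.add_argument(f"--{key}", default=None)
    args = parser.parse_args(argv)
    for key, value in vars(args).items():
        if value is not None:
            settings[key] = value
    return settings


if __name__ == "__main__":
    # Placeholder defaults; the real config.json may contain other keys and values.
    defaults = '{"configs_path": "experiments", "gpd_config": "default_gpd_sweep.yaml"}'
    print(load_settings(defaults, ["--gpd_config", "my_gpd_sweep.yaml"]))
```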

Troubleshooting

If wandb reports the error below, see https://github.com/wandb/wandb/issues/2825:

wandb: ERROR Run .. errored: OSError(24, 'Too many open files')
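
A general workaround for "Too many open files" errors on Unix-like systems is to raise the per-process limit on open file descriptors. The snippet below is a sketch of that general approach using the standard `resource` module. It is not taken from the linked issue, the function name `raise_open_file_limit` and the target of 4096 are assumptions, and it does not work on Windows.

```python
import resource


def raise_open_file_limit(target=4096):
    """Raise the soft limit on open files towards `target`; return the soft limit in effect.

    The soft limit is never raised above the hard limit and is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        return new_soft
    return soft


if __name__ == "__main__":
    print("Soft limit on open files:", raise_open_file_limit())
```

The new limit applies only to the current process and its children, so the call has to run in the same process that starts the training.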

Licence

TODO

Copyright

This work was partially funded by the EPOS Project, funded within the framework of PL-POIR4.2.