Iss25 #43
Merged
This script introduces new features to the tool, including the capability to process climate datasets that consist of multiple models, submodels (those with specific configuration sets), ensemble members, and multiple scenarios (SSPs). The parent calling script is in charge of the parallelization scheme, if needed. With this script, several current deficiencies of datatool can be resolved simultaneously. Signed-off-by: Kasra Keshavarz <[email protected]>
Multiple parallelization schemes are added, so the package not only submits array jobs based on the given date range and chunking scheme, but also submits jobs based on the various models, ensemble members, and scenarios. These new parallelization schemes mostly apply to climate datasets, but not exclusively. This commit aims to save the user time and speed up dataset processing. This commit resolves issue #25 on the remote GitHub repository. Furthermore, it adds the ESPO dataset to the list of datasets. Moreover, a new option is implemented to show users the list of currently available datasets. Signed-off-by: Kasra Keshavarz <[email protected]>
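As a rough illustration of parallelizing over models, scenarios, and ensemble members, the sketch below expands comma-separated lists into one line per job. The function name `list_jobs`, the comma-separated input format, and the example model and scenario names are assumptions made for illustration; they are not taken from datatool's source, and the actual job submission step is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of datatool): expand comma-separated
# lists of models, scenarios, and ensemble members into one line per
# combination. Each line could then be mapped to one job.
function list_jobs() {
  local models="$1" scenarios="$2" ensembles="$3"
  local model scenario ensemble
  local -a model_arr scenario_arr ensemble_arr

  # Split each comma-separated argument into an array.
  IFS=',' read -r -a model_arr <<< "$models"
  IFS=',' read -r -a scenario_arr <<< "$scenarios"
  IFS=',' read -r -a ensemble_arr <<< "$ensembles"

  for model in "${model_arr[@]}"; do
    for scenario in "${scenario_arr[@]}"; do
      for ensemble in "${ensemble_arr[@]}"; do
        echo "${model} ${scenario} ${ensemble}"
      done
    done
  done
}

# Example with placeholder names: 2 models x 2 scenarios x 1 member = 4 jobs.
list_jobs "modelA,modelB" "ssp245,ssp585" "r1i1p1f1"
```

Each printed line identifies one independent unit of work, which is what makes this kind of scheme easy to combine with the existing date-range chunking.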
This is meant to clearly organize the information provided inside the package. The new file lists all the available datasets and the keywords that users can provide to the `--dataset` option. Previously, this information was part of the main usage message (`--help`) of the main script. Signed-off-by: Kasra Keshavarz <[email protected]>
1. the `function` keyword is added to make the style compatible with Google's recommendations, 2. required arguments and options are revised alongside the relevant comments, 3. typos are fixed. Signed-off-by: Kasra Keshavarz <[email protected]>
The script deals with the Climate Dataset produced by the Alberta Government. The dataset is not public yet, and is planned to be available soon. Signed-off-by: Kasra Keshavarz <[email protected]>
Since some hydrological models can use either near-surface or 40 m level data, the necessary lists of variables for both levels are added. Furthermore, a link to the official website for the dataset is added for further clarity. Signed-off-by: Kasra Keshavarz <[email protected]>
Since multiple HPCs are now used for the workflows, it is important to have consistent datasets synchronized regularly. Therefore, this commit attempts to reflect these efforts by creating consistent paths for various HPCs/allocations. Signed-off-by: Kasra Keshavarz <[email protected]>
In this commit, the following are addressed:
* Correcting paths for the local scripts,
* Renaming scripts to reflect the owner of the script for further clarification,
* Adding parallelization schemes based on model, ensemble, and scenario,
* Adding gcc/9.3.0 as the reference clib for the modules loaded to prevent mismatch between various environments defined on the HPCs,
* Ensuring EPSG:4326 is assumed for the input shapefile if no CRS is defined,
* Getting rid of \t characters in the help messages,
* Correcting the short help message to be more informative,
* Adding function declarations to follow Google's shell scripting guidelines,
* Ensuring --account=STR is described in the help message.
Signed-off-by: Kasra Keshavarz <[email protected]>
Various files within this directory are categorized to be more informative for the users/devs. Signed-off-by: Kasra Keshavarz <[email protected]>
The README file for this dataset is added, offering necessary information for the users. Signed-off-by: Kasra Keshavarz <[email protected]>
This commit ensures all dataset scripts follow the convention of <institute>-<dataset-name> under the `scripts` path. Furthermore, necessary adjustments to the style of the scripts have been implemented, including:
* adding `--model`, `--scenario`, and `--ensemble` options, if missing, for compatibility with the main caller script, as these options are passed to each script by the `extract-dataset.sh` script,
* ensuring the scripting style follows Google's shell scripting guidelines,
* properly adjusting the paths to the externally called scripts after modifications to the structure of datatool's `assets` directory, and
* making minor changes to the source code to ensure compatibility with v0.5.0 of datatool.
Signed-off-by: Kasra Keshavarz <[email protected]>
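For readers unfamiliar with how a dataset script might accept the `--model`, `--scenario`, and `--ensemble` options, here is a minimal sketch. It assumes the `--option=value` form and uses a plain `while`/`case` loop; the function name `parse_args` and the global variable names are hypothetical, and the actual scripts may well parse options differently (for example with `getopt`).

```shell
#!/usr/bin/env bash
# Hypothetical option parser (not datatool's actual code): accepts the
# three options in --option=value form and stores them in global
# variables. Unknown options are reported and cause a non-zero return.
function parse_args() {
  model="" scenario="" ensemble=""
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --model=*)    model="${1#*=}" ;;
      --scenario=*) scenario="${1#*=}" ;;
      --ensemble=*) ensemble="${1#*=}" ;;
      *)
        echo "unknown option: $1" >&2
        return 1
        ;;
    esac
    shift
  done
}

# Example call with placeholder values; --ensemble is left unset.
parse_args --model=modelA --scenario=ssp585
echo "model=${model} scenario=${scenario} ensemble=${ensemble:-<none>}"
```

Accepting all three options even when a dataset has no use for them keeps every dataset script callable with the same argument list from the parent script.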
This commit addresses issue #27 by describing NASA's NEX-GDDP-CMIP6 dataset and the relevant scripts for it. Furthermore, it provides the information users need to run `datatool` to extract subsets of the dataset for any temporal and spatial extents. Signed-off-by: Kasra Keshavarz <[email protected]>
This commit addresses issue #27 and provides scripts to extract subsets from NASA's NEX-GDDP-CMIP6 dataset. The script can work with the various models, scenarios, ensemble members, and variables offered by this dataset. Signed-off-by: Kasra Keshavarz <[email protected]>
This commit addresses issue #34 and processes this dataset that contains multiple GCM model outputs, including various sub-models, scenarios, ensemble members, and variables. Signed-off-by: Kasra Keshavarz <[email protected]>
Necessary information to use `datatool` for this script is provided to the user via the README.md file. Signed-off-by: Kasra Keshavarz <[email protected]>
With the growing number of scripts, this commit tries to restructure this directory to provide more clarity and organization for the users. Signed-off-by: Kasra Keshavarz <[email protected]>
The help message has been revised to provide more information to the users. This includes stating that values provided to `--lon-lims` must be within the [-180, +180] limits. This had not been mentioned to users before and could have caused confusion, as there are multiple conventions for describing longitudes. Furthermore, the list of datasets on the main page of the repository has been updated to reflect the most up-to-date list. Signed-off-by: Kasra Keshavarz <[email protected]>
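To make the longitude convention concrete, the sketch below checks whether a single longitude lies within [-180, +180] and converts a value given in the 0 to 360 convention. Both function names are hypothetical, `awk` is used only because bash has no floating-point arithmetic, and nothing here is taken from datatool itself; the exact format expected by `--lon-lims` is not shown.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of datatool).

# Succeeds (exit status 0) if the given longitude is within [-180, 180].
function lon_in_range() {
  awk -v lon="$1" 'BEGIN { exit !(lon + 0 >= -180 && lon + 0 <= 180) }'
}

# Converts a longitude from the 0..360 convention to -180..180.
# Values already at or below 180 are printed unchanged.
function to_pm180() {
  awk -v lon="$1" 'BEGIN { if (lon + 0 > 180) lon -= 360; print lon }'
}

if lon_in_range 250; then
  echo "250 is within limits"
else
  echo "250 is outside [-180, 180]; converted: $(to_pm180 250)"
fi
```

A user whose coordinates come from a 0 to 360 source would need a conversion along these lines before passing the limits to the tool.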
kasra-keshavarz added the bug (Something isn't working), documentation (Improvements or additions to documentation), enhancement (New feature or request), added dataset (new dataset being added to the script), and new release labels on Mar 5, 2024
This was referenced Mar 6, 2024
Closed
This Pull Request resolves the following issues: #40, #37 (partially), #36 (partially), #34, #27, #25.
The commits all have comprehensive messages describing the change and, more importantly, the reason.
I do NOT like that GitHub does not provide character limitations for each line here.