Config with pydantic #334

Closed
wants to merge 21 commits into from
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -11,7 +11,7 @@ repos:
- id: trailing-whitespace
args: [--markdown-linebreak-ext=md]
- repo: https://github.com/adrienverge/yamllint
rev: "v1.26.0"
rev: "v1.29.0"
hooks:
- id: yamllint
- repo: https://github.com/asottile/setup-cfg-fmt
@@ -23,7 +23,7 @@ repos:
hooks:
- id: black-jupyter
- repo: https://github.com/PyCQA/isort
rev: "5.9.3"
rev: 5.12.0
hooks:
- id: isort
# TODO renable when errors are fixed/ignored
@@ -78,7 +78,7 @@ repos:
rev: 1.1.0
hooks:
- id: nbqa-isort
additional_dependencies: [isort==5.9.3]
additional_dependencies: [isort==5.11.2]
- id: nbqa-mypy
additional_dependencies: [mypy==0.910, types-python-dateutil]
# TODO renable when errors are fixed/ignored
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,10 @@ Formatted as described on [https://keepachangelog.com](https://keepachangelog.co

- Apptainer support ([#290](https://github.com/eWaterCycle/ewatercycle/issues/290))

### Changed

- ewatercycle config now uses Pydantic instead of matplotlib inspired validation. ([#332](https://github.com/eWaterCycle/ewatercycle/issues/332))

### Deprecated

- Singularity support
25 changes: 11 additions & 14 deletions docs/system_setup.rst
@@ -172,25 +172,22 @@ The configuration can be set in Python with
import logging
logging.basicConfig(level=logging.INFO)
import ewatercycle
import ewatercycle.parameter_sets
# Which container engine is used to run the hydrological models
ewatercycle.CFG['container_engine'] = 'apptainer' # or 'docker'
ewatercycle.CFG.container_engine = 'apptainer' # or 'docker'
# If container_engine==apptainer then where can the Apptainer images files (*.sif) be found.
ewatercycle.CFG['apptainer_dir'] = './apptainer-images'
ewatercycle.CFG.apptainer_dir = './apptainer-images'
# Directory in which output of model runs is stored. Each model run will generate a sub directory inside output_dir
ewatercycle.CFG['output_dir'] = './'
ewatercycle.CFG.output_dir = './'
# Where can GRDC observation files (<station identifier>_Q_Day.Cmd.txt) be found.
ewatercycle.CFG['grdc_location'] = './grdc-observations'
ewatercycle.CFG.grdc_location = './grdc-observations'
# Where can parameter sets prepared by the system administrator be found
ewatercycle.CFG['parameterset_dir'] = './parameter-sets'
# Where is the configuration saved or loaded from
ewatercycle.CFG['ewatercycle_config'] = './ewatercycle.yaml'
ewatercycle.CFG.parameterset_dir = './parameter-sets'

and then written to disk with

.. code:: ipython3

ewatercycle.CFG.save_to_file()
ewatercycle.CFG.save_to_file('./ewatercycle.yaml')

Later it can be loaded by using:

@@ -269,11 +266,11 @@ Apptainer
~~~~~~~~~

Apptainer images should be stored in configured directory
(``ewatercycle.CFG['apptainer_dir']``) and can build from Docker with:
(``ewatercycle.CFG.apptainer_dir``) and can build from Docker with:

.. code:: shell

cd {ewatercycle.CFG['apptainer_dir']}
cd {ewatercycle.CFG.apptainer_dir}
apptainer build ewatercycle-lisflood-grpc4bmi_20.10.sif docker://ewatercycle/lisflood-grpc4bmi:20.10
apptainer build ewatercycle-marrmot-grpc4bmi_2020.11.sif docker://ewatercycle/marrmot-grpc4bmi:2020.11
apptainer build ewatercycle-pcrg-grpc4bmi_setters.sif docker://ewatercycle/pcrg-grpc4bmi:setters
@@ -389,8 +386,8 @@ A new parameter set should be added as a key/value pair in the ``parameter_sets`
The key should be a unique string on the current system.
The value is a dictionary with the following items:

* directory: Location on disk where files of the parameter set are stored. If Path is relative then relative to :py:const:`ewatercycle.CFG['parameterset_dir']`.
* config: Model configuration file which uses files from directory. If Path is relative then relative to :py:const:`ewatercycle.CFG['parameterset_dir']`.
* directory: Location on disk where files of the parameter set are stored. If Path is relative then relative to :py:const:`ewatercycle.CFG.parameterset_dir`.
* config: Model configuration file which uses files from directory. If Path is relative then relative to :py:const:`ewatercycle.CFG.parameterset_dir`.
* doi: Persistent identifier of the parameter set. For example a DOI for a Zenodo record.
* target_model: Name of the model that parameter set can work with
* supported_model_versions: Set of model versions that are supported by this parameter set. If not set then parameter set will be supported by all versions of model
@@ -432,5 +429,5 @@ Services (USGS) <https://waterservices.usgs.gov/>`__ data.
The GRDC daily data files can be ordered at
https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/riverdischarge_node.html.

The GRDC files should be stored in ``ewatercycle.CFG['grdc_location']``
The GRDC files should be stored in ``ewatercycle.CFG.grdc_location``
directory.
3 changes: 2 additions & 1 deletion environment.yml
@@ -3,7 +3,8 @@ name: ewatercycle
channels:
- conda-forge
dependencies:
- python>=3.8
# Numba 0.56.4 does not support Python 3.11 so set upper limit
- python>=3.8,<3.11
- esmvaltool-python>=2.3.0
- subversion
# Pin esmpy so we dont get forced to run all parallel tasks on single cpu
2 changes: 1 addition & 1 deletion setup.cfg
@@ -47,6 +47,7 @@ install_requires =
numpy
pandas
protobuf<=3.20.1
pydantic
pyoos
python-dateutil
ruamel.yaml
@@ -83,7 +84,6 @@ dev =
types-python-dateutil

[options.package_data]
* = *.yaml
ewatercycle = py.typed

[coverage:run]
1 change: 0 additions & 1 deletion src/ewatercycle/analysis/__init__.py
@@ -100,7 +100,6 @@ def hydrograph(

# Add precipitation as bar plot to the top if specified
if precipitation is not None:

if nbars is not None:
precipitation, barwidth = _downsample(
precipitation, nrows=nbars, agg="mean"
48 changes: 29 additions & 19 deletions src/ewatercycle/config/__init__.py
@@ -9,22 +9,26 @@

>>> from ewatercycle import CFG
>>> CFG
Config({'container_engine': None,
'grdc_location': None,
'output_dir': None,
'apptainer_dir': None,
})

By default all values are initialized as ``None``.

:py:data:`~ewatercycle.CFG` is essentially a python dictionary with a few extra
functions, similar to :py:mod:`matplotlib.rcParams`. This means that values can
be updated like this:
Configuration(
grdc_location=PosixPath('.'),
container_engine='docker',
apptainer_dir=PosixPath('.'),
singularity_dir=None,
output_dir=PosixPath('.'),
parameterset_dir=PosixPath('.'),
parameter_sets={},
ewatercycle_config=None
)

By default all values have usable values.

:py:data:`~ewatercycle.CFG` is a `Pydantic model <https://docs.pydantic.dev/usage/models/>`_.
This means that values can be updated like this:

.. code-block:: python

>>> CFG['output_dir'] = '~/output'
>>> CFG['output_dir']
>>> CFG.output_dir = '~/output'
>>> CFG.output_dir
PosixPath('/home/user/output')

Notice that :py:data:`~ewatercycle.CFG` automatically converts the path to an
@@ -34,15 +38,21 @@

.. code-block:: python

>>> CFG['output_directory'] = '~/output'
InvalidConfigParameter: `output_directory` is not a valid config parameter.
>>> CFG.output_directory = '/output'
ValidationError: 1 validation error for Configuration
output_directory
extra fields not permitted (type=value_error.extra)


Or, if the value entered cannot be converted to the expected type:

.. code-block:: python

>>> CFG['output_dir'] = 123
InvalidConfigParameter: Key `output_dir`: Expected a path, but got 123
>>> CFG.output_dir = 123
ValidationError: 1 validation error for Configuration
output_dir
value is not a valid path (type=type_error.path)


By default, the config is loaded from the default location (i.e.
``~/.config/ewatercycle/ewatercycle.yaml``). If it does not exist, it falls back
@@ -81,6 +91,6 @@
# apptainer pull docker://ewatercycle/wflow-grpc4bmi:2020.1.1
"""

from ._config_object import CFG, DEFAULT_CONFIG, SYSTEM_CONFIG, USER_HOME_CONFIG, Config
from ._config_object import CFG, SYSTEM_CONFIG, USER_HOME_CONFIG, Configuration

__all__ = ["CFG", "Config", "DEFAULT_CONFIG", "SYSTEM_CONFIG", "USER_HOME_CONFIG"]
__all__ = ["CFG", "Configuration", "SYSTEM_CONFIG", "USER_HOME_CONFIG"]
Comment on lines +94 to +96
Collaborator

Would it make sense to move documentation to the user guide, and reduce the config to a single-file module called config.py?

Member Author

I moved the content of src/ewatercycle/config/_config_object.py to src/ewatercycle/config/__init__.py in c86f3b7.
I also added a link from docs/system_setup.rst to api docs.
To not increase the public API I prefixed private things with underscore.

Moving it to src/ewatercycle/config.py could be done once src/ewatercycle/config/_lisflood_versions.py has been moved to its own plugin/repo.
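
For readers skimming the diff, the behavioural change in this PR (dictionary-style keys replaced by attribute access, with validation on every assignment) can be illustrated with a small standard-library sketch. This is not the ewatercycle or Pydantic implementation: `SketchConfig` and its two fields are hypothetical stand-ins, and the real `Configuration` class raises Pydantic's `ValidationError` rather than the plain exceptions used here.

```python
from pathlib import Path


class SketchConfig:
    """Hypothetical stand-in for a validated config object (illustration only)."""

    # Assumed field names, borrowed from the docs in this PR for readability.
    _fields = ("output_dir", "grdc_location")

    def __init__(self, **values):
        # Every field starts with a usable default, then overrides are validated.
        for name in self._fields:
            setattr(self, name, ".")
        for name, value in values.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        if name not in self._fields:
            # Unknown names are rejected instead of silently creating attributes.
            raise ValueError(f"{name!r} is not a valid config field")
        if not isinstance(value, (str, Path)):
            raise TypeError(f"{name!r} expects a path, got {value!r}")
        # Strings are converted to Path objects and '~' is expanded on assignment.
        super().__setattr__(name, Path(value).expanduser())


cfg = SketchConfig()
cfg.output_dir = "~/output"    # stored as an expanded Path under the home directory

try:
    cfg.output_directory = "/output"   # misspelled field name
except ValueError as err:
    print(err)
```

The point of the sketch is only that a misspelled field or a wrongly typed value fails at assignment time, which is the property the new docstring describes for `CFG`.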