Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Analysis class with JEDI class #2789

Closed

Conversation

DavidNew-NOAA
Copy link
Contributor

@DavidNew-NOAA DavidNew-NOAA commented Jul 24, 2024

Description

This PR replaces the PyGFS Analysis class with a more general JEDI class. It is a child class of the wxflow Task class, like the Analysis, but it's intended for any Global Workflow job that runs one or more JEDI applications. At this point, it has only been applied to the atmospheric variational and ensemble analysis jobs, so the Analysis class still remains part of the code base, but can be fully replaced after this PR using the atmospheric analysis jobs as a template for the other analysis jobs (snow, aerosol, etc).

In order to make things more generic, the input YAML will now always be named after the JEDI application being run. So, for example, if running gdas.x, the name of the input YAML is gdas.yaml, rather than, say, gdas.t12z.atmvar.yaml. In the case of the atmospheric variational analysis, the finalize() method of the AtmAnalysis class will ensure that gdas.yaml is saved as gdas.t12z.atmvar.yaml in the COMROT directory anyway, so no need to give it such a descriptive name in the run directory.

The AtmAnalysis and AtmEnsAnalysis classes, now subclasses of JEDI, now take care of staging observations and bias corrections in their own initialize() methods, rather than in the initialize() method of JEDI, but the JEDI initalize() method continues to render the input Jinja2-YAMLS/JCB templates and link the JEDI executable file. The JEDI class initalize() method now also saves the final YAML in the run directory, rather than having the initalize() method of the AtmAnalysis and AtmEnsAnalysis classes do that (since all JEDI applications require an input YAML file saved to disk.

A generalized execute() method now exists in JEDI which takes the APRUN command as input, and optionally a list of additional arguments that might be passed to gdas.x (eg. ['fv3jedi', 'variational']).

Type of change

  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? YES
  • Does this change require a documentation update? NO

How has this been tested?

  • Build on Hera
  • Run cycling experiment

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • I have made corresponding changes to the documentation if necessary

Copy link
Contributor

@guillaumevernieres guillaumevernieres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this @DavidNew-NOAA . Just a few thoughts/comments.

logger = getLogger(__name__.split('.')[-1])


class JEDI(Task):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a class hard-wired for the analysis step, which I think is fine but the name should probably reflect this. JEDIAnalysis? or just keep Analysis?
I think @CoryMartin-NOAA (or maybe that was you @DavidNew-NOAA ?) started a b-matrix base class as well. We probably want to keep these separate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is for this class to be used anytime a JEDI application is run in general, not just for running an analysis

self.link_jedi_exe()

@logit(logger)
def execute(self, aprun_cmd: str, jedi_args: Optional[List] = None) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running only one executable wouldn't work for the "B-matrix" or "prep obs" jobs for example, thus the suggestion above to rename to something descriptive of the analysis job.

save_as_yaml(self.task_config.jedi_config, self.task_config.jedi_yaml)
logger.info(f"Wrote YAML to: {self.task_config.jedi_yaml}")

def link_jedi_exe(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably should have this in a common utility module

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree



@logit(logger)
def find_value_in_nested_dict(nested_dict: Dict, target_key: str) -> Any:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, probably belongs to wxflow or some other common utility module

@DavidNew-NOAA
Copy link
Contributor Author

I'm closing the pull request until I make some modifications. Rahul and I discussed it and agree, based on previous conversations with Dan and Cory, that the JEDI class should not be inherited from the Task class, but rather any child class of Task (eg AtmAnalysis, AtmEnsAnalysis, etc) should have the JEDI class as a member.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants