
Add BlueBEAR cluster config #745

Merged
merged 9 commits into nf-core:master on Sep 6, 2024

Conversation

alexlyttle
Contributor


name: BlueBEAR
about: A new cluster config

Please follow these steps before submitting your PR:

  • If your PR is a work in progress, include [WIP] in its title
  • Your PR targets the master branch
  • You've included links to relevant issues, if any

Steps for adding a new config profile:

  • Add your custom config file to the conf/ directory
  • Add your documentation file to the docs/ directory
  • Add your custom profile to the nfcore_custom.config file in the top-level directory (see the sketch after this list)
  • Add your custom profile to the README.md file in the top-level directory
  • Add your profile name to the profile: scope in .github/workflows/main.yml
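As a rough sketch of the nfcore_custom.config step above: profile entries in that file follow a common pattern, and the entry for this PR would presumably look something like the lines below (the profiles block and the path are assumptions based on the existing entries, not copied from the merged change).

profiles {
    // hypothetical entry, following the pattern of the existing profiles in nfcore_custom.config
    bluebear { includeConfig "${params.custom_config_base}/conf/bluebear.config" }
}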

Member

@jfy133 jfy133 left a comment


A couple of minor comments; if you disagree with them we can still proceed.

Note: so you don't have to wait for approval for the tests to run, consider joining the nf-core GitHub organisation; instructions here: https://nf-co.re/join

Comment on lines 17 to 18
// Cluster-specific params
project = slurm_job_account
Member

Having non-pipeline parameters here may cause some pipelines to fail (and increasingly more of them will).

A better approach is to tell users to set a bash environment variable with the information needed, and use that information (in this case, down in your clusterOptions).

For example, see

clusterOptions = "--account=${System.getenv('PAWSEY_PROJECT')}"

Contributor Author

Thanks, that sounds good. In fact System.getenv("SLURM_JOB_ACCOUNT") could be substituted there, since it is unlikely someone would want individual processes to be submitted under a different account.
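Putting the two comments above together, the fragment of conf/bluebear.config under discussion might end up looking roughly like this (a sketch only: the process scope and the slurm executor line are assumptions added for context, and the merged file may differ).

process {
    executor       = 'slurm'
    // sketch: read the account from the environment of the submitting SLURM job,
    // rather than exposing it as a non-pipeline parameter
    clusterOptions = "--account=${System.getenv('SLURM_JOB_ACCOUNT')}"
}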

docs/bluebear.md Outdated
BlueBEAR comes with Apptainer installed for running processes inside containers. Our configuration makes use of this
when executing processes.

We advise you create a directory in which to cache images using the following environment variable. For example,
Member

Maybe mention it's a good idea to put the export command in a .bashrc or similar?

Contributor Author

Done!

docs/bluebear.md Outdated
Comment on lines 80 to 82
Nextflow caches pipeline files in a work directory (default is `work`). This is useful if you need to resume a job
after a failure (using the `-resume` option). However, the `work` directory can fill up quickly. Regularly cleaning the
work directory avoids filling up your project quota.
Member

You could consider adding the `cleanup = true` option 1 so that emptying of the work/ directory happens by default, and to override this use the debug profile in addition to the institutional profile.

For examples: https://github.com/search?q=repo%3Anf-core%2Fconfigs%20cleanup%20%3D%20true&type=code

Footnotes

  1. Docs: https://www.nextflow.io/docs/latest/config.html#miscellaneous
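In config terms, the suggestion amounts to something like the snippet below (a sketch, not the lines that were merged; it assumes the pipeline's debug profile is what switches cleanup off again, as described above).

// sketch: delete files in work/ once a run completes successfully
// running with e.g. `-profile bluebear,debug` is expected to override this,
// since the debug profile disables cleanup
cleanup = true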

Contributor Author

I was a little concerned about the warning in the Nextflow docs about this. It suggests cleanup = true prevents use of -resume. If cleanup only occurs on a successful run, is -resume still possible after a failed run?

Member

It only prevents -resume if the pipeline completes with exit code 0 (so no errors).

If the pipeline hits an error (non-zero exit code), all the files in work are retained and will not be deleted.

The only case where it might be problematic is if someone is playing with parameters in later stages of a pipeline and wants to resume from midway with a different parameter; in that case the resume would not work (if that makes sense).

Contributor Author

Ah fantastic, I misunderstood the docs slightly. This sounds like a good addition, thanks!

@jfy133
Member

jfy133 commented Sep 6, 2024

Good to go @alexlyttle !

With your new-found nf-core org membership, you may do the honours! (At least I think you should be able to merge yourself now 😅)

@alexlyttle alexlyttle merged commit 8bbbfb9 into nf-core:master Sep 6, 2024
129 checks passed