-
Notifications
You must be signed in to change notification settings - Fork 648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Headless release #5554
base: master
Are you sure you want to change the base?
Headless release #5554
Conversation
Signed-off-by: Tom Sellman <[email protected]>
Signed-off-by: Tom Sellman <[email protected]>
Signed-off-by: Tom Sellman <[email protected]>
The target repository is an S3 bucket, jars are no longer published to maven central repo.x Signed-off-by: Tom Sellman <[email protected]>
Signed-off-by: Tom Sellman <[email protected]>
✅ Deploy Preview for nextflow-docs-staging ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Signed-off-by: Tom Sellman <[email protected]>
… release notes Signed-off-by: Tom Sellman <[email protected]>
8c67025
to
19f01c4
Compare
I guess this should marked ready "Ready for review", right? |
The release script should make the commit, the other way around, it should be assumed the script is invoked by the release commit. |
I think that is how it works. The release commit is still what triggers the Github Action to perform the release. I just created a little
I think this local helper script is useful because it provides a nice "entrypoint" to the release process, and acts as documentation for both the required manual steps and how the workflow is triggered. Over time we can gradually improve the local helper script to automate more of the remaining manual steps. For example, it could prompt for which plugins to release and what versions they should be, and then automatically update the relevant manifest files before creating the release commit. |
Fair enough, it may be useful. Then what's the main entry for the headless part? |
The headless part is in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks an excellent work. Is this uploading the plugins to the respective GH projects?
Somehow unrelated, it loos like the DCO bot is not happy with your commit signature.
Yes, to control the scope of the changes plugins are still uploaded to their own Github projects. I think we should treat alternative upload destinations as a separate piece of work. |
Signed-off-by: Tom Sellman <[email protected]>
1b3bfea
to
efbb785
Compare
buildSrc/src/main/groovy/io/nextflow/gradle/tasks/GithubRepositoryPublisher.groovy
Outdated
Show resolved
Hide resolved
buildSrc/src/main/groovy/io/nextflow/gradle/tasks/GithubRepositoryPublisher.groovy
Outdated
Show resolved
Hide resolved
plugins/build.gradle
Outdated
indexUrl = System.getenv('PLUGINS_INDEX_JSON') ?: 'https://github.com/nextflow-io/plugins/main/plugins.json' | ||
repos = allPlugins() | ||
owner = github_organization | ||
githubUser = github_username | ||
githubEmail = github_commit_email | ||
githubToken = github_access_token | ||
owner = System.getenv('GH_ORG') ?: 'nextflow-io' | ||
githubUser = System.getenv('GH_USER') ?: project.findProperty('github_username') | ||
githubEmail = System.getenv('GH_USER_EMAIL') ?: project.findProperty('github_commit_email') | ||
githubToken = System.getenv('GH_TOKEN') ?: project.findProperty('github_access_token') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was thinking how to avoid the proliferation of usage of env variables in the Gradle build. What is the main.sh
write a temporary gradle.properties
file mapping the variables to the corresponding gradle properties ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a go at using a temporary gradle.properties
but I don't think it's a good idea. It seems like it would be complex and fragile for very minimal benefit.
Ideally:
- running in CI, the release should use the values from github (ie the env vars)
- running locally, any values from
$GRADLE_USER_HOME/gradle.properties
should take priority - we don't want to risk accidentally committing secrets to the repo
- we should try to keep things relatively simple to understand
One option would be to write the env vars into a gradle.properties
in the project root. This would make them available to gradle, but anything in $GRADLE_USER_HOME/gradle.properties
would still take priority thanks to gradle's built-in precedence rules. To make this safe, we'd need to add gradle.properties
to .gitignore
. But there's already a committed gradle.properties
in the project so we'd have to modify it and then somehow guarantee that it would also remove any secrets from it (including in failure scenarios) to avoid accidentally committing the changes - which is much too risky.
Another option would be to write to properties file with a different name (eg .gradle.release.properties
) which we could explicitly .gitignore
and load into the gradle build if detected. The problem with this is I think those properties would then override anything from $GRADLE_USER_HOME
so we'd have to add logic to only load properties which don't already exist, or only create the properties file if running in CI - at which point it seems like a lot of unecessary complexity.
I think we should also take care to try to keep release/main.sh
as just an orchestration script and ensure that the sub-scripts can be run independently rather than assuming they will always run as part of a main script. This will be important for testing/debugging, fixing release errors, and for future flexibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think it's worth switching the priority order in the gradle scripts though, so that it looks for a gradle property first and then falls back to an env var:
githubUser = project.findProperty('github_username') ?: System.getenv('GH_USER')
This way any values in $GRADLE_USER_HOME/gradle.properties
will take precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option would be to write the env vars into a gradle.properties in the project root.
How? or do you mean using only gradle properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was just thinking through options for ways to make main.sh
write a temporary gradle properties file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be we should reverse the logic and keep all params (except secrets) in the project gradle.properties
.
Those could be "imported" in the shell script with basic helper, e.g. (courtesy chatgpt)
# Read the property file line by line
while IFS="=" read -r key value; do
# Skip empty lines and lines starting with # (comments)
[[ -z "$key" || "$key" == \#* ]] && continue
# Convert the key: replace '.' with '_' and convert to uppercase
env_var_name=$(echo "$key" | tr '.' '_' | tr '[:lower:]' '[:upper:]')
# Export the variable as an environment variable
export "$env_var_name=$value"
echo "Exported: $env_var_name=$value"
done < "$PROPERTY_FILE"
Secrets could be kept both in the user gradle properties or env vars. GitHub actions secrets could be easily added to project gradle.properties.
Bit clunky but it could make the trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to implement and test this approach using gradle properties. It does work, but it felt quite convoluted and confusing for a pretty small benefit. Especially when running in CI: the github action writes all the relevant env vars to a gradle properties file, then immediately the release scripts turn those properties back into the same env vars!
On balance, I don't think it's worth it given that the primary goal here is to make the release run in CI, and that running things locally should only be a fallback for debuggging/resolving issues. My suggestion would be that when running locally we can just create a bash file containing the required env vars which can be sourced into a local shell.
I did swap the priority of loading values in the gradle scripts though - now they will look for gradle properties first and env vars second.
(and do some testing of the github action/scripts) Signed-off-by: Tom Sellman <[email protected]>
efbb785
to
95593a9
Compare
General design
To try to keep things understandable and maintainable, this change takes the following approach:
I've also tried to make some simplifications:
Usage
make release
command locallyGithub Action
The approach is to use Github Actions for orchestration, but keep the execution details in bash scripts. The Github Actions structure (Workflow > Job > Step) gives us some flexibility for how to approach the orchestration:
Add a job to the existing
build.yml
workflow, or create a separaterelease.yml
workflowThe release process is quite complicated, with multiple artifacts being deployed to multiple different locations/systems. Given that the existing build workflow is already quite long and complex, a separate
release.yml
workflow file is used.A single job, or multiple jobs
Github will automatically run jobs in parallel where possible, whereas running multiple steps in parallel is a bit more awkward. Jobs can also specify dependencies on each other, which Github will use to draw a nice diagram of the workflow:
Jobs are more independent than steps, meaning splitting the workflow into jobs more cleanly separates the different release tasks and the credentials/artifacts required for each one, making the process easier to understand.
This approach does require a bit more yaml boilerplate: each job needs to checkout the code, initialise tooling, etc. Structuring the workflow as a single job would remove some of the yaml boilerplate, but also removes the nice diagram and makes it bit less clear what dependencies and credentials are required for some steps.
Some alternative Github Actions structures are shown in this draft PR and this other draft PR
Workflow trigger
Following the current project convention, the release workflow is triggered by a commit with message containing the text
[release]
. The commit is created by themake release
command.I couldn't find a way to create a workflow-level filter based the commit message, only to filter the first job. This would mean the action would still "run" on every push, but then skip all the jobs, which would create a lot of empty skipped workflows in Github.
Instead,
release.yml
uses aworkflow_dispatch
trigger which is explicitly activated by a 'Release' job in the existingbuild.yml
when a[release]
commit is detected, similar to the Wave process.Limitations
These changes don't represent full automation of nextflow releases. This is more like a headless version of the build and deploy steps from the current manual release process. It still requires a number of manual steps, notably:
Before running
make release
:VERSION
file must be updated manuallynextflow
launch script must also be manually updated to match theVERSION
filechangelog.txt
file must be manually updated with change notesMANIFEST.MF
file must be manually updated with the new versionchangelog.txt
file must be manually updated with change notesAfter the release workflow:
changelog.txt
to the automatically created draft Github releaseAs part of this change, jars are no longer published to maven central - only to the seqera maven repository.
The
build-info.properties
file contains build metadata including timestamp and commit id and is currently committed to the repo - meaning that always contain incorrect data (ie the commit id before[release]
). To improve the accuracy of the metadata in the released Nextflow jars, this release workflow re-generates the build-info file when assembling them. However it doesn't commit it back to the repo since this would create an awkward second release commit after the[release]
one.The existing docker image build downloads the Nextflow runtime using the launcher rather than copying in the assembled artifacts. I'm not sure why this is, so have left it unchanged for now - but it does currently impose a strict ordering requirment between uploading the jars to S3 and building the docker image.
Failure recovery
Rather than implement complex retry/force logic in the workflow or try to anticipate all the possible failure scenarios, an additional strength of using small individual shell scripts for each task is that any specific failure can be debugged and re-run manually as needed. In order to avoid rebuilding the artifacts, they can be downloaded from the Github workflow summary page.
This should be especially useful during the initial transition from manual to automated release while we iron out kinks in the process.
Configuration
This release workflow requires a number of repository variables and secrets to be configured. Although some of them seem to duplicate existing secrets use by the CI build (eg
AWS_ACCESS_KEY_ID
vsAWS_DEPLOY_ACCESS_KEY_ID
), my recommendation would be to use different variables/secrets to better control the different permissions required for a CI build vs a release. Different Github environments could also be used on the project to restrict the release workflow and credentials to specifc branches.TODO list all the required secrets/variables and their values/permissions.
Future enhancements
make release
script to check that the versions match each other, etcbuild-info.properties
from the repo, instead only generating during release and not committing[release]
commit inside the release workflow, fully automating the process