[nightly] Create Nightly Pipeline, make docker-nightly-publish.yml & integration.yml more modular #2628

HappyAmazonian · 2024-12-09T20:44:37Z

Description

This PR includes the following changes:

docker-nightly-publish.yml now pushes images to ECR only.
docker-nightly-publish.yml adds an arch argument.
integration.yml adds a workflow_call trigger so that it can be reused by other workflows.
integration.yml adds tag-suffix and envs so that the tests can fetch the image from ECR.
tests/integration/tests.py now uses the ECR image to run the tests.
docker_publish.yml syncs images from the temp ECR repo to the staging ECR repo and Docker Hub.
nightly.yml introduces a pipeline that builds, integration-tests, and publishes the nightly images.
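
As a rough illustration of how the changes listed above could fit together, here is a minimal sketch of a nightly.yml that chains the three workflows as reusable workflows. The job names, the cron schedule, the tag-suffix value, and the inputs passed to docker_publish.yml are assumptions made for illustration; they are not copied from the file in this PR. It also assumes each called workflow declares a workflow_call trigger with matching inputs.

```yaml
# Hypothetical nightly.yml: names, schedule, and input values are illustrative only.
name: Nightly Pipeline

on:
  schedule:
    - cron: '0 13 * * *'      # assumed time, not taken from the PR
  workflow_dispatch:

jobs:
  build:
    # Builds the images and pushes them to the temp ECR repo.
    uses: ./.github/workflows/docker-nightly-publish.yml
    secrets: inherit
    with:
      mode: nightly

  integration-test:
    # Runs only after the build job has succeeded.
    needs: build
    uses: ./.github/workflows/integration.yml
    secrets: inherit
    with:
      tag-suffix: nightly-${{ github.sha }}   # assumed suffix format

  publish:
    # Promotes the tested images to the staging ECR repo and Docker Hub.
    needs: [build, integration-test]
    uses: ./.github/workflows/docker_publish.yml
    secrets: inherit
    with:
      mode: nightly                            # assumed input name
```

Because publish lists integration-test under needs, a failed test run keeps the images from being promoted, which is the point of ordering the stages this way.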

If this change is a backward incompatible change, why must this change be made?

docker-nightly-publish.yml now pushes only to ECR.
nightly.yml handles the build, the integration tests, and the push to the staging ECR repo and Docker Hub.
You can still use docker-nightly-publish.yml to build an image; the built image will be in the temp ECR repo.
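
For readers unfamiliar with the promotion step, the sync from the temp ECR repo to the publish targets could look roughly like the step below. This is a sketch, not the script in docker_publish.yml: all three image names are placeholders, and it assumes earlier steps have already logged in to both registries.

```yaml
# Hypothetical step: every image name below is a placeholder.
steps:
  - name: Sync one image from the temp repo to the publish targets
    env:
      TEMP_IMAGE: temp-ecr.example/djl-serving-temp:cpu-nightly-${{ github.sha }}
      STAGING_IMAGE: staging-ecr.example/djl-serving:cpu-nightly
      DOCKERHUB_IMAGE: example-org/djl-serving-nightly:cpu-nightly
    run: |
      # Pull the image that already passed the integration tests.
      docker pull "$TEMP_IMAGE"
      # Retag and push the same image to each target.
      for target in "$STAGING_IMAGE" "$DOCKERHUB_IMAGE"; do
        docker tag "$TEMP_IMAGE" "$target"
        docker push "$target"
      done
```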

Type of change

Breaking change (fix or feature that would cause existing functionality to not work as expected)
New feature (non-breaking change which adds functionality)

Feature/Issue validation/testing

I have created one run of each workflow:
integ test: https://github.com/deepjavalibrary/djl-serving/actions/runs/12360671952/job/34496566447
nightly build: https://github.com/deepjavalibrary/djl-serving/actions/runs/12401201241/job/34620194153
docker publish: https://github.com/deepjavalibrary/djl-serving/actions/runs/12400140451

nightly pipeline run: https://github.com/deepjavalibrary/djl-serving/actions/runs/12399112958/job/34622823684

Lokiiiiii · 2024-12-10T05:19:30Z

Pushing and pulling the built container to transfer it between hosts takes more than 10 minutes by itself. I would like to avoid this if possible.

I added a persistent FSx cache volume to all the self-hosted runners. Can you try using the cache to persist data between runners instead of using ECR? That may save us some time. I assume we would run into the same issue with the PR sanity tests.
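
A minimal sketch of the cache-based hand-off suggested here, assuming the FSx volume is mounted at /fsx/docker-cache on every self-hosted runner. Both the mount point and the image name are hypothetical. It relies on GITHUB_RUN_ID being the same for all jobs in one workflow run.

```yaml
# Sketch only: CACHE_DIR and IMAGE are assumed values, not taken from the repo.
env:
  IMAGE: djl-serving:temp-example
  CACHE_DIR: /fsx/docker-cache

jobs:
  build:
    runs-on: self-hosted
    steps:
      # ... the steps that build $IMAGE would come first ...
      - name: Save the built image to the shared cache
        run: |
          mkdir -p "$CACHE_DIR"
          docker save -o "$CACHE_DIR/image-${GITHUB_RUN_ID}.tar" "$IMAGE"

  integration-test:
    needs: build
    runs-on: self-hosted
    steps:
      - name: Load the image from the shared cache
        run: docker load -i "$CACHE_DIR/image-${GITHUB_RUN_ID}.tar"
```

Whether this is faster than an ECR push and pull depends on the throughput of the shared volume, so it would need to be measured.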

.github/workflows/integration.yml

.github/workflows/docker-nightly-publish.yml

siddvenk · 2024-12-16T21:23:43Z

.github/workflows/docker-nightly-publish.yml

-          ./gradlew --refresh-dependencies :serving:dockerDeb -Psnapshot
-      - name: Build and push nightly docker image
-        if: ${{ inputs.mode == '' || inputs.mode == 'nightly' }}
+      - name: Build release docker image


minor: Let's call this "Build release candidate docker image"

siddvenk · 2024-12-16T21:52:54Z

.github/workflows/docker-nightly-publish.yml

+      - name: Build serving package for nightly
+        if: ${{ inputs.mode == '' || inputs.mode == 'nightly' }}
+        run: |
+          ./gradlew --refresh-dependencies :serving:dockerDeb -Psnapshot


Can we move this command into the "Build temp docker image" step? I don't think it needs a separate step.
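
One way the merged step could look, as a sketch: the step-level if condition moves into the shell script, so the Gradle command still runs only when the mode is empty or nightly. The MODE variable is introduced here for illustration, and the docker build commands are left out because they are not shown in this thread.

```yaml
# Hypothetical merged step; MODE is an illustrative variable name.
- name: Build temp docker image
  env:
    MODE: ${{ inputs.mode }}
  run: |
    # Build the serving package only for nightly (or unspecified) mode.
    if [ -z "$MODE" ] || [ "$MODE" = "nightly" ]; then
      ./gradlew --refresh-dependencies :serving:dockerDeb -Psnapshot
    fi
    # ... the existing docker build commands for the temp image follow here ...
```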

siddvenk · 2024-12-16T21:55:43Z

.github/workflows/docker-nightly-publish.yml

+          tempRunIdTag="${{ env.AWS_ECR_REPO }}:${{ matrix.arch }}-${{ inputs.mode }}-${GITHUB_RUN_ID}"
+          tempCommitTag="${{ env.AWS_ECR_REPO }}:${{ matrix.arch }}-${{ inputs.mode }}-${GITHUB_SHA}"


Can we make a slight change here: if inputs.mode == 'release', use the DJL_VERSION value instead of "release"?

Or maybe for release, nightly, and tmp alike we always just add the DJL version number to the tag.
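
A sketch of the first suggestion, assuming DJL_VERSION is already exported by an earlier step. The step name and the MODE, ARCH, REPO, and label variables are illustrative and not part of the PR.

```yaml
# Hypothetical step: uses DJL_VERSION in place of the literal "release" in the tag.
- name: Compute temp image tags
  env:
    MODE: ${{ inputs.mode }}
    ARCH: ${{ matrix.arch }}
    REPO: ${{ env.AWS_ECR_REPO }}
  run: |
    # DJL_VERSION is assumed to be set by an earlier step.
    if [ "$MODE" = "release" ]; then
      label="$DJL_VERSION"
    else
      label="$MODE"
    fi
    tempRunIdTag="${REPO}:${ARCH}-${label}-${GITHUB_RUN_ID}"
    tempCommitTag="${REPO}:${ARCH}-${label}-${GITHUB_SHA}"
    # Expose both tags to later steps in the same job.
    echo "tempRunIdTag=${tempRunIdTag}" >> "$GITHUB_ENV"
    echo "tempCommitTag=${tempCommitTag}" >> "$GITHUB_ENV"
```

The second suggestion (always including the version) would replace the if/else with a label such as "${MODE}-${DJL_VERSION}".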

tests/integration/tests.py

Changes were addressed as requested.

HappyAmazonian added 24 commits December 6, 2024 23:13

comment out something doesn't work for fork

33f14f6

use my iam role

39a9050

add checkout back

c8233a3

use my repo

b96628b

remov ecreate runner

85b15e5

fix-tag

ec5bf65

make everything push to ECR

7505dff

add mode in tag

1f17c17

add condition to push

3a74892

remove blank lin

f63a7fe

add call integration workflow

8d43fb3

remove push for testing

2674540

fix push condition

4133b2e

fix repo name

850a69a

change repo for testing in djl

f182852

fix role

6f01ae3

fix neuron image, disable pytest capture

5ace20e

add docker credential

e974d65

change env

8de85fa

fix neuron docker tag

9808a93

add back aarch build

14185c5

fix for PR

9f4a91a

merge

3dde6de

add back docker push

0adabd0

HappyAmazonian requested review from zachgk and a team as code owners December 9, 2024 20:44

HappyAmazonian added 2 commits December 9, 2024 21:04

fix format

7a75c66

add the missing tag step

c40f9f5

Lokiiiiii previously requested changes Dec 10, 2024

.github/workflows/integration.yml (outdated, resolved)

.github/workflows/docker-nightly-publish.yml (outdated, resolved)

HappyAmazonian added 3 commits December 16, 2024 19:27

fix neuron image

03f6e17

fix condition in neuron ut

186e602

fix neuron uri

5d4ac70

HappyAmazonian changed the title ~~[nightly] Trigger Integ test before pushing to dockerhub~~ [nightly] nightly yml now push to ECR repo & integ test can be launched with ECR image Dec 16, 2024

HappyAmazonian added 2 commits December 16, 2024 20:50

fix format

3949dcc

clean

bcd3555

siddvenk reviewed Dec 16, 2024

HappyAmazonian added 14 commits December 16, 2024 23:21

fix based on comment

5a9f70c

update default arch value

f396e74

rebase on other pr

33ab630

test docker publish

74b72fa

fix permission

53b0d1c

use sha

26636d7

improve scripts

468a260

fix for loop

b5eaf03

improve code quality

ea3b518

fix path

0faa058

fix multiple typo

ab7e10b

merge

5761450

use credential only for ubuntu

ebc7f89

enable docker push

cf56d95

HappyAmazonian changed the title ~~[nightly] nightly yml now push to ECR repo & integ test can be launched with ECR image~~ [nightly] Create Nightly Pipeline, make docker-nightly-publish.yml & integration.yml more modular Dec 18, 2024

HappyAmazonian and others added 4 commits December 18, 2024 20:32

fix naming

1d6b0e4

echo

583e8a2

Merge branch 'master' into nightly-integ-remodel

d772470

log image under tests for integration tests

ce98891

siddvenk approved these changes Dec 19, 2024

siddvenk merged commit 3aebeb5 into master Dec 19, 2024
9 checks passed

siddvenk deleted the nightly-integ branch December 19, 2024 19:18
