Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added experimental background job queue #20985

Merged
merged 37 commits into from
Nov 4, 2024
Merged

Added experimental background job queue #20985

merged 37 commits into from
Nov 4, 2024

Conversation

9larsons
Copy link
Contributor

@9larsons 9larsons commented Sep 12, 2024

ref https://linear.app/tryghost/issue/ENG-1556/

  • added background job queue behind config flags
  • when enabled, is only used for the member email analytics updates in order to speed up the parent job, and take load off of the main process that is serving requests

Added config flags for the job queue to make it switchable

config options are in services:jobs:queue and are 'enabled,reportStats,reportInterval,maxWorkers,logLevel(info/debug),pollMinInterval,pollMaxInterval,queueCapacity,fetchCount.
Copy link
Contributor

github-actions bot commented Sep 12, 2024

It looks like this PR contains a migration 👀
Here's the checklist for reviewing migrations:

General requirements

  • Satisfies idempotency requirement (both up() and down())
  • Does not reference models
  • Filename is in the correct format (and correctly ordered)
  • Targets the next minor version
  • All code paths have appropriate log messages
  • Uses the correct utils
  • Contains a minimal changeset
  • Does not mix DDL/DML operations
  • Tested in MySQL and SQLite

Schema changes

  • Both schema change and related migration have been implemented
  • For index changes: has been performance tested for large tables
  • For new tables/columns: fields use the appropriate predefined field lengths
  • For new tables/columns: field names follow the appropriate conventions
  • Does not drop a non-alpha table outside of a major version

Data changes

  • Mass updates/inserts are batched appropriately
  • Does not loop over large tables/datasets
  • Defends against missing or invalid data
  • For settings updates: follows the appropriate guidelines

@github-actions github-actions bot added the migration [pull request] Includes migration for review label Sep 12, 2024
This is slightly less efficient - potentially - because we do a select before inserting instead of ignoring the insert conflict. It's likely less efficient because I don't anticipate a ton of duplicates, although the analytics job can certainly result in that, which is where I'd expect the knex implementation to slightly win out.

Regardless, testing is a fucking nightmare with knex directly as we have to spin up a db and use a schema for the table. Let's go down that path later if we need the performance improvements.
@9larsons
Copy link
Contributor Author

Need to add a couple integration tests then we ought to be ok. Not sure what the failure for Ghost-CLI is.

@vikaspotluri123
Copy link
Member

It looks like there's an error that's preventing the persisted logs from being printed 🙃 Here's the culprint:

Ghost was able to start, but errored during boot with: Cannot find module 'workerpool'

Looking at the lockfile changes, I'm not sure if they're intentional - there are a lot of new dependencies, which I think might stem from an older version of @tryghost/errors
Also, the real issue is that it looks like workerpool wasn't added as a dependency?

@9larsons
Copy link
Contributor Author

Ok, clearly there's some kind of race condition with the config mocks and initializing the service. I will have to look into that when I can reprioritize this.

It appears testUtils.setup MUST be called as before(testUtils.setup()); and doesn't work as an anonymous function with any consistency.
This was failing locally due to the local test db (in /tmp/ghost-test.db) missing migration columns. Not sure why. It could be due to version mismatch at this point as the migration needs to be updated before merging this. Blowing away that file and then re-running init'd a table with the correct schema.
@9larsons
Copy link
Contributor Author

@vikaspotluri123 Hey, it seems like the ghost-cli tests are hanging on restart with this. Do you have any suggestions on how to repro/troubleshoot that? Only thing I can guess is that the job queue service is hanging. It doesn't do that to me locally.

@9larsons
Copy link
Contributor Author

Note to self: need to update migration to new version before merging.

@vikaspotluri123
Copy link
Member

It took me a minute, but taking a look 👀

https://github.com/vikaspotluri123/Ghost/actions/runs/11393243539/job/31701108553?pr=14

@vikaspotluri123
Copy link
Member

vikaspotluri123 commented Oct 18, 2024

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index eec2c09aeda..d68f13ce3a2 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -861,7 +861,8 @@ jobs:
 
       - name: Update from v4
         run: |
-          ghost update -f -d $V4_DIR --archive $(pwd)/ghost/core/ghost.tgz
+          ghost update -f -d $V4_DIR --archive $(pwd)/ghost/core/ghost.tgz --no-restart
+          ghost run -d $V4_DIR -D
 
       - name: Save Ghost CLI Debug Logs
         if: failure()
@@ -879,7 +880,8 @@ jobs:
         run: |
           DIR=$(mktemp -d)
           ghost install local -d $DIR
-          ghost update -d $DIR --archive $(pwd)/ghost/core/ghost.tgz
+          ghost update -d $DIR --archive $(pwd)/ghost/core/ghost.tgz --no-restart
+          ghost run -d $DIR -D
 
       - name: Print debug logs
         if: failure()

I hacked the job like that and found this

[2024-10-18 00:32:03] ERROR Unhandled rejection: insert into `jobs` (`created_at`, `finished_at`, `id`, `metadata`, `name`, `queue_entry`, `started_at`, `status`, `updated_at`) values ('2024-10-18 00:32:03', NULL, '6711ac8389d4880ed0fcad4e', NULL, 'members-migrations', NULL, NULL, 'queued', '2024-10-18 00:32:03') - SQLITE_ERROR: table jobs has no column named metadata

insert into `jobs` (`created_at`, `finished_at`, `id`, `metadata`, `name`, `queue_entry`, `started_at`, `status`, `updated_at`) values ('2024-10-18 00:32:03', NULL, '6711ac8389d4880ed0fcad4e', NULL, 'members-migrations', NULL, NULL, 'queued', '2024-10-18 00:32:03') - SQLITE_ERROR: table jobs has no column named metadata
Error Code: 
    SQLITE_ERROR
----------------------------------------
Error: insert into `jobs` (`created_at`, `finished_at`, `id`, `metadata`, `name`, `queue_entry`, `started_at`, `status`, `updated_at`) values ('2024-10-18 00:32:03', NULL, '6711ac8389d4880ed0fcad4e', NULL, 'members-migrations', NULL, NULL, 'queued', '2024-10-18 00:32:03') - SQLITE_ERROR: table jobs has no column named metadata

[2024-10-18 00:32:03] INFO Ghost URL Service Ready in 3.509s

@9larsons
Copy link
Contributor Author

Ah awesome, thank you. I suppose that's a good thing! I knew the migration needed updating but it's nice to know that it'll fail in CI. Should've thought of that.

@9larsons 9larsons requested a review from cmraible October 31, 2024 15:43
Copy link
Collaborator

@cmraible cmraible left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migration looks good to me! Have some more general thoughts which I'll share in slack, but definitely nothing blocking — looking forward to trying this out in staging 🚀

@9larsons 9larsons merged commit 88db66a into main Nov 4, 2024
21 checks passed
@9larsons 9larsons deleted the add-job-queue branch November 4, 2024 16:13
tilak999 pushed a commit to tilak999/ghost that referenced this pull request Nov 20, 2024
ref https://linear.app/tryghost/issue/ENG-1556/
- added background job queue behind config flags
- when enabled, is only used for the member email analytics updates in
order to speed up the parent job, and take load off of the main process
that is serving requests

The intent here is to decouple certain code paths from the main process where it is unnecessary, or worse, where it's part of the request. Primary use cases are email analytics (particularly the member stats [open rate]) which are not particularly helpful in the period immediately following an email send, while the click traffic and delivered/opened events are.

Related, the email link clicks themselves send off a cascade of events that are quite a burden on the main process currently and are somewhat tied to the request response when they needn't be. We'll be looking to tackle that after some initial testing with the email analytics job.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
migration [pull request] Includes migration for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants