feat: calculate analysis task graphs in parallel via dask.delayed #1019

Open: lgray wants to merge 12 commits into master from build_taskgraph_in_dask

lgray
Collaborator

@lgray lgray commented Jan 31, 2024

This is a workaround for the large amount of time it can take to calculate all the task graphs for a large and mature analysis.

@lgray lgray changed the title feat: provide a feature to calculate analysis taskgraphs in parallel via dask.delayed feat: calculate analysis task graphs in parallel via dask.delayed Jan 31, 2024
@lgray lgray marked this pull request as draft February 1, 2024 01:22
@lgray lgray force-pushed the build_taskgraph_in_dask branch from d9b7403 to 7af15c0 Compare February 1, 2024 20:19
@lgray lgray marked this pull request as ready for review February 1, 2024 22:26
@lgray lgray requested a review from nsmith- February 1, 2024 22:47
@lgray lgray force-pushed the build_taskgraph_in_dask branch 5 times, most recently from fa1399a to aafec0a Compare February 8, 2024 09:08
@lgray lgray force-pushed the build_taskgraph_in_dask branch 4 times, most recently from 6b9cda8 to b958664 Compare February 17, 2024 16:20
@lgray lgray force-pushed the build_taskgraph_in_dask branch 3 times, most recently from 7ce5da1 to 837b75b Compare February 24, 2024 17:25
@lgray lgray force-pushed the build_taskgraph_in_dask branch 4 times, most recently from 47ae50f to ec87437 Compare March 6, 2024 21:11
@lgray
Collaborator Author

lgray commented Mar 6, 2024

The memory footprint in dask is oddly large right now, even though a single task graph is very manageable in memory for a single thread.

I may switch this to be local-only and use a ProcessPoolExecutor to parallelize the processing.
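For reference, here is a minimal sketch of what the local-only variant could look like. It is not the code in this PR: `build_task_graph` and `build_all_task_graphs` are hypothetical names, and the graph builder is a placeholder that returns a small dict instead of a real task graph. The executor class is a parameter so the same function can be exercised with threads.

```python
from concurrent.futures import ProcessPoolExecutor


def build_task_graph(dataset):
    """Hypothetical stand-in for the expensive per-dataset graph construction."""
    # A real analysis would construct its task graph here; this placeholder
    # returns a small dict so the sketch stays self-contained.
    return {"dataset": dataset, "n_tasks": len(dataset)}


def build_all_task_graphs(datasets, executor_factory=ProcessPoolExecutor, max_workers=None):
    """Build one task graph per dataset, using a pool of local workers."""
    names = list(datasets)
    with executor_factory(max_workers=max_workers) as pool:
        # map() yields results in input order, so they can be zipped back
        # onto the dataset names; the iterator is consumed inside the
        # `with` block, before the pool shuts down.
        graphs = pool.map(build_task_graph, names)
        return dict(zip(names, graphs))
```

With the default `ProcessPoolExecutor`, each worker builds its graph in a separate process and only the finished result is pickled back to the parent. On platforms that spawn worker processes, the call needs to sit under an `if __name__ == "__main__":` guard.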

@lgray lgray force-pushed the build_taskgraph_in_dask branch 3 times, most recently from b16e104 to 4c5ef04 Compare March 13, 2024 03:21
@lgray lgray force-pushed the build_taskgraph_in_dask branch from 4c5ef04 to 8b60862 Compare March 19, 2024 01:50
@lgray lgray force-pushed the build_taskgraph_in_dask branch 2 times, most recently from 9b99b51 to c31e279 Compare April 2, 2024 15:17
@lgray lgray force-pushed the build_taskgraph_in_dask branch from c31e279 to e0ff083 Compare May 8, 2024 16:03