
contents for section "Performance aspects of CernVM-FS" #8

Closed
boegel opened this issue Jul 7, 2023 · 5 comments
boegel commented Jul 7, 2023

No description provided.

boegel commented Jul 7, 2023

To assess startup performance, we can look into a single-binary as a base case (HPL?), a typical scientific app (OpenFOAM), and a large Python app (TensorFlow).

For OS jitter, OpenFOAM is a good use case, since it's known to be quite sensitive to OS jitter (if just one core is temporarily busy with something else than running OpenFOAM, the whole multi-node run will be significantly slower).
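The cold-vs-warm startup comparison described above can be sketched as a small timing harness. This is a minimal sketch, not the benchmark actually used; the measured command here is a placeholder to be replaced with an HPL, OpenFOAM, or TensorFlow launch from the CernVM-FS-backed software stack:

```python
import subprocess
import time

def time_startup(cmd, runs=3):
    """Time repeated invocations of a command.

    Against a CernVM-FS-backed install, the first run is a cold start
    (files are fetched into the local client cache) and later runs are warm.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return timings

# Placeholder command; substitute e.g. an HPL, OpenFOAM, or TensorFlow launch.
t = time_startup(["python3", "-c", "pass"])
print(f"cold: {t[0]:.3f}s  best warm: {min(t[1:]):.3f}s")
```

Comparing the first run against the best of the later runs isolates the cost of populating the client cache from the cost of the application itself.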

@HereThereBeDragons

If you don't have a benchmark setup yet, I have some Python scripts, but they need a bit of cleanup first.
Eventually we want to use them in some form for continuous performance testing.
Let me know whether you'd prefer to use those or your own solution.

ocaisa commented Jul 7, 2023

Right now we don't have much; performance at different scales (getting the files into the cache, and then from cache to load) is something we've only just started to look at.

As regards getting data into the cache, we have a bash script that measures the performance of our public Stratum 1 servers: EESSI/eessi-demo#24. We are keen to see what kind of performance CDNs can deliver for us.

From cache to load, we have nothing, so anything you have would be a welcome starting point.
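The Stratum 1 measurement idea above (time fetching a file from a public server) can be sketched in a few lines. This is a hypothetical Python stand-in, not the bash script from eessi-demo#24; the URL in the example is a placeholder:

```python
import time
import urllib.request

def time_fetch(url, timeout=30):
    """Return (seconds, bytes) for one HTTP fetch, e.g. from a Stratum 1."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    return time.perf_counter() - start, len(data)

# Placeholder URL; point this at a file served by one of the public
# Stratum 1 servers (or a CDN in front of them) to compare throughput.
elapsed, size = time_fetch("data:text/plain,hello")
print(f"{size} bytes in {elapsed:.4f}s")
```

Running the same fetch against each Stratum 1 (and against a CDN endpoint) from several client locations gives a rough per-server throughput comparison.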

@HereThereBeDragons

I opened a PR with a cleaned-up version of the performance benchmark I used for CHEP: cvmfs/cvmfs#3372
While it will certainly still change, e.g. to accept command-line arguments, it is already fully functional if you want to run some benchmarking tests now.
All you need to modify are the user params in start_benchmark.py and start_visualization.py (and read the comments there).
All code expects to be run from cvmfs/test/performance-benchmark
workflow:

  1. python3 start_benchmark.py writes ./data/<maybe_some_subdir>/*.csv
  2. python3 start_visualization.py takes ./data/<maybe_some_subdir> and writes ./results/<maybe_some_subdir>/*.pdf
     (you can set the outdir to something else if you want)
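The two-step workflow above can be driven from a small wrapper. A sketch, assuming only the script names and working directory stated in the PR (any further arguments or output paths are configured via the user params inside the scripts):

```python
import subprocess

# Script names come from cvmfs/cvmfs#3372; the working directory is the
# one the benchmark code expects to be run from.
BENCH_DIR = "cvmfs/test/performance-benchmark"

def workflow_commands(workdir=BENCH_DIR):
    """Return the two workflow steps as (cwd, argv) pairs."""
    return [
        (workdir, ["python3", "start_benchmark.py"]),      # writes ./data/.../*.csv
        (workdir, ["python3", "start_visualization.py"]),  # writes ./results/.../*.pdf
    ]

def run_workflow(commands):
    """Run each step in order, stopping on the first failure."""
    for cwd, argv in commands:
        subprocess.run(argv, check=True, cwd=cwd)
```

Separating the command list from the runner makes it easy to slot the same two steps into a CI job later, in line with the continuous performance testing mentioned earlier in the thread.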


boegel commented Nov 20, 2023

I have puzzled together some stats on startup performance of:

Still a work in progress, because not all tests were done with the same Python/TensorFlow versions, but the results look pretty good; it seems they will definitely support the narrative we have in mind for this section.

@boegel boegel closed this as completed Dec 12, 2023