VSC Reframe Meeting 2022 03 10
Michele Pugno edited this page Mar 10, 2022 · 2 revisions
- Franky
- Kenneth
- Maxime
- Michele
- Robin
- Sam
- Steven
- folder structure and recursive launch file
- tools availability and version test
- basic environment variable availability and correctness test
- shared FSs test (Not merged yet)
- mpi hello world
- works on: Hydra (VUB), Vaughan, Hortense, BrENIAC?
-
best practices
- create a small readme.md inside each test folder.
-
naming conventions
- tests that require loading specific modules
- difference in module naming
- example: module load Python/3 (or Python3) should work everywhere
- should load latest version in production
- via symlink
- get_module_name function to get module name for OpenFOAM (which could be tweaked per site)
- could give flexibility to run tests with different modules (like MPI tests with a range of toolchains)
- get_module_name function could pick up on environment variables like VSC_TES_SUITE_MPI_MODULE=...
- custom "partition" in ReFrame configuration
- also specifies environment, resource manager, MPI launcher
- wrong module names only trigger warnings (on the stderr stream) in ReFrame, not test failures
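The get_module_name idea above could look roughly like the following. This is only a sketch under assumptions: the `SITE_DEFAULTS` table, the function signature, and the `<prefix><KEY>_MODULE` naming scheme are hypothetical, and the prefix simply reuses the variable name as written in these notes.

```python
import os

# Hypothetical per-site defaults; these module names are placeholders,
# not the names actually installed at any VSC site.
SITE_DEFAULTS = {
    "OpenFOAM": "OpenFOAM",
    "MPI": "OpenMPI",
}


def get_module_name(key, env=None, prefix="VSC_TES_SUITE_"):
    """Return the module to load for `key`, allowing an environment override.

    An override is read from <prefix><KEY>_MODULE, for example
    VSC_TES_SUITE_MPI_MODULE for key "MPI" (assumed naming scheme).
    """
    env = os.environ if env is None else env
    override = env.get(f"{prefix}{key.upper()}_MODULE")
    if override:
        return override
    try:
        return SITE_DEFAULTS[key]
    except KeyError:
        raise KeyError(f"no default module configured for {key!r}") from None
```

With something like this, a site could keep its own defaults while a single run overrides them, for instance to repeat the MPI tests with a range of toolchains.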
-
system tools
- which tools should be available system-wide vs through modules?
- for some tools (git, tmux) more recent versions can be useful
- Singularity should never be installed through modules (create a test that checks Singularity is not installed as a module)
- EasyBuild should always be installed as a module
- Mercurial installed as a module requires python-devel OS package?
- tools installed as modules should be installed with system toolchain (so they can be used along with any other modules)
- no distinction between installed on login node vs workernodes
- should also be clear from CUE working group
- columns for common set of tools:
- list of tools, system-wide vs module (w/ system toolchain), login vs workernode, version req.
- minimal set of software per toolchain generation: Python, ...
- (Kenneth) Apptainer vs Singularity
- Apptainer will eventually replace Singularity in EPEL package repo
- Apptainer will still provide singularity command
- update to Apptainer will require renaming configuration file from /etc/singularity.conf to /etc/apptainer.conf
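The "installed as a module or not" rules above (Singularity never, EasyBuild always) could be checked with a small helper like this one. It is a sketch, not an agreed test: the function name is hypothetical, and it assumes the list of available modules has already been captured as terse, one-entry-per-line text (for example from `module --terse avail`, which many module tools print on stderr).

```python
def modules_named(avail_output, tool):
    """Return module entries in `avail_output` whose name matches `tool`.

    `avail_output` is assumed to be terse one-entry-per-line text.
    Matching is case-insensitive and only looks at the part before "/".
    """
    wanted = tool.lower()
    hits = []
    for line in avail_output.splitlines():
        entry = line.strip()
        # Skip blank lines and directory headers such as "/apps/modules/all:".
        if not entry or entry.endswith(":"):
            continue
        name = entry.split("/", 1)[0].lower()
        if name == wanted:
            hits.append(entry)
    return hits


# Illustrative input only; paths and versions are made up.
sample = "/apps/modules/all:\nEasyBuild/4.5.3\nPython/3.9.5-GCCcore-10.3.0\n"
singularity_as_module = modules_named(sample, "singularity")
easybuild_as_module = modules_named(sample, "EasyBuild")
```

A test would then require `singularity_as_module` to be empty and `easybuild_as_module` to be non-empty.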
-
env var tests
- How should variables like $VSC_DATA_VO be handled?
- Test should have logic to ensure it is set (correctly) when it should be set (like UGent VSC account in non-default VO)
- User based env test according to the site where the user is logged in
- $VSC_DATA_VO should always be set to /does/not/exist for sites that don't use VO concept
- having it set to an incorrect value is better than not defining it (think use of $VSC_DATA_VO in job scripts that are shared across sites)
- $VSC_ARCH_LOCAL: how should this be checked?
- via archspec
- how should archspec be available?
- as a part of ReFrame installation?
- this makes most sense?
- archspec is an optional dependency for ReFrame
- the reframe --detect-host-topology command uses https://github.com/archspec/archspec
- should be easy to change this in ReFrame easyconfig file
- isn't that installed automatically while executing the bootstrap during installation?
- as system-wide installation (part of CUE)?
- as a module (part of CUE)?
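The $VSC_DATA_VO rules discussed above could be expressed as a small check like the one below. This is a sketch under assumptions: the function name and the `site_uses_vo` / `user_in_vo` flags are hypothetical inputs that a real test would have to derive per site and per user, and no check is made for users in the default VO on a site that does use VOs, since the notes do not settle that case.

```python
# Placeholder value proposed in the notes for sites without the VO concept.
PLACEHOLDER = "/does/not/exist"


def check_vsc_data_vo(env, site_uses_vo, user_in_vo=False):
    """Return a list of problems with $VSC_DATA_VO (empty list means OK)."""
    value = env.get("VSC_DATA_VO")
    if value is None:
        # The notes prefer an incorrect value over an undefined variable.
        return ["VSC_DATA_VO is not defined"]
    if not site_uses_vo:
        if value != PLACEHOLDER:
            return [f"expected {PLACEHOLDER!r} on a site without VOs, got {value!r}"]
        return []
    if user_in_vo and value in ("", PLACEHOLDER):
        return ["user is in a non-default VO but VSC_DATA_VO is not a real path"]
    return []
```

Passing the environment as a plain dict keeps the logic easy to exercise without touching the real environment; a test would typically pass `os.environ`.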
-
Stick to ReFrame 3.10.1 VSC-wide
- goal should be to clean up run.sh script to avoid site-specific things in there
- ReFrame installed as a module should be part of CUE
- keep up with ReFrame versions as they are released: pull request to bump ReFrame version in run.sh script to agree on version bump
- the need to set $SBATCH_ACCOUNT on Tier-1 Hortense can be avoided by:
- letting the script that sets up the environment do that automatically (when you only have 1 active project)
- if using environments to describe it inside the config file, be careful to specify correct target systems in order to avoid overlap between environs with the same name (e.g. single-node)
- configuring the environment variable inside a partition in the config file (automatically works for the correct system)
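The partition-level option could look roughly like this in a ReFrame 3.x Python configuration file. It is a sketch, not the agreed VSC configuration: the system name, hostname pattern, launcher, and project name are placeholders, and only the overall layout (a `variables` list on the partition, `target_systems` on the environment) is the point.

```python
# Sketch of a ReFrame 3.x site configuration fragment (values are placeholders).
site_configuration = {
    "systems": [
        {
            "name": "hortense",
            "descr": "Tier-1 Hortense (illustrative entry)",
            "hostnames": ["placeholder-hostname-pattern"],
            "modules_system": "lmod",
            "partitions": [
                {
                    "name": "cpu",
                    "scheduler": "slurm",
                    "launcher": "srun",  # assumption, adjust per site
                    "environs": ["single-node"],
                    # Set on this partition only, so it applies to the
                    # correct system without touching other sites.
                    "variables": [["SBATCH_ACCOUNT", "my_project"]],
                },
            ],
        },
    ],
    "environments": [
        {
            # Restrict the environ to this system to avoid overlap with
            # same-named environs defined for other sites.
            "name": "single-node",
            "target_systems": ["hortense"],
        },
    ],
}
```

Keeping the variable on the partition avoids relying on `target_systems` being set correctly for every same-named environment.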
-
shared FS test is WIP
- PR is open by Robin
- PR is being discussed
- should permissions also be checked?
- maybe first version of this test should only check dir existence, permission check can be added later
- policy for directory permissions should be agreed in CUE?
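A first version that only checks directory existence, as suggested above, could be as small as the following. This is a sketch and not the code in the open PR: the function name is hypothetical, and the list of variables to check is left to the caller.

```python
import os


def missing_dirs(var_names, env=None):
    """Map each variable in `var_names` to a problem description.

    A variable is reported when it is unset or empty, or when its value is
    not an existing directory. Permissions are deliberately not checked.
    """
    env = os.environ if env is None else env
    problems = {}
    for var in var_names:
        path = env.get(var)
        if not path:
            problems[var] = "variable not set"
        elif not os.path.isdir(path):
            problems[var] = f"{path} is not a directory"
    return problems
```

A call such as `missing_dirs(["VSC_HOME", "VSC_DATA", "VSC_SCRATCH"])` returns an empty dict when everything is in place; a permission check could later be added as a separate step once a policy is agreed.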
-
putting test suite in hands of users
- especially relevant for CUE tests
- let users check whether their account adheres to the current environment
-
can CUE working group agree on a common way to check storage quota?
- myquota (GJB) myquota --onlyscratch
- Steven only sees scratch quota on Vaughan with his VSC account
- my_dodrio_quota (Hortense)
-
(Maintainability) tests with parameters have long folder names, can we fix this?
- changing name via attribute of tests is deprecated, will probably no longer be possible in ReFrame 4.x
- reach out to ReFrame developers on this via Slack or community meetup?
-
Use branches or forks?
- open PRs from a branch in your own fork
- allow pushes in your fork from VSC team members
- Finish first iteration over FS test (Robin, Michele)
- Upgrade to ReFrame 3.10.1, better run.sh script, add env var to partitions in the config file. (Everyone)
- Second iteration over env var test (Steven, Samuel)
- Add one app test (e.g. numpy) (Michele, Samuel)
- next meeting: Thu 21st April 2022