RunTime CRAB vs WMCore
CRAB and WMCore both need a runtime environment for running the user payload on a grid node. The purpose of this short doc is to write down the main design points, reflecting how things are implemented in WMCore in 2022, so that we can derive how to do the same in CRAB and how to share code.
References:
- https://github.com/dmwm/WMCore/wiki/Notes-about-environment-variables-passed-to-the-Scram-environment-or-modified-when-running-the-CMSSW-executable
- https://github.com/dmwm/WMCore/wiki/List-of-PSet-tweaks-applied-by-WMAgent-during-job-runtime
- https://github.com/dmwm/CRABServer/issues/7084
- https://github.com/dmwm/WMCore/issues/10257
- https://github.com/dmwm/WMCore/issues/11104
- Jobs start in a singularity image prepared by the gWMS pilot
  - Jobs use the classAd `REQUIRED_OS` to tell gWMS which image to start
    - this is to decide `rhel6` vs `rhel7` etc., so it is going to be `el8` for all CMSSW_12 and up
  - Jobs use `TARGET_ARCH` to select the "hardware", e.g. x86 vs. ppc vs. arm etc.
  - Jobs will find in the image the env set up by the pilot plus whatever the site deemed useful/needed locally: the job start environment
- Jobs will run the WMA/CRAB wrapper scripts in the start environment AFTER having sourced the correct COMP environment for that image (i.e. OS + arch) from /cvmfs/cms.cern.ch/COMP (the wrapper environment)
- Jobs will run the payload in the start environment AFTER having set up the SCRAM environment, i.e. mimicking what a user would do:
  - start the image
  - source /cvmfs/cms.cern.ch/cmsset_default.sh
  - cmsrel CMSSW_X_Y_Z
  - cmsenv
  - cmsRun -p pset.py -j fjr.xml

  Even if the actual instructions vary (e.g. use of locally defined VO_CMS_SW_DIR, OSG_APP etc.), the setup is assumed to be equivalent
- If creation/manipulation of the pset is needed, it will be done using edm utils which run in an EMPTY environment + SCRAM.
- The stageout script runs in the start environment AFTER having sourced the correct COMP environment, i.e. the same environment as the job wrapper
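
The "mimic what a user would do" sequence above can be sketched as a small helper that assembles the shell lines for a given release. This is only an illustration: `payload_commands` and its parameters are hypothetical names, not WMCore or CRAB code, and the list is as schematic as the steps it mirrors (a real wrapper would also move into the release area and honour locally defined locations such as VO_CMS_SW_DIR).

```python
import shlex


def payload_commands(cmssw_release, pset="pset.py", fjr="fjr.xml",
                     setup_script="/cvmfs/cms.cern.ch/cmsset_default.sh"):
    """Return the shell lines a user would type to run the payload.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    return [
        # make the CMS software setup (and the cmsrel/cmsenv aliases) available
        f"source {shlex.quote(setup_script)}",
        # create the release area
        f"cmsrel {shlex.quote(cmssw_release)}",
        # load the SCRAM runtime environment
        "cmsenv",
        # run the payload, writing the framework job report
        f"cmsRun -p {shlex.quote(pset)} -j {shlex.quote(fjr)}",
    ]


if __name__ == "__main__":
    print("\n".join(payload_commands("CMSSW_X_Y_Z")))
```

Keeping the sequence in one place makes it easier to check that WMA and CRAB really run the same steps.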
- have a common script to source the COMP environment [1]
- have a common way to fork subprocesses in either
  - the EMPTY environment + SCRAM [2]
  - the start environment + SCRAM env. [3]
[1] this needs to be developed (small adaptation of WMCore's submit.sh, see https://github.com/dmwm/WMCore/issues/10257)
[2] currently done using WMCore's `Scram()` with `cleanEnv=True` (the default) - ALL OK
[3a] currently done in WMCore by forking a process in the wrapper environment where the first action is `unset PYTHONPATH` (used for removing WMCore.zip from the python path. It is still not clear if we need to unset the pythonpath as well, since we are doing something similar) and the second is `cmsenv`
[3b] currently done in CRAB by `Scram(cleanEnv=True)` + a few ad-hoc env. vars (like `X509_USER_PROXY`)
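
The approaches in [2], [3a] and [3b] differ only in the environment handed to the child process. Below is a minimal sketch using only the Python standard library. The helper `run_in_env` and its arguments are hypothetical; this is not how WMCore's `Scram()` is implemented, and the SCRAM setup itself is left out.

```python
import os
import subprocess


def run_in_env(command, clean_env=True, drop=("PYTHONPATH",), extra=None):
    """Run a shell command in an empty or an inherited environment.

    Hypothetical helper, for illustration only.
    clean_env=True  -> child starts from an EMPTY environment (as in [2])
    clean_env=False -> child inherits the current environment minus the
                       variables listed in `drop` (similar in spirit to [3a])
    extra           -> ad-hoc variables added on top (as in [3b])
    """
    if clean_env:
        env = {}
    else:
        env = {k: v for k, v in os.environ.items() if k not in drop}
    env.update(extra or {})
    result = subprocess.run(["/bin/sh", "-c", command], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # PYTHONPATH is not visible to the child in either mode
    print(run_in_env('echo "${PYTHONPATH:-unset}"', clean_env=False))
    # an empty environment with one ad-hoc variable passed explicitly
    print(run_in_env('echo "$X509_USER_PROXY"',
                     extra={"X509_USER_PROXY": "/tmp/proxy"}))
```

In a real wrapper the `command` would begin with the SCRAM setup steps. The point here is only that the list of dropped or added variables lives in one argument rather than being scattered through the wrapper code.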
[1] IMHO should be done. Period.
[2] is fine
[3a] I think that `unset PYTHONPATH` is fragile and it would be better to have an 'unset' command for COMP which can be upgraded as needed in the future w/o touching the wrappers' code (a bit like `scram unsetenv`)
[3b] At this point it is clear that it is wrong, and I think that we need to replace it with something like [3a], but I would prefer to have the `unsetenv` also to minimize/eliminate any place in CRAB where we replicate "what WMA does" instead of "using WMA code". I would also much rather have a way to customize the env in `Scram()` than fork a process where I do `unset + cmsrel + cmsenv`, but I need to hear from WMCore developers before proposing changes to Scram, see https://github.com/dmwm/WMCore/blob/eba0a315ed973616357e231976f7092adcb6b2e6/src/python/WMCore/WMRuntime/Tools/Scram.py#L328
- hmmm.. maybe CRAB can use this https://github.com/dmwm/WMCore/blob/eba0a315ed973616357e231976f7092adcb6b2e6/src/python/WMCore/WMRuntime/Tools/Scram.py#L199 ?
Latest changes: https://github.com/dmwm/CRABServer/releases/tag/v3.230220
Current CRAB status:
- (1) CRAB and WMCore can share the script https://github.com/dmwm/CRABServer/blob/master/scripts/submit_env.sh
- We currently run our jobs in "startup env + comp + scram(cleanenv=false)"
- (3) Dario is not sure if this is actually required