Make LSODE integrator thread-safe in OpenMP parallel loops by adding THREADPRIVATE statements #121

Open · wants to merge 1 commit into dev

Conversation

@yantosca (Contributor)

Overview

This is the companion PR to #118. We have now made the LSODE integrator thread-safe by adding !$OMP THREADPRIVATE statements for the following variables (a minimal sketch of the pattern appears after the list):

  1. RWORK in routine KppLsode (in int/lsode.f90)

     • This was also declared with SAVE to give it static (saved) storage
  2. IWORK in routine KppLsode (in int/lsode.f90)

     • This was also declared with SAVE to give it static (saved) storage
  3. All instances of the common block /DLS001/ throughout int/lsode.f90
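
Here is a minimal sketch of the SAVE + THREADPRIVATE pattern described above (illustrative only; the names mirror the KPP code, but the dimensions are placeholders, not the actual KPP source):

    SUBROUTINE KppLsode_sketch()
      IMPLICIT NONE
      INTEGER, PARAMETER :: dp  = KIND(1.0d0)
      INTEGER, PARAMETER :: LRW = 100, LIW = 40   ! placeholder sizes

      ! SAVE gives the work arrays static storage; THREADPRIVATE then
      ! gives each OpenMP thread its own private copy of that storage.
      REAL(dp) :: RWORK(LRW)
      INTEGER  :: IWORK(LIW)
      SAVE RWORK, IWORK
      !$OMP THREADPRIVATE( RWORK, IWORK )

      ! For a common block, the directive names the block itself:
      !   !$OMP THREADPRIVATE( /DLS001/ )
    END SUBROUTINE KppLsode_sketch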

Validation

I used our fork of the KPP-Standalone and modified the kpp_standalone.F90 driver program to include an OpenMP loop:

    !$OMP PARALLEL DO             &
    !$OMP DEFAULT( SHARED       ) &
    !$OMP PRIVATE( N, TIN, TOUT )
    DO N = 1, 10

       !$OMP CRITICAL
       print*, '### instance ', N
       !$OMP END CRITICAL

       ! Set ENV
       TIN          = T
       TOUT         = T + OperatorTimestep

       ! Set initial concentrations (C) and reaction rates (RCONST)
       ! to values read from the input file
       C            = Cinit
       RCONST       = R

       ! Integrate the mechanism for an operator timestep
       CALL Integrate( TIN, TOUT, ICNTRL, RCNTRL, ISTATUS, RSTATE, IERR )

    ENDDO
    !$OMP END PARALLEL DO

This performs an integration on the same initial conditions 10 times; the C array is overwritten by each integration. Because every iteration uses identical inputs and should produce identical results, any cross-thread interference would show up as differing output, so this is sufficient to test for a parallelization error.

When compiled with OpenMP parallelization turned off, the integrations are done in order from 1..10 and we get this screen output:

$ ./kpp_standalone.exe Beijing_L1_20190701_0040.txt ref.log
 Processing sample: Beijing_L1_20190701_0040.txt
 Output file: ref.log
 ### instance            1
 ### instance            2
 ### instance            3
 ### instance            4
 ### instance            5
 ### instance            6
 ### instance            7
 ### instance            8
 ### instance            9
 ### instance           10
 Number of internal timesteps (from 3D run):    12
 Number of internal timesteps ( standalone):    92
 Hexit (from 3D run):     497.80
 Hexit ( standalone):      80.40
Warning: Number of internal steps do not match 3D grid cell
Warning: final timestep does not match 3D grid cell within 0.1%

Final concentrations are written to the ref.log file. The difference in Hexit (and the warnings above) arises because the initial conditions came from a run that used the Rosenbrock integrator.

Then I recompiled the executable with -fopenmp and got this screen output:

$ ./kpp_standalone.exe Beijing_L1_20190701_0040.txt devmp.log
 Processing sample: Beijing_L1_20190701_0040.txt
 Output file: devmp.log
 ### instance           10
 ### instance            5
 ### instance            6
 ### instance            2
 ### instance            4
 ### instance            3
 ### instance            9
 ### instance            7
 ### instance            1
 ### instance            8
 Number of internal timesteps (from 3D run):    12
 Number of internal timesteps ( standalone):    92
 Hexit (from 3D run):     497.80
 Hexit ( standalone):      80.40
Warning: Number of internal steps do not match 3D grid cell
Warning: final timestep does not match 3D grid cell within 0.1%

Here you can see that the instances execute in non-sequential order, confirming that the parallel loop is active. We get the same Hexit as in the run with parallelization turned off, and the output in devmp.log was identical to ref.log, as verified with a diff command (illustrated below).
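
The PR does not show the exact command, but the check would look something like this:

    $ diff ref.log devmp.log
    $

No output from diff means the two log files are identical.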

@yantosca added the labels integrators (Related to numerical integrators) and bugfix (Fixes a bug or a technical issue) on Dec 20, 2024
@yantosca added this to the 3.2.0 milestone on Dec 20, 2024
@yantosca self-assigned this on Dec 20, 2024
@RolfSander (Contributor) left a comment:

Thanks for asking me to review this. However, I don't have much experience with OpenMP and "thread-safe" coding, so I will not be able to contribute much here.

@yantosca linked an issue on Dec 20, 2024 that may be closed by this pull request

@yantosca (Contributor, Author)

I spoke too soon. Compiling with GNU 12.2.0 resulted in this error:

[ 68%] Building Fortran object src/GEOS-Chem/KPP/fullchem/CMakeFiles/KPP.dir/gckpp_Integrator.F90.o
/n/netscratch/jacob_lab/Lab/ryantosca/tests/kpptests/lsode/test_lsode/CodeDir/src/GEOS-Chem/KPP/fullchem/gckpp_Integrator.F90:254:27:

  254 |       INTEGER :: IWORK(LIW), IPAR(1), ITOL, ITASK,         &
      |                           ^
Error: ‘iwork’ causes a section type conflict with ‘rwork’
/n/netscratch/jacob_lab/Lab/ryantosca/tests/kpptests/lsode/test_lsode/CodeDir/src/GEOS-Chem/KPP/fullchem/gckpp_Integrator.F90:253:33:

  253 |       REAL(kind=dp) :: RWORK(LRW), RPAR(1)
      |                                 ^
note: ‘rwork’ was declared here

I will look into this further.

@jimmielin (Member) left a comment:

Thanks @yantosca for these changes (and the helpful comments)! I am not sure about the section type conflict error or what could be causing it, so unfortunately I cannot help on that.

@yantosca (Contributor, Author) commented on Jan 2, 2025

@jimmielin: I wasn't able to make much headway with the section-type error. I decided to replace the COMMON /DLS001/ block with a derived-type object that is declared !$OMP THREADPRIVATE (sketched below). I'm still shaking down some more issues, but at least this approach doesn't incur the section-type error. I will request a re-review once I get it working.
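
For illustration, a minimal sketch of that derived-type approach (the module name is hypothetical, and only a few representative /DLS001/ members are shown):

    MODULE lsode_state_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: dp = KIND(1.0d0)

      ! Bundle the former COMMON /DLS001/ variables into one derived type
      TYPE :: DLS001_t
         REAL(dp) :: H   = 0.0_dp   ! current step size
         REAL(dp) :: TN  = 0.0_dp   ! current value of the independent variable
         INTEGER  :: NST = 0        ! number of integration steps taken
      END TYPE DLS001_t

      ! One module-level instance; THREADPRIVATE gives each OpenMP
      ! thread its own copy, replacing the shared common block
      TYPE(DLS001_t) :: DLS
      !$OMP THREADPRIVATE( DLS )
    END MODULE lsode_state_sketch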

@yantosca (Contributor, Author) commented on Jan 7, 2025

@jimmielin @RolfSander: I have been coming up against a wall trying to resolve the Gfortran "section type conflict" error. I've tried removing the COMMON /DLS001/ block and moving those variables into internal modules to make them threadprivate, but that hasn't solved it.

I think the problem might be that the RWORK array is too large for the per-thread storage where THREADPRIVATE data must live. LSODE uses a pretty large dimension for RWORK:

KPP/int/lsode.f90

Lines 217 to 221 in cd03fe0

    INTEGER, PARAMETER :: LRW = 25 + 9*NVAR + 2*NVAR*NVAR, &
                          LIW = 32 + NVAR
    KPP_REAL           :: RWORK(LRW), RPAR(1)
    INTEGER            :: IWORK(LIW), IPAR(1), ITOL, ITASK, &
                          IERR, IOPT, MF

For the GEOS-Chem mechanism NVAR is ~320, so LRW would be ~207,705 elements, which at 8 bytes per element is about 1.66 MB of RWORK per thread.
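
A quick back-of-the-envelope check (illustrative only; assumes 8-byte reals):

    PROGRAM lrw_size_check
      IMPLICIT NONE
      INTEGER, PARAMETER :: NVAR = 320   ! approximate GEOS-Chem mechanism size
      INTEGER, PARAMETER :: LRW  = 25 + 9*NVAR + 2*NVAR*NVAR
      PRINT *, 'LRW (elements)     :', LRW     ! prints 207705
      PRINT *, 'RWORK size (bytes) :', 8*LRW   ! prints 1661640, i.e. ~1.66 MB per thread
    END PROGRAM lrw_size_check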

I may need to put aside this PR for the time being. @jimmielin, do you have any other ideas I could try?

@yantosca (Contributor, Author) commented on Jan 7, 2025

@jimmielin: I should say that LSODE fails when I run it in GEOS-Chem Classic but not in the standalone box model; I think that is due to memory limits. I can try to replicate the issue in the box model by increasing the number of iterations.

Commit message (for the force-push below):

int/lsode.f90
- Removed obsolete SDIRK coefficients
- Declared LIW, LRW as global parameters
- Now declare RWORK and IWORK as local arrays in routine DLSODE
- Added input and output arguments to DLSODE so as to
  remove input/output values from RWORK and IWORK
- Added THREADPRIVATE statements to COMMON /DLS001/ for OpenMP
- Trimmed trailing whitespace

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca force-pushed the bugfix/add-threadprivates-to-lsode branch from 2053776 to f6060fa on January 9, 2025, 16:49
Successfully merging this pull request may close this issue:

[BUG/ISSUE] LSODE integrator is not thread-safe