enhance LAMMPS easyblock dynamically add ARMV81 and A64FX to Kokkos CPU mapping based on LAMMPS version + fix installation of Python bindings for LAMMPS >= 2Aug2023 + fix sanity check by doing MPI_Finalize #3036

Merged 27 commits on Feb 14, 2024

@laraPPr laraPPr commented Nov 20, 2023

No description provided.

@laraPPr laraPPr marked this pull request as draft November 22, 2023 08:45
@@ -118,6 +118,8 @@
     'zen2': 'ZEN2',
     'zen3': 'ZEN3',
     'power9le': 'POWER9',
+    'neoverse_n1': 'ARMV81',
+    'neoverse_v1': 'ARMV81',
Member

@laraPPr Did you check whether more recent LAMMPS versions supported better targets for neoverse_v1?
If so, we can refine this for more recent LAMMPS versions (or use a better target by default, and fall back to ARMV81 for older LAMMPS versions).

Should we also include A64FX here (which archspec should recognize as such by producing a64fx, see https://github.com/archspec/archspec-json/blob/d844bb36b21dfb9ff5d04727edfc08f592fc06af/cpu/microarchitectures.json#L2716)?

Contributor Author

Added the most recent mappings but they do not seem to include better targets for neoverse_v1

Member

@ocaisa ocaisa Jan 2, 2024

We are in a tricky situation here as you also need to update the KOKKOS_CPU_MAPPING and KOKKOS_GPU_ARCH_TABLE to match up the new architectures introduced but this has to be done in a backwards compatible way. Probably we need to allow that the values of these dictionaries may be dictionaries themselves with version keys and the appropriate return values.
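
One way to read this suggestion, as a minimal sketch rather than what the easyblock necessarily ended up doing: a mapping value is either a plain target name (valid for all versions) or a dictionary keyed by the minimum LAMMPS version. The names `lammps_date`, `EXAMPLE_CPU_MAPPING` and `resolve_kokkos_arch` are hypothetical and only serve the illustration; the `29Sep2021` threshold for A64FX is the one quoted later in this thread.

```python
import re

MONTHS = {name: idx for idx, name in enumerate(
    ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], start=1)}


def lammps_date(version):
    """Parse a 'DDMonYYYY' LAMMPS version into a sortable (year, month, day) tuple.

    Hypothetical helper, not part of EasyBuild.
    """
    match = re.fullmatch(r'(\d{1,2})([A-Za-z]{3})(\d{4})', version)
    if match is None:
        raise ValueError("Unexpected LAMMPS version format: %s" % version)
    day, month, year = match.groups()
    return (int(year), MONTHS[month.capitalize()], int(day))


# Illustrative contents only: a value is either a target name,
# or a dict of {minimum LAMMPS version: target name}.
EXAMPLE_CPU_MAPPING = {
    'zen2': 'ZEN2',
    'a64fx': {'29Sep2021': 'A64FX'},
}


def resolve_kokkos_arch(mapping, cpu_arch, lammps_version):
    """Return the Kokkos target for cpu_arch, or None if there is none for this version."""
    value = mapping.get(cpu_arch)
    if value is None or isinstance(value, str):
        return value
    current = lammps_date(lammps_version)
    # keep only entries whose minimum version is not newer than the current version
    candidates = [(lammps_date(min_ver), target)
                  for min_ver, target in value.items()
                  if lammps_date(min_ver) <= current]
    if not candidates:
        return None
    # the entry with the most recent minimum version wins
    return max(candidates, key=lambda item: item[0])[1]
```

With this shape, plain string values keep working unchanged for older LAMMPS versions, which is the backwards-compatibility concern raised above.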

Contributor Author

Should I then test this easyblock with older easyconfigs of LAMMPS to see if anything is broken?

Member

Or update the "constants" based on the LAMMPS version in the easyblock constructor?

Maybe using constants was just wrong here, and it needs to be done via instance attributes like self.kokkos_cpu_mapping instead?

Member

We could start doing something like this to update the mapping for particular versions of LAMMPS:

    def update_kokkos_cpu_mapping(self):
        """Add Kokkos CPU targets that are only supported by more recent LAMMPS versions."""
        if LooseVersion(self.cur_version) >= LooseVersion(translate_lammps_version('2Aug2023')):
            self.kokkos_cpu_mapping['a64fx'] = 'A64FX'

    def __init__(self, *args, **kwargs):
        """LAMMPS easyblock constructor: determine whether we should build with CUDA support enabled."""
        super(EB_LAMMPS, self).__init__(*args, **kwargs)

        # work on a per-instance copy, so the KOKKOS_CPU_MAPPING constant itself is never modified
        self.kokkos_cpu_mapping = copy.deepcopy(KOKKOS_CPU_MAPPING)
        self.update_kokkos_cpu_mapping()

and keep KOKKOS_CPU_MAPPING as is, assuming that it's correct for all LAMMPS versions we care about

ARMV81 is supported since stable_29Oct2020 it seems, see lammps/lammps@60864e3

Contributor Author

ARMv80, ARMv81, ARMv8-ThunderX are actually supported since stable_31Mar2017, see lammps/lammps@a9f0b7d

Contributor Author

@laraPPr laraPPr Jan 9, 2024

ARMv8-TX2 is available since stable_29Sep2021 (see lammps/lammps@39786b1) and A64FX is also available since stable_29Sep2021 (see lammps/lammps@eea14c5)
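
Putting the availability dates quoted in this thread together with the per-instance mapping idea, a rough sketch could look as follows. It is not the code that was merged: `LammpsBlockSketch`, `BASE_KOKKOS_CPU_MAPPING`, `VERSIONED_ADDITIONS` and `sortable` are hypothetical names, and the version comparison is a simplified stand-in for what the easyblock does with `LooseVersion` and `translate_lammps_version`.

```python
import copy

MONTH_NAMES = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')


def sortable(version):
    """Turn 'DDMonYYYY' (e.g. '29Sep2021') into a (year, month, day) tuple (hypothetical helper)."""
    day, month, year = version[:-7], version[-7:-4], version[-4:]
    return (int(year), MONTH_NAMES.index(month) + 1, int(day))


# Entries assumed to be valid for every LAMMPS version of interest (illustrative subset).
BASE_KOKKOS_CPU_MAPPING = {
    'zen2': 'ZEN2',
    'zen3': 'ZEN3',
    'power9le': 'POWER9',
}

# (minimum LAMMPS version, entries to add); the dates are the ones quoted in this thread.
VERSIONED_ADDITIONS = [
    ('31Mar2017', {'neoverse_n1': 'ARMV81', 'neoverse_v1': 'ARMV81'}),
    ('29Sep2021', {'a64fx': 'A64FX'}),
]


class LammpsBlockSketch:
    """Stand-in for the easyblock class, reduced to the mapping logic."""

    def __init__(self, version):
        self.version = version
        # copy first, so the module-level constant is never modified
        self.kokkos_cpu_mapping = copy.deepcopy(BASE_KOKKOS_CPU_MAPPING)
        self.update_kokkos_cpu_mapping()

    def update_kokkos_cpu_mapping(self):
        """Add the targets that the given LAMMPS version is new enough to support."""
        for min_version, additions in VERSIONED_ADDITIONS:
            if sortable(self.version) >= sortable(min_version):
                self.kokkos_cpu_mapping.update(additions)
```

For example, `LammpsBlockSketch('3Mar2020').kokkos_cpu_mapping` would contain the `ARMV81` entries but no `a64fx` key, while an instance for `2Aug2023` would contain both.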

@laraPPr laraPPr marked this pull request as draft January 9, 2024 15:32
@laraPPr laraPPr marked this pull request as ready for review January 9, 2024 16:25
@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

  • SUCCESS kim-api-2.3.0-GCC-11.2.0.eb
  • SUCCESS LAMMPS-23Jun2022-foss-2021b-kokkos.eb

Build succeeded for 2 out of 2 (1 easyconfigs in total)
bear-pg0105u03b - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/87202567e8ce317119efd5735d2a163b for a full test report.

@laraPPr
Contributor Author

laraPPr commented Feb 12, 2024

Test report by @laraPPr

Overview of tested easyconfigs (in order)

  • SUCCESS LAMMPS-23Jun2022-foss-2022a-kokkos.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
node3563.doduo.os - Linux RHEL 8.8 (Ootpa), x86_64, AMD EPYC 7552 48-Core Processor, Python 3.11.3
See https://gist.github.com/laraPPr/9a623b104dffe736d1da5cf463ef1838 for a full test report.

@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

  • SUCCESS archspec-0.1.4-GCCcore-11.3.0.eb
  • SUCCESS Voro++-0.4.6-GCCcore-11.3.0.eb
  • SUCCESS kim-api-2.3.0-GCC-11.3.0.eb
  • SUCCESS ScaFaCoS-1.0.4-foss-2022a.eb
  • SUCCESS LAMMPS-23Jun2022-foss-2022a-kokkos.eb

Build succeeded for 5 out of 5 (1 easyconfigs in total)
bear-pg0105u03b - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/2d13eaa5594bc56269fccb1002619d0d for a full test report.

@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0211u08b.bear.cluster - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz (cascadelake), Python 3.6.8
See https://gist.github.com/branfosj/61aefaa5805b043368e3fe9d28d36578 for a full test report.

@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

  • SUCCESS LAMMPS-23Jun2022-foss-2021a-kokkos-CUDA-11.3.1.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0208u03a - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-40GB, 520.61.05, Python 3.6.8
See https://gist.github.com/branfosj/8ad56a5400670cb28c155521d125f773 for a full test report.

@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

  • SUCCESS LAMMPS-23Jun2022-foss-2021b-kokkos-CUDA-11.4.1.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0208u25a - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-40GB, 520.61.05, Python 3.6.8
See https://gist.github.com/branfosj/f6a37ec2c9dc517b2509aac1650147b4 for a full test report.

@laraPPr
Contributor Author

laraPPr commented Feb 12, 2024

Test report by @laraPPr

Overview of tested easyconfigs (in order)

  • SUCCESS LAMMPS-3Mar2020-foss-2020a-Python-3.8.2-kokkos.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
node3511.doduo.os - Linux RHEL 8.8 (Ootpa), x86_64, AMD EPYC 7552 48-Core Processor, Python 3.11.3
See https://gist.github.com/laraPPr/89b891dba96524228afec77ecc460018 for a full test report.

@easybuilders easybuilders deleted a comment from laraPPr Feb 12, 2024
Member

@ocaisa ocaisa left a comment

LGTM, some minor edits that don't affect any of the testing to date

@laraPPr
Contributor Author

laraPPr commented Feb 12, 2024

Just tested on ARM but it failed, see:

The arch detection was done correctly, so the changes made to the easyblock are not causing the failure.

@laraPPr
Contributor Author

laraPPr commented Feb 14, 2024

Test report by @laraPPr

Overview of tested easyconfigs (in order)

  • SUCCESS LAMMPS-3Mar2020-foss-2020a-Python-3.8.2-kokkos.eb
  • SUCCESS LAMMPS-23Jun2022-foss-2022a-kokkos.eb
  • SUCCESS LAMMPS-23Jun2022-foss-2021a-kokkos.eb
  • SUCCESS LAMMPS-23Jun2022-foss-2021b-kokkos.eb

Build succeeded for 4 out of 4 (4 easyconfigs in total)
node4010.donphan.os - Linux RHEL 8.8 (Ootpa), x86_64, Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz, 1 x NVIDIA NVIDIA A2, 535.154.05, Python 3.11.3
See https://gist.github.com/laraPPr/0c0c88adb4b2c6f368879b65a8d18d11 for a full test report.

Member

@ocaisa ocaisa left a comment

LGTM

@ocaisa ocaisa dismissed boegel’s stale review February 14, 2024 15:30

Changes requested are included

@ocaisa ocaisa merged commit 326cd1a into easybuilders:develop Feb 14, 2024
47 checks passed
@ocaisa
Member

ocaisa commented Feb 14, 2024

This took a while, thanks for your patience @laraPPr !

@boegel boegel changed the title from "enhance LAMMPS easyblock dynamically add ARMV81 and A64FX to Kokkos CPU mapping based on LAMMPS version + fix installation of Python bindings for LAMMPS >= 2Aug2023" to "enhance LAMMPS easyblock dynamically add ARMV81 and A64FX to Kokkos CPU mapping based on LAMMPS version + fix installation of Python bindings for LAMMPS >= 2Aug2023 + fix sanity check by doing MPI_Finalize" on Feb 14, 2024
laraPPr pushed a commit to laraPPr/software-layer that referenced this pull request Feb 15, 2024
trz42 pushed a commit to trz42/software-layer that referenced this pull request Apr 5, 2024