llext: Fix off-by-one in RISC-V truncation check #83198

WorldofJARcraft · 2024-12-19T09:00:44Z

The RISC-V architecture-specific relocation code needs to check whether each relocation value fits into the immediate of the instruction being modified. All immediates in RISC-V are encoded as two's complement. The current truncation check has an off-by-one error at the negative end of the range: two's complement encoding can represent a negative value whose magnitude is the maximum positive value plus one, but the check rejects that value, causing LLEXT to refuse to load valid code.
This commit adds a condition to the check so that this value is accepted.

For a concrete example of the problem, consider the CB-type instruction format. It has an 8-bit immediate with an implicit LSB of 0. The current code accepts relocations for CB-type instructions within a range of [-255,+255] bytes from the instruction's location, relying on standard RISC-V alignment requirements to ensure that the LSB is 0.
However, while the maximum positive jump distance of 255 is correct, the maximum negative jump distance for a CB-type instruction is actually -256 bytes. Here is an example in a RISC-V assembler for a jump with a -256 byte offset: https://luplab.gitlab.io/rvcodecjs/#q=c.beqz+a0,+-256&abi=false&isa=AUTO
Here is an example of trying to jump +256 bytes (which is interpreted as a -256 byte jump): https://luplab.gitlab.io/rvcodecjs/#q=c.beqz+a0,+256&abi=false&isa=AUTO
The RISC-V ISA manual states (somewhat misleadingly in my opinion) that the CB-type instruction can perform jumps in a [-256,+256] byte range: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-a8395e7-2024-12-16/riscv-privileged.pdf, page 167. I have raised a separate issue for the ISA manual: riscv/riscv-isa-manual#1781
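
To make the off-by-one concrete, here is a minimal sketch in C of the two kinds of range check described above. The function names and the `max_positive` parameter are hypothetical and chosen for illustration; this is not the code in Zephyr's RISC-V relocation handler, only a model of a symmetric check versus a two's complement check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; these are not the functions
 * used in Zephyr's RISC-V relocation code.
 */

/* Symmetric check: accepts [-max_positive, +max_positive] only. */
bool fits_symmetric(int64_t offset, int64_t max_positive)
{
	return offset >= -max_positive && offset <= max_positive;
}

/* Two's complement check: also accepts -(max_positive + 1). */
bool fits_twos_complement(int64_t offset, int64_t max_positive)
{
	return offset >= -(max_positive + 1) && offset <= max_positive;
}

/* Expected behaviour at the CB-type boundaries quoted above. */
void check_cb_boundaries(void)
{
	const int64_t cb_max = 255;

	assert(!fits_symmetric(-256, cb_max));       /* valid jump rejected */
	assert(fits_twos_complement(-256, cb_max));  /* valid jump accepted */
	assert(!fits_twos_complement(-257, cb_max)); /* still out of range */
	assert(!fits_twos_complement(256, cb_max));  /* still out of range */
}
```

With the CB-type bound of 255, the symmetric variant rejects an offset of -256 while the two's complement variant accepts it; both still reject -257 and +256.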

teburd

The change looks great. It would be a good value add here to have a test validating this edge case, which clearly got tripped on.

WorldofJARcraft · 2024-12-20T15:27:18Z

I will add a second commit with a test for the error condition, using a cb-type instruction as an example.

WorldofJARcraft · 2024-12-20T16:28:06Z

@teburd a test that uses both the maximum positive and negative values of the immediate has been added in a separate commit.
I have tested manually that RISC-V tests succeed when the first commit in this PR has been applied and fail when it has not been applied. The error message is the one I was expecting: the llext could not be relocated.

teburd · 2024-12-20T18:05:54Z

Small formatting issue, but the tests are much appreciated by me

teburd

Approved, though I expect a small force push will cause some churn again. Happy to reapprove again if needed

The RISC-V architecture-specific relocations need to check whether each required relocation can fit into the modified instruction's immediate. All immediates in RISC-V are encoded as two's complement. The current truncation check has an off-by-one error for checking the maximum negative distance, as two's complement encoding can represent a negative value that is the maximum positive value plus one, causing LLEXT to refuse loading valid code.

This commit adds an additional condition to the check that fixes the aforementioned issue.

Signed-off-by: Eric Ackermann <[email protected]>

WorldofJARcraft · 2025-01-06T07:46:48Z

This branch has been rebased to main; I have tested again that undoing the fix in the first commit causes tests to fail in the expected manner.

lyakh · 2025-01-06T08:52:07Z

tests/subsys/llext/src/riscv_edge_case_cb_type_trigger.S

+
+    /*
+     * 2+4 bytes for _backward_jump_target itself
+     * need to insert 122 padding bytes


is this correct? The comment says 122 bytes, but the code in line 34 seems to be adding 256 bytes?

You are correct, the comment is out-of-date, it should state that 252 bytes of padding are needed. The c.addi and ret instructions are 2 bytes each, and we need to have exactly 256 bytes between _do_jump and _backward_jump_target to trigger the edge case.
I will update this and also change ret to a c.jr compressed jump to make this more obvious.

pillo79

One minor clarification - otherwise LGTM!

pillo79 · 2025-01-07T11:46:07Z

tests/subsys/llext/src/riscv_edge_case_cb_type_trigger.S

+     * we pad with 0xff here, which at the time of writing is an invalid opcode and causes
+     * an exception should it ever be executed
+     */
+    .fill 126, 2, 0xff


This will actually fill the file with sequences of 0x00 0xff. Is this intentional?

You are correct, this is not what I intended.
On second thought, my padding with invalid instructions to detect silent failures with incorrect relocation is not really future-proof - there is no guarantee that the padding will remain an invalid instruction forever.
Thus, I have changed the padding to use 2-byte return instructions instead and added padding for both possible jump directions, clearly and obviously causing a test failure if the relocation is not exactly correct (too short or in the wrong direction).
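
For readers unfamiliar with the directive discussed here, the following C sketch models what `.fill repeat, size, value` emits. It is a simplified, hypothetical model that assumes a little-endian target and a `size` of at most 4 bytes; `emulate_fill` is an illustrative name and not part of any toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified, hypothetical model of `.fill repeat, size, value` for a
 * little-endian target. Each of the `repeat` units is `size` bytes wide
 * and holds `value`, least significant byte first.
 * Returns the number of bytes written to `out`.
 */
size_t emulate_fill(uint8_t *out, size_t repeat, size_t size, uint32_t value)
{
	size_t n = 0;

	assert(size <= 4); /* this sketch only models units up to 32 bits */

	for (size_t r = 0; r < repeat; r++) {
		for (size_t b = 0; b < size; b++) {
			out[n++] = (uint8_t)(value >> (8 * b));
		}
	}
	return n;
}
```

Under this model, `emulate_fill(buf, 126, 2, 0xff)` writes 252 bytes in which 0xff and 0x00 alternate, because each 2-byte unit holds the value 0x00ff. Filling every byte with 0xff would correspond to `emulate_fill(buf, 252, 1, 0xff)`.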

All immediates in RISC-V are encoded as two's complement. This commit adds a test for relocating jumps that utilize the full range of the immediate, in both positive and negative direction. To this end, the test uses the compressed b-type (CB) instruction to branch to its maximum negative (-256) and maximum positive (+254) targets. In case of test failure, expect relocating the corresponding llext to fail.

Signed-off-by: Eric Ackermann <[email protected]>
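
As an illustration of why -256 and +254 are the extreme targets, here is a sketch in C of how a `c.beqz` branch offset is packed into the CB format (funct3 in bits 15:13, offset[8|4:3] in bits 12:10, rs1' in bits 9:7, offset[7:6|2:1|5] in bits 6:2, op in bits 1:0). The function names are hypothetical, and this is not the relocation code or the test from this PR.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative encoder for c.beqz (CB format). Names are hypothetical.
 * rs1p is the 3-bit compressed register number (x8..x15 map to 0..7).
 */
uint16_t encode_c_beqz(int32_t offset, uint32_t rs1p)
{
	/* 9-bit signed byte offset, bit 0 implicit and therefore zero */
	assert(offset >= -256 && offset <= 254 && (offset & 1) == 0);
	assert(rs1p < 8);

	uint32_t imm = (uint32_t)offset & 0x1ff; /* 9-bit two's complement */
	uint32_t insn = 0xC001;                  /* funct3 = 110, op = 01 */

	insn |= ((imm >> 8) & 0x1) << 12; /* offset[8]   */
	insn |= ((imm >> 3) & 0x3) << 10; /* offset[4:3] */
	insn |= rs1p << 7;                /* rs1'        */
	insn |= ((imm >> 6) & 0x3) << 5;  /* offset[7:6] */
	insn |= ((imm >> 1) & 0x3) << 3;  /* offset[2:1] */
	insn |= ((imm >> 5) & 0x1) << 2;  /* offset[5]   */
	return (uint16_t)insn;
}

/* Recover the signed byte offset from an encoded CB-type branch. */
int32_t decode_cb_offset(uint16_t insn)
{
	uint32_t imm = 0;

	imm |= ((insn >> 12) & 0x1u) << 8;
	imm |= ((insn >> 10) & 0x3u) << 3;
	imm |= ((insn >> 5) & 0x3u) << 6;
	imm |= ((insn >> 3) & 0x3u) << 1;
	imm |= ((insn >> 2) & 0x1u) << 5;

	/* sign-extend from bit 8 */
	return (imm & 0x100) ? (int32_t)imm - 512 : (int32_t)imm;
}
```

Both extremes survive an encode and decode round trip in this sketch. An offset of +256 has no encoding of its own: truncated to 9 bits it yields the same bit pattern as -256, which is why the encoder rejects it up front.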

pillo79

Love the detail work, thanks for this!

WorldofJARcraft marked this pull request as ready for review December 19, 2024 10:29

zephyrbot added the area: RISCV RISCV Architecture (32-bit & 64-bit) label Dec 19, 2024

zephyrbot requested review from carlocaione, edersondisouza, fkokosinski, katsuster, kgugala, mgielda, npitre, tgorochowik and ycsin December 19, 2024 10:29

zephyrbot assigned fkokosinski, kgugala and tgorochowik Dec 19, 2024

teburd previously approved these changes Dec 20, 2024

WorldofJARcraft dismissed teburd’s stale review via 66d11e6 December 20, 2024 16:26

zephyrbot added the area: llext Linkable Loadable Extensions label Dec 20, 2024

zephyrbot requested review from lyakh and pillo79 December 20, 2024 16:26

teburd previously approved these changes Dec 20, 2024

WorldofJARcraft dismissed teburd’s stale review via 9381ec2 December 21, 2024 06:04

WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from 66d11e6 to 9381ec2 Compare December 21, 2024 06:04

teburd previously approved these changes Dec 21, 2024

pillo79 previously approved these changes Dec 22, 2024

WorldofJARcraft dismissed stale reviews from pillo79 and teburd via e457c2a January 6, 2025 07:30

WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from 9381ec2 to e457c2a Compare January 6, 2025 07:30

lyakh reviewed Jan 6, 2025

WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from e457c2a to c6a42d9 Compare January 6, 2025 09:52

pillo79 previously approved these changes Jan 7, 2025

WorldofJARcraft dismissed pillo79’s stale review via 8ae53ed January 9, 2025 13:56

WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from c6a42d9 to 8ae53ed Compare January 9, 2025 13:56

pillo79 approved these changes Jan 9, 2025

teburd approved these changes Jan 9, 2025

fkokosinski approved these changes Jan 10, 2025

kartben merged commit 4921ce2 into zephyrproject-rtos:main Jan 10, 2025
29 checks passed
