Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llext: Fix off-by-one in RISC-V truncation check #83198

Conversation

WorldofJARcraft
Copy link
Contributor

@WorldofJARcraft WorldofJARcraft commented Dec 19, 2024

The RISC-V architecture-specific relocations need to check whether each required relocation can fit into the modified instruction's immediate. All immediates in RISC-V are encoded as two's complement. The current truncation check has an off-by-one error for checking the maximum negative distance, as two's complement encoding can represent a negative value that is the maximum positive value plus one, causing LLEXT to refuse loading valid code.
This commit adds an additional condition to the check that fixes the aforementioned issue.

For a concrete example of the problem, consider the CB-type instruction: It has an 8-bit immediate with an explicit LSB of 0. The current code accepts relocations for CB-type instruction within a range of [-255,+255] from the operation's location, relying on standard RISC-V alignment requirements to ensure that the LSB is 0.
However, while the maximum positive jump distance of 255 is correct, the maximum negative jump distance for a CB-type instruction is actually -256 bit. Here is an example in a RISC-V assembler for a jump with -256 byte offset: https://luplab.gitlab.io/rvcodecjs/#q=c.beqz+a0,+-256&abi=false&isa=AUTO
Here is an example for trying to jump +256 byte (interpreted as -256 byte jump): https://luplab.gitlab.io/rvcodecjs/#q=c.beqz+a0,+256&abi=false&isa=AUTO
The RISC-V ISA manual states (somewhat misleadingly in my opinion) that the CB-type instruction can perform jumps in a [-256,+256] byte range: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-a8395e7-2024-12-16/riscv-privileged.pdf, page 167; I have raised a separate issue for the ISA manual: riscv/riscv-isa-manual#1781

@WorldofJARcraft WorldofJARcraft marked this pull request as ready for review December 19, 2024 10:29
@zephyrbot zephyrbot added the area: RISCV RISCV Architecture (32-bit & 64-bit) label Dec 19, 2024
teburd
teburd previously approved these changes Dec 20, 2024
Copy link
Collaborator

@teburd teburd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks great, it would be a good value add here to have a test validating this edge that clearly got tripped on

@WorldofJARcraft
Copy link
Contributor Author

I will add a second commit with a test for the error condition, using a cb-type instruction as an example.

@zephyrbot zephyrbot added the area: llext Linkable Loadable Extensions label Dec 20, 2024
@zephyrbot zephyrbot requested review from lyakh and pillo79 December 20, 2024 16:26
@WorldofJARcraft
Copy link
Contributor Author

@teburd a test that uses both the maximum positive and negative values of the immediate has been added in a separate commit.
I have tested manually that RISC-V tests succeed when the first commit in this PR has been applied and fail when it has not been applied. The error message is the one I was expecting: the llext could not be relocated.

@teburd
Copy link
Collaborator

teburd commented Dec 20, 2024

Small formatting issue, but the tests are much appreciated by me

teburd
teburd previously approved these changes Dec 20, 2024
Copy link
Collaborator

@teburd teburd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, though I expect a small force push will cause some churn again. Happy to reapprove again if needed

@WorldofJARcraft WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from 66d11e6 to 9381ec2 Compare December 21, 2024 06:04
teburd
teburd previously approved these changes Dec 21, 2024
pillo79
pillo79 previously approved these changes Dec 22, 2024
The RISC-V architecture-specific relocations need to check whether
each required relocation can fit into the modified instruction's
immediate. All immediates in RISC-V are encoded as two's complement.
The current truncation check has an off-by-one error for checking
the maximum negative distance, as two's complement encoding can
represent a negative value that is the maximum positive value plus
one, causing LLEXT to refuse loading valid code.
This commit adds an additional condition to the check that fixes
the aforementioned issue.

Signed-off-by: Eric Ackermann <[email protected]>
@WorldofJARcraft WorldofJARcraft dismissed stale reviews from pillo79 and teburd via e457c2a January 6, 2025 07:30
@WorldofJARcraft WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from 9381ec2 to e457c2a Compare January 6, 2025 07:30
@WorldofJARcraft
Copy link
Contributor Author

This branch has been rebased to main; I have tested again that undoing the fix in the first commit causes tests to fail in the expected manner.


/*
* 2+4 bytes for _backward_jump_target itself
* need to insert 122 padding bytes
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this correct? The comment says 122 bytes, but the code in line 34 seems to be adding 256 bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, the comment is out-of-date, it should state that 252 bytes of padding are needed. The c.addi and ret instructions are 2 bytes each, and we need to have exactly 256 bytes between _do_jump and _backward_jump_target to trigger the edge case.
I will update this and also change ret to a c.jr compressed jump to make this more obvious.

@WorldofJARcraft WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from e457c2a to c6a42d9 Compare January 6, 2025 09:52
pillo79
pillo79 previously approved these changes Jan 7, 2025
Copy link
Collaborator

@pillo79 pillo79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor clarification - otherwise LGTM!

* we pad with 0xff here, which at the time of writing is an invalid opcode and causes
* an exception should it ever be executed
*/
.fill 126, 2, 0xff
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will actually fill the file with sequences of 0x00 0xff. Is this intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, this is not what I intended.
On second thought, my padding with invalid instructions to detect silent failures with incorrect relocation is not really future-proof - there is no guarantee that the padding will remain an invalid instruction forever.
Thus, I have changed the padding to use 2-byte return instructions instead and added padding for both possible jump directions, clearly and obviously causing a test failure if the relocation is not exactly correct (too short or in the wrong direction).

All immediates in RISC-V are encoded as two's complement. This commit
adds a test for relocating jumps that utilize the full range of the
immediate, in both positive and negative direction.
To this end, the test uses the compressed b-type (CB) instruction to
branch to its maximum negative (-256) and maximum positive (+254)
targets.
In case of test failure, expect relocating the corresponding llext to
fail.

Signed-off-by: Eric Ackermann <[email protected]>
@WorldofJARcraft WorldofJARcraft force-pushed the riscv-relocations-fix-too-small-jump-distance branch from c6a42d9 to 8ae53ed Compare January 9, 2025 13:56
Copy link
Collaborator

@pillo79 pillo79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the detail work, thanks for this!

@kartben kartben merged commit 4921ce2 into zephyrproject-rtos:main Jan 10, 2025
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: llext Linkable Loadable Extensions area: RISCV RISCV Architecture (32-bit & 64-bit)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants