You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While implementing the validation and decoding of ZFS Receive Resume Tokens for a personal project, I noticed that the checksum of the tokens produced by ZFS does not match the one I computed myself using the Fletcher4 checksum algorithm. Therefore I took a deeper look into the ZFS code and noticed something unexpected:
The get_receive_resume_token_impl function uses fletcher_4_native_varsize to compute a checksum over the token's payload. As documented in the commit introducing this function, the fletcher_4_native_varsize will ignore the last bytes of the input data if its size is not a multiple of 4:
Added fletcher_4_native_varsize() special purpose method for use when buffer size
is not known in advance. The method does not enforce 4B alignment on buffer size, and
will ignore last (size % 4) bytes of the data buffer.
Because the receive resume token payloads often have a size that is not a multiple of 4, the last bytes of them will not be covered by the token's checksum and corruptions there may be left unnoticed.
While this is a rather theoretical issue, I still wanted to raise awareness to this and am wondering, if this behavior is intended or if this may have been an accident during one of the last refactorings of the fletcher_4 code?
The text was updated successfully, but these errors were encountered:
Doing some archeology I found in commit fc897b2 such point:
- Added `fletcher_4_native_varsize()` special purpose method for use when buffer size
is not known in advance. The method does not enforce 4B alignment on buffer size, and
will ignore last (size % 4) bytes of the data buffer.
It sounds weird to me, but that seems like what it is.
I guess the only way to fix it without breaking compatibility would be to pad the buffer, but since it is GZIP-compressed the question is whether GZIP allows padding after the compressed zdata to be quietly dropped on decompression.
While implementing the validation and decoding of ZFS Receive Resume Tokens for a personal project, I noticed that the checksum of the tokens produced by ZFS does not match the one I computed myself using the Fletcher4 checksum algorithm. Therefore I took a deeper look into the ZFS code and noticed something unexpected:
The
get_receive_resume_token_impl
function usesfletcher_4_native_varsize
to compute a checksum over the token's payload. As documented in the commit introducing this function, thefletcher_4_native_varsize
will ignore the last bytes of the input data if its size is not a multiple of 4:This can be seen here:
zfs/module/zcommon/zfs_fletcher.c
Line 321 in 89f796d
Because the receive resume token payloads often have a size that is not a multiple of 4, the last bytes of them will not be covered by the token's checksum and corruptions there may be left unnoticed.
While this is a rather theoretical issue, I still wanted to raise awareness to this and am wondering, if this behavior is intended or if this may have been an accident during one of the last refactorings of the fletcher_4 code?
The text was updated successfully, but these errors were encountered: