Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbiased random selection from a set #3

Open
cipriancraciun opened this issue Nov 16, 2024 · 4 comments
Open

Unbiased random selection from a set #3

cipriancraciun opened this issue Nov 16, 2024 · 4 comments

Comments

@cipriancraciun
Copy link

cipriancraciun commented Nov 16, 2024

After making my comment about the Spectre(.app) issue, I've also looked at your algorithm again, and I've seen that it also suffers from the biased random sampling due to the % usage.

For example:

randomIndex = LE128(ciphertext.Slice(start: i * 16, length: 16)) % characterSet.Length
randomNumber = LE128(ciphertext.Slice(start: 0, length: 16)) % characterSet.Length
randomPosition = LE128(ciphertext.Slice(start: 16, length: 16)) % wordCount
randomIndex = LE128(ciphertext.Slice(start: (i + offset) * 16, length: 16)) % wordlist.Length

In all cases, if the *.Length is not prime, (or coprime with 2^128 I believe), then there is a slight bias in the frequencies with which certain items are selected.

For more details, and a potential solution, see for example the following article:
=> https://dotat.at/@/2022-04-20-really-divisionless.html

Alternatively, perhaps even better, use a cryptographic library that provides unbiased random sampling.

Alternatively (2), for example see how Rust implements unbiased sampling and perhaps adapt from there:
=> https://github.com/rust-random/rand/blob/7fe350c9bac6b11a84687a7e8e33e6fd8e0a8c01/src/distr/uniform_int.rs#L211-L258

@cipriancraciun
Copy link
Author

(I've copied your reply from the other issue.)

Yep, I'm avoiding that using the Simple Modular Method.

Perhaps update the README to specify this implementation detail, because I didn't look at the implementation, just the README.

I think it's OK to keep the README simple, so people can follow it, however, someone might use it to implement it's own thing, thus I believe having at least some warnings about the subtleties of the implementation might save one from doing the wrong thing.

@samuel-lucas6
Copy link
Owner

Perhaps update the README to specify this implementation detail, because I didn't look at the implementation, just the README.

I could expand on this in the README, but Site password derivation should be free from modulo bias is one the security goals.

I think it's OK to keep the README simple, so people can follow it, however, someone might use it to implement it's own thing, thus I believe having at least some warnings about the subtleties of the implementation might save one from doing the wrong thing.

The pseudocode uses unsigned 128-bit integers/LE128, so an interoperable implementation shouldn't be able to make this mistake.

@cipriancraciun
Copy link
Author

I could expand on this in the README, but "Site password derivation should be free from modulo bias" is one the security goals.

Fair enough, but I've missed that... (And perhaps many like me would just skim through that section, and jump straight to the pseudo-code.)


The pseudocode uses unsigned 128-bit integers/LE128, so an interoperable implementation shouldn't be able to make this mistake.

LE128(...) % characterSet.Length

Yes, but characterSet.Length is the divisor here, which is the one that introduces the bias, right?

@samuel-lucas6
Copy link
Owner

Yes, but characterSet.Length is the divisor here, which is the one that introduces the bias, right?

The Simple Modular Method should mean that the skew is so small that it doesn't matter. NIST recommends at least 64 extra bits (so 9 bytes on the left of %), which is exceeded by using 128-bit integers (16 bytes).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants