Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New prng #552

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

New prng #552

wants to merge 3 commits into from

Conversation

PiOverFour
Copy link

Checklist

  • The new code additions passed the tests (npm test).
  • The linter ran and found no issues (npm run-script lint).

Description

Hello, while writing a story, I stumbled upon a weird edge case with the PRNG. When evaluating a shuffle with 14 elements (or, I suspect, any multiple of 7, as I confirmed by using 7 and 21 elements), the results are very poor. Here is my test case:

   {~1|2|3|4|5|6|7|8|9|10|11|12|13|14}
<> {~1|2|3|4|5|6|7|8|9|10|11|12|13|14}
<> {~1|2|3|4|5|6|7|8|9|10|11|12|13|14}
<> {~1|2|3|4|5|6|7|8|9|10|11|12|13|14}
<> {~1|2|3|4|5|6|7|8|9|10|11|12|13|14}
   {~Lorem|Ipsum|Dolor|Sit|Amat|Praesent|Fermentum|Tempor|Tellus|Etiam|Vel|Neque|Nec|Dignissim}
<> {~Lorem|Ipsum|Dolor|Sit|Amat|Praesent|Fermentum|Tempor|Tellus|Etiam|Vel|Neque|Nec|Dignissim}
<> {~Lorem|Ipsum|Dolor|Sit|Amat|Praesent|Fermentum|Tempor|Tellus|Etiam|Vel|Neque|Nec|Dignissim}
<> {~Lorem|Ipsum|Dolor|Sit|Amat|Praesent|Fermentum|Tempor|Tellus|Etiam|Vel|Neque|Nec|Dignissim}
<> {~Lorem|Ipsum|Dolor|Sit|Amat|Praesent|Fermentum|Tempor|Tellus|Etiam|Vel|Neque|Nec|Dignissim}

-> shuffle_test

= shuffle_test
+ [number]
  {~one|two|three|four|five|six|seven|eight|nine|ten|eleven|twelve|thirteen|fourteen}
  -> shuffle_test
* exit ->END

And here is the result:

8 8 8 1 1
Lorem Tempor Tempor Lorem Lorem
one
six
two
eleven
twelve
four
seven

As you can see, the shuffle works as expected when running a single shuffle multiple times, but breaks when running new shuffles many times. This can be useful to get simple variation in different story runs, but in this case they will always have the same (n/7) values.

After a bit of investigation, I went to the gist whence the PRNG comes from, and it’s been recently updated to warn against using it*. User bryc has a page recommanding alternatives. Now my math skills are far from sufficient to evaluate the reliability of these options, so I just trust that user’s judgement…

I replaced the PRNG with xoroshiro64**. I don’t know how to test it for speed or probability distribution, but at least it appears to give good results in my test case.

I updated the tests to match the new PRNG, but didn’t understand what I was supposed to do with the javascript tests, which don’t seem to be version-controlled…


The other modification isn’t entirely related, but I think it is an improvement. It simplifies seed initialisation, so that the timestamp doesn’t go through the PRNG and through a modulo(100), which limits the number of possible seeds. Now maybe there is a reason I’m not seeing for this, but as far as I understand it only reduces the number of possible playthroughs…


* I also suspect there is something wrong with the way we reseed the generator at every iteration with

let resultSeed = this.state.storySeed + this.state.previousRandom;
let random = new PRNG(resultSeed);

instead of just keeping its state and using next(), but since that’s the way ink does it, I guess it’s simpler to keep that…

@PiOverFour
Copy link
Author

Aaargh I’m sorry, don’t merge this!

While testing this new PRNG a bit further, I saw other cases where the generation was bad, such as when using 10 elements.

I’m pretty sure it’s related to the reseeding of the generator at every step. Are you opposed to adding a PRNG member to the StoryState class, and just calling its next() method every time we need a random number? I’m pretty sure that would fix it, but it would depart from the C# ink implementation.

@ephread
Copy link
Collaborator

ephread commented Jun 11, 2020

Related to #31.

@y-lohse I confirm there's a severe bias in the PRNG. I checked with 200 throws across a hundred reseeds and the bias is very stable. The PRNG only outputs 1s and 8s.

Thanks catching and fixing @PiOverFour!

After a bit of investigation, I went to the gist whence the PRNG comes from, and it’s been recently updated to warn against using it

A compelling reason to replace it anyway.

I updated the tests to match the new PRNG, but didn’t understand what I was supposed to do with the javascript tests, which don’t seem to be version-controlled…

All good, they are generated from TypeScript. Updating the TypeScript tests is enough (thanks for doing so by the way). The linter is complaining though, but I'm not certain it's related to this PR.

The other modification isn’t entirely related, but I think it is an improvement. It simplifies seed initialisation, so that the timestamp doesn’t go through the PRNG and through a modulo(100), which limits the number of possible seeds. Now maybe there is a reason I’m not seeing for this, but as far as I understand it only reduces the number of possible playthroughs…

I also suspect there is something wrong with the way we reseed the generator at every iteration with instead of just keeping its state and using next(), but since that’s the way ink does it, I guess it’s simpler to keep that…

Let me summon @joethephish, he'll know more about it.

Are you opposed to adding a PRNG member to the StoryState class, and just calling its next() method every time we need a random number? I’m pretty sure that would fix it, but it would depart from the C# ink implementation.

Since the PRNGs are different between inkjs and ink anyway, I think it would be okay.

@y-lohse
Copy link
Owner

y-lohse commented Jun 12, 2020

Yep, good catch!

Since the PRNGs are different between inkjs and ink anyway, I think it would be okay.

The thing is, right now the differences are scope to the PRNG.js file. We can divert from upstream in Story and StoryState, but it seems we are talking about changes that would benefit upstream as well.

@joethephish can you give some context on the % 100 here and the seed changes here?

Ideally this would be the occasion where both implementations converge to the same PRNG and we end up with no differences at all ! But if that doesn't work out, we can definitely consider some divergences here :)

@PiOverFour
Copy link
Author

PiOverFour commented Jun 12, 2020

I pushed a new version to add to the discussion (here is a backup of the previous proposition). It changes the logic for number generation, by storing the PRNG instance in the StoryState. I didn’t realise that yesterday, but this logic is actually unrelated to the shuffles’ error, since their states are determined by their hash and not by the other previous numbers. I still feel it’s a better practice though, but I haven’t found anything wrong with the current implementation, so feel free to disregard that change.

It also fixes a problem I noticed with my previous proposition: xoroshiro64** uses two state variables, which complexifies the StoryState and may make saves from previous versions impossible to load. The new version uses Mulberry32 instead, as it has only one variable of state. I still don’t know how it performs, but it’s also a recommended algorithm, and one which better fits the requirements.

This probably needs further testing for quality probability distributions but I don’t know too well how to…

@PiOverFour PiOverFour force-pushed the new_prng branch 2 times, most recently from cdf9f4f to 0559167 Compare June 13, 2020 20:35
@y-lohse
Copy link
Owner

y-lohse commented Mar 29, 2021

It looks like I forgot to follow up on this, apologies 😬

I'm all for replacing the faulty PRNG, but there are some constraints:

  • The API of the generator should stay the same. We try to keep differences with upstream code to a minimum, so I'd rather not introduce changes in StoryState for the PRNG.
  • If we use a dependency for this, it should be lightweight

@PiOverFour
Copy link
Author

Hello @y-lohse, no worry, I’m not using Ink enough these days to be in a hurry :D

  • About the StoryState, I’ve forgotten a bit how this works, sorry. Do you mean we should keep only the previousNumber in the StoryState, instead of the whole PRNG as in my proposition? And then create a new PRNG object each time we need a new random number, the way it currently is? If so, this should be easy enough to revert.
  • I’m not sure which dependency you’re referring to; the new PRNG is the mulberry32 one taken from here. It is self-contained in PRNG.ts, and not that much bigger than the current PRNG. Am I missing something else?

@y-lohse
Copy link
Owner

y-lohse commented Mar 30, 2021

Do you mean we should keep only the previousNumber in the StoryState, instead of the whole PRNG as in my proposition?

Yes! Ideally there should be no change in StoryState :) Cool if that can be achieved easily!

Am I missing something else?

No it's all good, I was mentioning this in case the previous point was too much of an issue and we wanted to introduce a dependency rather than embedding an implementation :)

@PiOverFour
Copy link
Author

Oh! right, thanks.
I won’t get back to this right away, but I’ll keep you posted. I hope it is easy!

@PiOverFour
Copy link
Author

I worked a bit on this and I believe it can be done without changes to the storyState. However, for the PRNG to be used as intended, some potentially breaking changes need to happen.

  • The previousRandom is now only the previous seed from the PRNG, instead of the previous seed + the storySeed. In effect, the storySeed is no longer used at all.
  • the PRNG now implements two methods: public next() andpublic nextSeed(). The former is the number actually used in the game, and the latter is the internal state value of the PRNG, which is stored in previousRandom.

This is completely untested, but I wanted your opinion on the direction it’s taking before proceeding. This is also why I haven’t updated the tests.

@PiOverFour PiOverFour force-pushed the new_prng branch 2 times, most recently from 35f5b34 to 08bd60e Compare March 31, 2021 20:40
@PiOverFour
Copy link
Author

I tested the code on the test case from the original post and it works fine. I just rewrote the commit messages to better describe what is happening.

The only reservation I have is that previousRandom is now a bit of a misnomer, since it doesn’t actually store the previous generated number, but an intermediate one used only by the PRNG as its state. It seems to me that this isn’t a problem though, since the value is never directly accessed by the user.

Also, I’m not sure this qualifies as a change of API, but it still introduces significant changes from upstream… I couldn’t find a way to avoid this while using the PRNG properly.

@y-lohse
Copy link
Owner

y-lohse commented Apr 11, 2021

Thank you for the update! I'm afraid I'm going to be difficult again 😭

the storySeed is no longer used at all

I think this is a problem 😬 Two playthroughs with the same seed should result in the same story. If the story seed is ignored I think that's not the case anymore?

If so. maybe we can tweak things a little? I think the other changes are fine (the nextSeed function and the storing of a seed i previousRandom rather than a result value.) What happens if we change let resultSeed = this.state.previousRandom; to let resultSeed = this.state.storySeed + this.state.previousRandom; and initialize previousRandom to 0? storySeed would still by initialized to a random value (current time), but if the PRNG is seeded the results would be consistent?

@PiOverFour
Copy link
Author

PiOverFour commented Apr 11, 2021

Oh no, actually the date is used to seed the generator, same as before. The storySeed (which was and is computed from the initial date) is still used in seeding the PRNG, but then only the previous state is needed.
From Wikipedia:

A pseudorandom number generator's number sequence is completely determined by the seed: thus, if a pseudorandom number generator is reinitialized with the same seed, it will produce the same sequence of numbers.

So two playthroughs will be different if they are not seeded with the same number, but it was already the case. It’s just that now, the initial seed is not kept: the state of the generator depends only on the previous state, as most PRNGs do.

Actually, I think it’s easier to get a predictable run now than before, because in order to replay from a branch, you only need to know the previous state at this particular branch, instead of the previous state plus the initial state (storySeed).

EDIT I realise that this may sound condescending. It is not my intention, it's just that as far as I understand, modifying the state at each step by adding a constant number to it — the StorySeed — gives no benefit, and may even result in a poorer number distribution.

@y-lohse
Copy link
Owner

y-lohse commented Apr 13, 2021

I realise that this may sound condescending. It is not my intention

No problem 😁

I think you might have missed this bit: https://github.com/y-lohse/inkjs/blob/master/src/engine/Story.ts#L1215 which is explained here https://github.com/inkle/ink/blob/master/Documentation/WritingWithInk.md#seed_random .

Basically, yes the storySeed is still initialized using a date, but it can be overriden from within the ink script itself. And unless I've missed something, doing that would have no effect at all with the current PR. Unless I'm the one missing something?

@PiOverFour
Copy link
Author

Damn, you’re right, I missed this bit! Makes sense.

But this:

          this.state.storySeed = seed.value;
          this.state.previousRandom = 0;

can be easily replaced by:

          this.state.previousRandom = seed.value;

And the seed becomes the new initial state for the generator. Thus, the state can even be restored from a saved seed/state.

I’ll update the PR as soon as possible.

The new algorithm, Mulberry32, doesn't return its internal state as
the generated random number. This means that previousRandom, the value
we keep between generations, should not be the actual previous random
number generated, but the previous seed normally kept inside the
generator. To do that, a new nextSeed() method is added.

Also, the storySeed is no longer fed into the PRNG, but only used on
first initialization.
The storySeed now takes the timestamp directly and the initial seed is
stored as the storySeed.

It also no longer uses modulo 100, which should give a lot more
variety.
@PiOverFour
Copy link
Author

I implemented that and upon testing and re-reading the code, noticed I'd forgotten to modify the shuffle altogether. Now I understand better why the storySeed is used at each generation, and in the case of the shuffle, I don't see a way to change it without getting too far from upstream.

Anyway, I tested again on various small tests, and it seems to behave well.

@smwhr smwhr mentioned this pull request Apr 12, 2022
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants