
Guidance for adding the 1PL-IRT model #147

Open
giacomoran opened this issue Jan 2, 2025 · 8 comments
@giacomoran
Hi,

I want to start contributing to this repo by adding the 1PL-IRT model (also known as Rasch model), which I think is a great baseline for SRS.

The 1PL-IRT model estimates:

  • Person ability parameters $\theta$ (one parameter per user)
  • Item difficulty parameters $\beta$ (one parameter per card)

The model assumes the probability of a correct response is:
$$P(\text{correct}) = \sigma(\theta - \beta)$$
where $\sigma$ is the logistic function.
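For concreteness, here is a minimal PyTorch sketch of the model I have in mind (names and shapes are placeholders, not benchmark code):

```python
import torch
import torch.nn as nn

class OnePLIRT(nn.Module):
    """Minimal 1PL-IRT (Rasch) sketch; num_users/num_items are placeholders."""

    def __init__(self, num_users: int, num_items: int):
        super().__init__()
        self.theta = nn.Parameter(torch.zeros(num_users))  # person abilities
        self.beta = nn.Parameter(torch.zeros(num_items))   # item difficulties

    def forward(self, user_ids: torch.Tensor, item_ids: torch.Tensor) -> torch.Tensor:
        # P(correct) = sigmoid(theta - beta)
        return torch.sigmoid(self.theta[user_ids] - self.beta[item_ids])

# Training would minimize binary cross-entropy against observed recall, e.g.:
# loss = nn.functional.binary_cross_entropy(model(user_ids, item_ids), labels)
```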


I have a few questions:

  • 1PL-IRT was originally developed outside the context of SRS, where items (aka cards) are typically shared between users. In the 10k Anki collection dataset, given two cards belonging to different users but with the same ID, should they be considered the same card (for example, because they come from the same shared deck)? If not, the model simplifies and can be trained independently for each user.
  • I've been looking at other.py, in particular at the DASH family of models, which essentially extend 1PL-IRT with review history data. The per-user and per-card parameters seem to be missing; was that intentional? Why?
@Expertium
Contributor

Expertium commented Jan 2, 2025

> Given two cards belonging to different users but with the same ID, should they be considered the same card?

That is very unlikely to occur. IDs are generated from UNIX timestamps with millisecond resolution, so the only way for two cards to have the same ID is for them to be created at exactly the same time, down to 1/1000 of a second. That being said, if they are from the same shared deck, yes, it's possible. But I also want to confirm this with @L-M-Sherlock; I'm not 100% sure.
But anyway, all algorithms in our benchmark assume independence of cards. We have tested using information from sibling cards in FSRS, but the results were not promising.
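As a toy illustration (the ID below is made up), a card ID decodes straight to its creation time:

```python
from datetime import datetime, timezone

card_id = 1704153600000  # hypothetical card ID: milliseconds since the UNIX epoch
created = datetime.fromtimestamp(card_id / 1000, tz=timezone.utc)
print(created)  # 2024-01-02 00:00:00+00:00
```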

> I've been looking at other.py, in particular at the DASH family of models, which essentially extend 1PL-IRT with review history data. The per-user and per-card parameters seem to be missing; was that intentional? Why?

Idk about that, you'd have to wait for LMSherlock to respond. Those models are the ones I am least familiar with.
On second thought, I'm not sure what you mean. Every parameter is per-user (well, per-collection, technically), they are optimized for each collection independently.
Also, weren't you helping LMSherlock with implementing DASH? #51

@giacomoran
Author

> if they are from the same shared deck, yes, it's possible

> Every parameter is per-user (well, per-collection, technically), they are optimized for each collection independently

I see. Given these two points, I think the benchmark deviates from the literature when it introduces models like 1PL-IRT and the DASH family: those models are usually trained across user collections. But I see the point of further optimizing for each individual user.

The benchmark includes "FSRS-5 default param.", which is trained on all 10k collections. Would it be possible to do the same for other models?

@Expertium
Contributor

It's not exactly "trained on all 10k collections": we don't combine them into one giant collection. Instead, we optimize FSRS on every collection individually, and then take the median of each parameter, and then use those median parameters.
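Roughly like this (toy numbers, not real FSRS parameters):

```python
import numpy as np

# One optimized parameter vector per collection (values are made up).
per_collection_params = np.array([
    [0.40, 1.20, 3.10],
    [0.35, 1.50, 2.90],
    [0.50, 1.10, 3.30],
])

# The "default" parameters are the element-wise median across collections,
# not the result of fitting one pooled dataset.
default_params = np.median(per_collection_params, axis=0)
print(default_params)  # [0.4 1.2 3.1]
```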

@giacomoran
Author

I'd be curious to see the benchmark results of models trained on the "one giant collection".

Is there anything in the training/testing setup blocking this?

@Expertium
Contributor

@L-M-Sherlock there are some questions here that you should be able to answer better than me

@L-M-Sherlock
Member

> Is there anything in the training/testing setup blocking this?

My device's RAM will cry.

@L-M-Sherlock
Member

L-M-Sherlock commented Jan 5, 2025

Do you have any other suggestions or further questions?

@giacomoran
Author

Not right now
