Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed models / data files for cmdstan #171

Merged
merged 25 commits into from
Jun 10, 2020
Merged

Conversation

PhilClemson
Copy link
Contributor

This basically fixes the majority of models that were either broken or had missing / incorrect syntax for the R dump data files used by cmdstan. They can be checked using the script found in the performance tests repository (https://github.com/stan-dev/performance-tests-cmdstan), e.g.:

python ./performance-tests-cmdstan/runPerformanceTests.py directories "performance-tests-cmdstan/example-models" --overwrite-gold --method=sample --num-samples=1000

Here is a list of the updated models:

misc/ecology/occupancy/occupancy
misc/ecology/mark-recapture/mark-recapture
misc/ecology/mark-recapture/cjs-K
misc/ecology/mark-recapture/mark-recapture-2
misc/ecology/mark-recapture/mark-recapture-3
misc/ecology/mark-recapture/cjs
misc/linear-regression_regression
misc/linear-regression_regression_std
misc/garch_arch1
misc/garch_koyck
misc/garch_garch1_1
misc/hier_multivariate_hier_multivariate
misc/moving-avg/arma11
misc/moving-avg/arma11_alt
misc/moving-avg/maQ
misc/moving-avg/stochastic-volatility
misc/moving-avg/ma2
misc/moving-avg/stochastic-volatility-optimized
misc/gaussian-process/gp-predict-analytic
misc/gaussian-process/gp-sim-multi
misc/gaussian-process/gp-fit-ARD
misc/gaussian-process/gp-fit-latent
misc/dlm/fx/equicorr
misc/dlm/fx/factor
misc/sur/sur/imrpoper
misc/sur/sur
misc/hmm/hmm
misc/hmm/hmm-analytic
misc/hmm/hmm-semisup
misc/hmm/hmm-sufficient
misc/nnmf/nnmf
misc/nnmf/nnmf/vec
knitr/simplest-regression/fake-data
knitr/pool-binary-trials/pool
knitr/pool-binary-trials/no-pool
knitr/pool-binary-trials/hier
knitr/pool-binary-trials/hier-logit-centered
knitr/pool-binary-trials/hier-logit
knitr/chapter2/worldcup/fixed
knitr/chapter2/worldcup/first/try
knitr/chapter2/worldcup/first/try/noprior
knitr/chapter2/worldcup/with/replication
knitr/chapter2/golf1
knitr/chapter2/worldcup/no/sqrt
knitr/car-iar-poisson/pois/re
knitr/car-iar-poisson/pois
knitr/car-iar-poisson/bym2/islands
knitr/car-iar-poisson/bym2/offset/only
knitr/car-iar-poisson/pois/icar
knitr/car-iar-poisson/bym2
knitr/car-iar-poisson/simple/iar
knitr/chapter1/fake-data
knitr/golf/golf1
knitr/world-cup/worldcup/fixed
knitr/world-cup/worldcup/first/try
knitr/world-cup/worldcup/first/try/noprior
knitr/world-cup/worldcup/with/replication
knitr/world-cup/worldcup/no/sqrt
ARM/Ch.25/earnings2
ARM/Ch.25/earnings
ARM/Ch.25/earnings/pt2
ARM/Ch.25/earnings/pt1
education/tutorial/twopl/reg/dif
education/tutorial/twopl/reg/noncentered
education/tutorial/twopl/reg/centered
education/tutorial/twopl/twopl
education/hierarchical/2pl/hierarchical/2pl
education/dina/independent/dina/independent
education/dina/independent/dina/nostructure
BPA/Ch.04/GLMM1
BPA/Ch.04/GLMM2
BPA/Ch.13/bluebug
BPA/Ch.13/owls/ms1
BPA/Ch.13/owls/ms2
BPA/Ch.13/Dynocc2
BPA/Ch.10/js/ms
BPA/Ch.12/binmix/cov
BPA/Ch.12/binmix
BPA/Ch.12/Nmix0
BPA/Ch.12/Nmix2
BPA/Ch.12/Nmix1
BPA/Ch.07/cjs/trap
BPA/Ch.07/cjs/temp/corr

alecksphillips and others added 20 commits February 25, 2020 16:03
…tting 'Initialization between (-2, 2) failed after 100 attempts' errors.
Added data files for pool-binary-trials
Added missing data for knitr/car-iar-poisson models
Added missing data for golf1 models
@mitzimorris
Copy link
Member

mitzimorris commented Apr 30, 2020

where did you get the data from?
I realize that in the past there have been attempts to add data - cf #151 - these were done in a brute-force way. After the PR associated with that issue went in, I cleaned up part of it by removing generated data from case studies where there wasn't any data.

it's important that the data used to test the model should be generated by the model or the process which the model is trying to approximate. using just the specified constraints on data and parameters to generate test is not enough - there is often domain knowledge which is not specified, e.g., for the icar models, the input data describes an areal map, and there are constraints on the relations between the map edges which aren't directly expressed in the data declarations.

for testing purposes, I would imagine that we're interested in testing well-specified models which are fitting data generated from the model and/or which is somehow pathological. this requires that all test data be annotated explaining what kind of data it is and what kind of behavoir we expect when fitting the model to the data.

@mitzimorris
Copy link
Member

hi @PhilClemson - please allow me to clarify:

I think it would be better to curate the models used for testing, not make sure that we can use every single one of them. so I don't think this PR is a good idea.

@bob-carpenter
Copy link
Contributor

I also think a curated set of test models makes much more sense for testing than all the stuff that's accumulated in our example-models repo.

Having said that, I would like to see something done to clean up example-models, which means:

  • fixing models that we want to keep
  • removing junk

In the first category are models there for pedagogical purposes, not for testing purposes, like the ones in directory BPA (from the Bayesian Population Analysis book). It'd be nice to at least clean those up. Pretty much everything in misc is just junk I threw up over the years for the manual or just as an example for someone. That can all be eliminated.

How did you generate new data for things like my hierarchical binary pooling example in knitr/pool-binary-trials?

@mitzimorris
Copy link
Member

@PhilClemson - w/r/t to the icar-car-poisson models, I don't think your correction of the stan code is correct. did you test it on an areal map data that contained islands/singletons?

@PhilClemson
Copy link
Contributor Author

where did you get the data from?
I realize that in the past there have been attempts to add data - cf #151 - these were done in a brute-force way. After the PR associated with that issue went in, I cleaned up part of it by removing generated data from case studies where there wasn't any data.

it's important that the data used to test the model should be generated by the model or the process which the model is trying to approximate. using just the specified constraints on data and parameters to generate test is not enough - there is often domain knowledge which is not specified, e.g., for the icar models, the input data describes an areal map, and there are constraints on the relations between the map edges which aren't directly expressed in the data declarations.

for testing purposes, I would imagine that we're interested in testing well-specified models which are fitting data generated from the model and/or which is somehow pathological. this requires that all test data be annotated explaining what kind of data it is and what kind of behavoir we expect when fitting the model to the data.

Hi @mitzimorris, thanks for taking a look through this.

Most of the data came from the .R and documentation found in the same folder as the stan files. For the BPA models I downloaded the data from the links provided. My main task was to standardise the models so that they could all be run using cmdstan. Fortunately in most cases it was just a case of generating the data.R file from existing code / data.

There were a few cases where we had to generate "compatible" data but not "curated" data. The knitr models are a good example of this.

hi @PhilClemson - please allow me to clarify:

I think it would be better to curate the models used for testing, not make sure that we can use every single one of them. so I don't think this PR is a good idea.

I understand, we just thought we would share these changes in the hope that they could be applied in general. We've found that having a large selection of models / data is useful, even if just for identifying bugs. It would definitely not be appropriate to judge performance statistics based on individual models, although looking at performance over a large population of models might still be useful.

I also think a curated set of test models makes much more sense for testing than all the stuff that's accumulated in our example-models repo.

Having said that, I would like to see something done to clean up example-models, which means:

* fixing models that we want to keep

* removing junk

In the first category are models there for pedagogical purposes, not for testing purposes, like the ones in directory BPA (from the Bayesian Population Analysis book). It'd be nice to at least clean those up. Pretty much everything in misc is just junk I threw up over the years for the manual or just as an example for someone. That can all be eliminated.

How did you generate new data for things like my hierarchical binary pooling example in knitr/pool-binary-trials?

That sounds like a good idea, although in that case maybe the junk models like those in misc could become a separate repository? I'd have to check with @alecksphillips on those specific examples, but as I've said above I imagine it will be compatible rather than curated data.

@PhilClemson - w/r/t to the icar-car-poisson models, I don't think your correction of the stan code is correct. did you test it on an areal map data that contained islands/singletons?

For these I generated the data from the corresponding .R files. I remember for bym2_islands.stan I was getting errors which seemed to be an indexing issue (which I fixed) but I might be wrong. I believe I used the nycTracts10 data.

@alecksphillips
Copy link
Contributor

That sounds like a good idea, although in that case maybe the junk models like those in misc could become a separate repository? I'd have to check with @alecksphillips on those specific examples, but as I've said above I imagine it will be compatible rather than curated data.

Hi all,
Just to add, the bits of data I contributed I generated using the R markdown workbooks that were in those directories.

@mitzimorris
Copy link
Member

For these I generated the data from the corresponding .R files. I remember for bym2_islands.stan I was getting errors which seemed to be an indexing issue (which I fixed) but I might be wrong. I believe I used the nycTracts10 data.

please revert all changes to the car-icar-poisson models and delete the data files you added. I will try to add appropriate datasets myself.

the bym2_islands.stan is correctly coded - it requires input where the areal map contains at least 2 disconnected subcomponents, whereas the data you're using is an areal map which is fully connected. the bym2_islands.stan model is checked into the case study because it's the next level of in complexity for that model, but I haven't yet extended the case study.

I agree that we should expand the suite of test models, but in the icar model case study, the models that consist of a regression plus just an icar or just a random effects are degenerate versions of the bym2 model when there's no spatial correlation or when there's perfect spatial correlation, so adding these to the test suite isn't going to tell you anything new. those models were included mainly to generate visualizations and stan programming pedegogy. they're just not worth running in a test suite - it's a waste of compute cycles / carbon credits.

@PhilClemson
Copy link
Contributor Author

please revert all changes to the car-icar-poisson models and delete the data files you added. I will try to add appropriate datasets myself.

Thanks for the explanation, that makes a lot of sense and I agree. I've reverted the relevant commits.

@jgabry
Copy link
Member

jgabry commented Jun 9, 2020

Thanks for doing all this work!

@mitzimorris Now that the icar-poisson commits are reverted, is this ready to merge?

@mitzimorris
Copy link
Member

yes!

@jgabry
Copy link
Member

jgabry commented Jun 10, 2020

Ok, great, will merge now. Thanks again @PhilClemson for doing all this!

@jgabry jgabry merged commit 506a9ab into stan-dev:master Jun 10, 2020
@PhilClemson
Copy link
Contributor Author

Thanks both, although I can't take all of the credit as I had help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants