-
-
Notifications
You must be signed in to change notification settings - Fork 478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed models / data files for cmdstan #171
Conversation
…tting 'Initialization between (-2, 2) failed after 100 attempts' errors.
Added data files for pool-binary-trials
Added missing data for knitr/car-iar-poisson models
Added missing data for golf1 models
…t-regression\ and knitr\chapter1"
where did you get the data from? it's important that the data used to test the model should be generated by the model or the process which the model is trying to approximate. using just the specified constraints on data and parameters to generate test is not enough - there is often domain knowledge which is not specified, e.g., for the icar models, the input data describes an areal map, and there are constraints on the relations between the map edges which aren't directly expressed in the data declarations. for testing purposes, I would imagine that we're interested in testing well-specified models which are fitting data generated from the model and/or which is somehow pathological. this requires that all test data be annotated explaining what kind of data it is and what kind of behavoir we expect when fitting the model to the data. |
hi @PhilClemson - please allow me to clarify: I think it would be better to curate the models used for testing, not make sure that we can use every single one of them. so I don't think this PR is a good idea. |
I also think a curated set of test models makes much more sense for testing than all the stuff that's accumulated in our Having said that, I would like to see something done to clean up
In the first category are models there for pedagogical purposes, not for testing purposes, like the ones in directory How did you generate new data for things like my hierarchical binary pooling example in |
@PhilClemson - w/r/t to the icar-car-poisson models, I don't think your correction of the stan code is correct. did you test it on an areal map data that contained islands/singletons? |
Hi @mitzimorris, thanks for taking a look through this. Most of the data came from the .R and documentation found in the same folder as the stan files. For the BPA models I downloaded the data from the links provided. My main task was to standardise the models so that they could all be run using cmdstan. Fortunately in most cases it was just a case of generating the data.R file from existing code / data. There were a few cases where we had to generate "compatible" data but not "curated" data. The knitr models are a good example of this.
I understand, we just thought we would share these changes in the hope that they could be applied in general. We've found that having a large selection of models / data is useful, even if just for identifying bugs. It would definitely not be appropriate to judge performance statistics based on individual models, although looking at performance over a large population of models might still be useful.
That sounds like a good idea, although in that case maybe the junk models like those in misc could become a separate repository? I'd have to check with @alecksphillips on those specific examples, but as I've said above I imagine it will be compatible rather than curated data.
For these I generated the data from the corresponding .R files. I remember for bym2_islands.stan I was getting errors which seemed to be an indexing issue (which I fixed) but I might be wrong. I believe I used the nycTracts10 data. |
Hi all, |
please revert all changes to the car-icar-poisson models and delete the data files you added. I will try to add appropriate datasets myself. the bym2_islands.stan is correctly coded - it requires input where the areal map contains at least 2 disconnected subcomponents, whereas the data you're using is an areal map which is fully connected. the bym2_islands.stan model is checked into the case study because it's the next level of in complexity for that model, but I haven't yet extended the case study. I agree that we should expand the suite of test models, but in the icar model case study, the models that consist of a regression plus just an icar or just a random effects are degenerate versions of the bym2 model when there's no spatial correlation or when there's perfect spatial correlation, so adding these to the test suite isn't going to tell you anything new. those models were included mainly to generate visualizations and stan programming pedegogy. they're just not worth running in a test suite - it's a waste of compute cycles / carbon credits. |
Thanks for the explanation, that makes a lot of sense and I agree. I've reverted the relevant commits. |
Thanks for doing all this work! @mitzimorris Now that the icar-poisson commits are reverted, is this ready to merge? |
yes! |
Ok, great, will merge now. Thanks again @PhilClemson for doing all this! |
Thanks both, although I can't take all of the credit as I had help! |
This basically fixes the majority of models that were either broken or had missing / incorrect syntax for the R dump data files used by cmdstan. They can be checked using the script found in the performance tests repository (https://github.com/stan-dev/performance-tests-cmdstan), e.g.:
python ./performance-tests-cmdstan/runPerformanceTests.py directories "performance-tests-cmdstan/example-models" --overwrite-gold --method=sample --num-samples=1000
Here is a list of the updated models:
misc/ecology/occupancy/occupancy
misc/ecology/mark-recapture/mark-recapture
misc/ecology/mark-recapture/cjs-K
misc/ecology/mark-recapture/mark-recapture-2
misc/ecology/mark-recapture/mark-recapture-3
misc/ecology/mark-recapture/cjs
misc/linear-regression_regression
misc/linear-regression_regression_std
misc/garch_arch1
misc/garch_koyck
misc/garch_garch1_1
misc/hier_multivariate_hier_multivariate
misc/moving-avg/arma11
misc/moving-avg/arma11_alt
misc/moving-avg/maQ
misc/moving-avg/stochastic-volatility
misc/moving-avg/ma2
misc/moving-avg/stochastic-volatility-optimized
misc/gaussian-process/gp-predict-analytic
misc/gaussian-process/gp-sim-multi
misc/gaussian-process/gp-fit-ARD
misc/gaussian-process/gp-fit-latent
misc/dlm/fx/equicorr
misc/dlm/fx/factor
misc/sur/sur/imrpoper
misc/sur/sur
misc/hmm/hmm
misc/hmm/hmm-analytic
misc/hmm/hmm-semisup
misc/hmm/hmm-sufficient
misc/nnmf/nnmf
misc/nnmf/nnmf/vec
knitr/simplest-regression/fake-data
knitr/pool-binary-trials/pool
knitr/pool-binary-trials/no-pool
knitr/pool-binary-trials/hier
knitr/pool-binary-trials/hier-logit-centered
knitr/pool-binary-trials/hier-logit
knitr/chapter2/worldcup/fixed
knitr/chapter2/worldcup/first/try
knitr/chapter2/worldcup/first/try/noprior
knitr/chapter2/worldcup/with/replication
knitr/chapter2/golf1
knitr/chapter2/worldcup/no/sqrt
knitr/car-iar-poisson/pois/re
knitr/car-iar-poisson/pois
knitr/car-iar-poisson/bym2/islands
knitr/car-iar-poisson/bym2/offset/only
knitr/car-iar-poisson/pois/icar
knitr/car-iar-poisson/bym2
knitr/car-iar-poisson/simple/iar
knitr/chapter1/fake-data
knitr/golf/golf1
knitr/world-cup/worldcup/fixed
knitr/world-cup/worldcup/first/try
knitr/world-cup/worldcup/first/try/noprior
knitr/world-cup/worldcup/with/replication
knitr/world-cup/worldcup/no/sqrt
ARM/Ch.25/earnings2
ARM/Ch.25/earnings
ARM/Ch.25/earnings/pt2
ARM/Ch.25/earnings/pt1
education/tutorial/twopl/reg/dif
education/tutorial/twopl/reg/noncentered
education/tutorial/twopl/reg/centered
education/tutorial/twopl/twopl
education/hierarchical/2pl/hierarchical/2pl
education/dina/independent/dina/independent
education/dina/independent/dina/nostructure
BPA/Ch.04/GLMM1
BPA/Ch.04/GLMM2
BPA/Ch.13/bluebug
BPA/Ch.13/owls/ms1
BPA/Ch.13/owls/ms2
BPA/Ch.13/Dynocc2
BPA/Ch.10/js/ms
BPA/Ch.12/binmix/cov
BPA/Ch.12/binmix
BPA/Ch.12/Nmix0
BPA/Ch.12/Nmix2
BPA/Ch.12/Nmix1
BPA/Ch.07/cjs/trap
BPA/Ch.07/cjs/temp/corr