My participation in the IEEE-CIS Fraud Detection competition on Kaggle: https://www.kaggle.com/c/ieee-fraud-detection

The notebooks are numbered from 0 to 6, following my progression through the competition:
- nb0 splits the data into train/valid/test sets
- nb1 is a quick and dirty first modeling approach, to create a baseline
- nb2 explores a validation scheme specific to the data we have in this competition
- nb3 uses this scheme to perform feature selection on the hundreds of features we have
- nb4 investigates the variables our feature importance highlighted, to get a sense of why these would matter
- nb5 tries to create additional features
- nb6 performs K-fold ensembling of LightGBM models
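
For readers unfamiliar with the last step, here is a minimal sketch of what K-fold ensembling means. It is not the code from nb6. It uses only the standard library, and `fit_fn`, `fit_mean_model`, the contiguous unshuffled folds, and the toy data are all assumptions made for illustration. In the notebook, the model trained on each fold would be a LightGBM model rather than the placeholder used here.

```python
from statistics import mean


def kfold_indices(n_rows, n_splits):
    """Split row indices 0..n_rows-1 into n_splits contiguous folds."""
    base, extra = divmod(n_rows, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def kfold_ensemble(X, y, X_test, fit_fn, n_splits=5):
    """Train one model per fold.

    Returns (out-of-fold predictions, test predictions averaged over folds).
    fit_fn(X_train, y_train) is a hypothetical hook that must return a
    function mapping one row to one prediction.
    """
    oof = [None] * len(X)
    test_preds = []
    for valid_idx in kfold_indices(len(X), n_splits):
        held_out = set(valid_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = fit_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        # Each row is scored only by the model that did not see it.
        for i in valid_idx:
            oof[i] = predict(X[i])
        test_preds.append([predict(row) for row in X_test])
    # Average the per-fold predictions for each test row.
    ensemble = [mean(column) for column in zip(*test_preds)]
    return oof, ensemble


def fit_mean_model(X_train, y_train):
    """Placeholder model (assumption): always predicts the training fraud rate."""
    rate = mean(y_train)
    return lambda row: rate


# Toy data, made up for this example only.
X = [[i] for i in range(10)]
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
X_test = [[10], [11]]

oof, ensemble = kfold_ensemble(X, y, X_test, fit_mean_model, n_splits=5)
print(oof)
print(ensemble)
```

The out-of-fold predictions give a validation score over the whole training set, while the averaged test predictions are what would go into a submission.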