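Level Up in Competitions: 21 Must-Know Papers!
How can you get more out of competition practice? By reading the papers, of course. This article collects the papers behind the libraries and models most commonly used in competitions, covering both tree models and deep learning models.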
Gradient Boosting
J. Friedman, Greedy Function Approximation: A Gradient Boosting Machine, The Annals of Statistics, Vol. 29, No. 5, 2001.
J. Friedman, Stochastic Gradient Boosting, 1999.
T. Hastie, R. Tibshirani and J. Friedman. The Elements of Statistical Learning, Ed. 2, Springer, 2009.
Random Forests
L. Breiman, Random Forests, Machine Learning, 45(1), 5-32, 2001.
P. Geurts, D. Ernst, and L. Wehenkel, Extremely randomized trees, Machine Learning, 63(1), 3-42, 2006.
Regularized Greedy Forest
Rie Johnson and Tong Zhang. Learning Nonlinear Functions Using Regularized Greedy Forest. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(5):942-954, May 2014.
XGBoost
Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016
LightGBM
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. "LightGBM: A Highly Efficient Gradient Boosting Decision Tree". Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 3149-3157.
Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tie-Yan Liu. "A Communication-Efficient Parallel Algorithm for Decision Tree". Advances in Neural Information Processing Systems 29 (NIPS 2016), pp. 1279-1287.
Huan Zhang, Si Si and Cho-Jui Hsieh. "GPU Acceleration for Large-scale Tree Boosting". SysML Conference, 2018.
CatBoost
Anna Veronika Dorogush, Andrey Gulin, Gleb Gusev, Nikita Kazeev, Liudmila Ostroumova Prokhorenkova, Aleksandr Vorobev "Fighting biases with dynamic boosting". arXiv:1706.09516, 2017.
Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin "CatBoost: gradient boosting with categorical features support". Workshop on ML Systems at NIPS 2017.
Deep Forest
Zhou, Z. H., & Feng, J. (2017). Deep forest. arXiv preprint arXiv:1702.08835.
TabNet
Sercan O. Arik and Tomas Pfister. TabNet: Attentive Interpretable Tabular Learning. arXiv:1908.07442, 2019.
Transformer
Vaswani A, Shazeer N, Parmar N, et al. Attention Is All You Need. arXiv, 2017.
BERT
Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2018.
Prophet
Sean J. Taylor, Benjamin Letham (2018) Forecasting at scale. The American Statistician 72(1):37-45 (https://peerj.com/preprints/3190.pdf).
FTRL
McMahan, H. Brendan, et al. "Ad click prediction: a view from the trenches." Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2013.
Factorization Machines
Rendle, Steffen. Factorization machines. 2010 IEEE International Conference on Data Mining. IEEE, 2010.
FFM
Juan, Yuchin, et al. Field-aware factorization machines for CTR prediction. Proceedings of the 10th ACM conference on recommender systems. 2016.
DeepFM
Guo, Huifeng, et al. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. IJCAI. 2017.