Boosting
Several common boosting methods
- Boosting
A single model is often not stable enough, or its results are not good enough, so we ensemble models, i.e., use multiple models together for prediction. Ensemble methods fall into two families: bagging and boosting. The following introduces boosting.
You can view boosting as a linear combination of many models:
$F_m(X) = a_0 f_0(X) + a_1 f_1(X) + \dots + a_m f_m(X)$
It is a stage-wise optimization algorithm:
Learn $F_0$ first, then $F_1$, $F_2$, …
Each iteration emphasizes the errors of the previous one, so that
$L(F_m(X), Y) < L(F_{m-1}(X), Y)$, where $L$ is the loss function.
AdaBoost
Emphasizes errors by changing the distribution of samples: weights are reassigned according to each sample's error, and samples with larger errors receive larger weights in the next round.
Gradient Boosting
Emphasizes errors by changing the training target: the new label is the residual of the previous round's prediction (a sketch contrasting the two follows below).
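As a quick illustration of the two flavors, here is a minimal sketch using scikit-learn's off-the-shelf implementations (the library choice and all hyperparameter values are assumptions for illustration, not from these notes):

```python
# Minimal sketch of both boosting flavors via scikit-learn (assumed library;
# the notes above don't prescribe one). Hyperparameters are illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost: emphasizes errors by re-weighting samples between rounds.
ada = AdaBoostRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Gradient boosting: emphasizes errors by fitting each new tree
# to the residuals of the current ensemble's predictions.
gbr = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

print("AdaBoost R^2:", ada.score(X_test, y_test))
print("GBDT     R^2:", gbr.score(X_test, y_test))
```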
- Gradient Boosting
Basic Function:
$F(X) = \sum_{m=0}^{M} f_m(X)$
where $f_m(X)$ is the base learner; GBDT uses decision trees as the base learners.
How to learn?
·Greedy way:
·$F_m(X) = F_{m-1}(X) + f_m(X)$
·Choose $f_m$ such that $L(y, F_{m-1}(X) + f_m(X)) < L(y, F_{m-1}(X))$
·Gradient descent
·Compute the negative gradient first:
·$\hat{y}_i = -\frac{\partial\, l(F_{m-1}(x_i),\, y_i)}{\partial F_{m-1}(x_i)}$
·Learn $f_m(X)$ to fit $\hat{y}$ using the L2 loss:
·$f_m(X) = \arg\min_{f(X)} \sum_{i=1}^{n} \left( f(x_i) - \hat{y}_i \right)^2$
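The recipe above can be written out directly. Below is a minimal from-scratch sketch assuming the L2 loss, under which the negative gradient is simply the residual $y - F_{m-1}(x)$; the shrinkage rate `lr` and tree depth are illustrative choices not specified in the notes:

```python
# From-scratch sketch of the stage-wise recipe above, assuming L2 loss
# (so the negative gradient reduces to the residual y - F_{m-1}(x)).
# lr (shrinkage) and max_depth are illustrative, not from the notes.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbdt(X, y, n_rounds=100, lr=0.1, max_depth=3):
    """Stage-wise fitting: each round fits a tree to the negative gradient."""
    f0 = float(np.mean(y))                   # F_0: constant minimizing L2 loss
    F = np.full(len(y), f0)
    trees = []
    for _ in range(n_rounds):
        residual = y - F                     # negative gradient of the L2 loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        F = F + lr * tree.predict(X)         # F_m = F_{m-1} + lr * f_m
        trees.append(tree)
    return f0, trees

def predict_gbdt(f0, trees, X, lr=0.1):
    """Sum the constant model and all shrunken tree predictions."""
    F = np.full(X.shape[0], f0)
    for tree in trees:
        F = F + lr * tree.predict(X)
    return F
```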
- GBDT
GBDT = Gradient Boosting + Decision Tree
Supported Tasks: Regression, Classification, Ranking
- LightGBM
LightGBM is a decision-tree-based machine learning model open-sourced by Microsoft in 2017. It is a gradient boosting framework designed to be distributed and efficient, with the following advantages (a minimal usage sketch follows the list):
~Fast training speed and high efficiency
~Lower memory usage
~Better accuracy
~Parallel learning supported
~Capable of handling large-scale data
~Direct support for categorical features
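A minimal usage sketch with LightGBM's Python scikit-learn-style API, showing the direct categorical-feature support noted above (the dataset, hyperparameters, and the explicit `categorical_feature` argument are illustrative assumptions; LightGBM can also infer categorical columns from the pandas `category` dtype):

```python
# Minimal LightGBM sketch (assumed Python sklearn-style API; values are
# illustrative). Demonstrates direct categorical-feature handling:
# no one-hot encoding needed.
import lightgbm as lgb
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "num_feat": rng.normal(size=1000),
    "cat_feat": pd.Categorical(rng.choice(["a", "b", "c"], size=1000)),
})
y = (df["num_feat"] > 0).astype(int)  # toy binary target

model = lgb.LGBMClassifier(n_estimators=100, learning_rate=0.1)
# Pass the categorical column by name; LightGBM splits on it directly.
model.fit(df, y, categorical_feature=["cat_feat"])
print(model.predict(df.head()))
```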
Related reading
"Bagging和Boosting 概念及区别" (Bagging and Boosting: concepts and differences): both bagging and boosting combine existing classification or regression algorithms in a certain way to form a more powerful classifier.