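The so-called regularization effect is this:
mathematically, the solution takes on certain properties of the added penalty term.
In plain words, what exactly is regularization?
It makes the solution found by the Lagrange multiplier method for extrema, which we learned as undergraduates, have certain characteristics.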
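L1: (the exponent term of a Laplace distribution)
The result is relatively sparse (weights are either close to 0 or large).
The benefit is faster feature learning, because many W are pushed to 0 or very near it,
but the regularization effect may not be very pronounced.
L2: (the exponent term of a Gaussian distribution)
L2 shrinks W for unimportant features, but does not make it 0.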
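The link between the two penalties and the two distributions can be made explicit with a short MAP (maximum a posteriori) sketch. The symbols are assumptions introduced here, not taken from the original: D is the training data, each weight w_i has an independent zero-mean prior, b is the Laplace scale, and \sigma^2 is the Gaussian variance.

```latex
\hat{w}_{\mathrm{MAP}} = \arg\min_{w} \bigl[ -\log p(D \mid w) - \log p(w) \bigr]

% Laplace prior -> L1 penalty with \lambda = 1/b
p(w_i) = \frac{1}{2b}\, e^{-|w_i|/b}
\;\Rightarrow\;
-\log p(w) = \frac{1}{b} \sum_i |w_i| + \text{const}

% Gaussian prior -> L2 penalty with \lambda = 1/(2\sigma^2)
p(w_i) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-w_i^2/(2\sigma^2)}
\;\Rightarrow\;
-\log p(w) = \frac{1}{2\sigma^2} \sum_i w_i^2 + \text{const}
```

Taking the negative log of the prior leaves exactly the exponent term, which is why L1 and L2 are described as the exponents of the Laplace and Gaussian distributions.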
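How should we choose between L1 and L2?
Generally, pick the regularization term according to the prior distribution you assume (the Gaussian and Laplace distributions look broadly similar, although the Laplace has a sharper peak at 0 and heavier tails).
Google puts it this way: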
L1 regularization can’t help with multicollinearity.
L2 regularization can’t help with feature selection.
In plain words:
when you want to extract rules, prefer L1;
when you want the features to be combined linearly, prefer L2.
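A toy illustration of the multicollinearity remark, under assumptions made up for this sketch: two perfectly identical features, a tiny dataset, and a hypothetical penalty strength `lam`. It compares how each penalty scores different ways of splitting the same total weight across the two copies.

```python
# Two perfectly collinear features: x2 is an exact copy of x1 (toy assumption).
x = [1.0, 2.0, 3.0]
y = [2.0 * v for v in x]   # the target depends only on the shared signal
lam = 1.0                  # hypothetical regularization strength


def squared_error(w1, w2):
    """Sum of squared residuals for y ~ w1*x1 + w2*x2 with x1 == x2."""
    return sum((yi - (w1 * xi + w2 * xi)) ** 2 for xi, yi in zip(x, y))


def l1_objective(w1, w2):
    """Squared error plus an L1 penalty."""
    return squared_error(w1, w2) + lam * (abs(w1) + abs(w2))


def l2_objective(w1, w2):
    """Squared error plus an L2 penalty."""
    return squared_error(w1, w2) + lam * (w1 ** 2 + w2 ** 2)


# Three ways of splitting the same total weight 1.5 across the two copies.
splits = [(1.5, 0.0), (0.0, 1.5), (0.75, 0.75)]
l1_values = [l1_objective(w1, w2) for w1, w2 in splits]
l2_values = [l2_objective(w1, w2) for w1, w2 in splits]

print("L1 objective per split:", l1_values)
print("L2 objective per split:", l2_values)
```

In this toy setup the L1 objective is identical for all three splits, so which copy keeps the weight is arbitrary, while the L2 objective is lowest for the even split. That is one way to read "L1 can't help with multicollinearity" and "L2 can't help with feature selection".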
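Why does L1 produce sparsity?
Almost no blog post online explains this clearly.
Remember the Lagrange multipliers from undergraduate days?
Here I use a figure from
https://stats.stackexchange.com/questions/45643/why-l1-norm-for-sparse-models
to illustrate: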
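In that figure, the ellipses are the contours of the original, unregularized loss function,
and the green region is the constraint; the solution ends up on the boundary of the green region.
One point above was not stated precisely:
what is actually used here is the "generalized Lagrangian" (which handles inequality constraints),
whereas what we learned as undergraduates is the "narrow" Lagrangian (which handles equality constraints).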
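The constrained view can be written out as follows. The notation is an assumption of this sketch: J(w) is the unregularized loss, t is the size of the green region, and \lambda is the multiplier.

```latex
\min_{w} \; J(w) \quad \text{s.t.} \quad \|w\|_1 \le t

\mathcal{L}(w, \lambda) = J(w) + \lambda \bigl( \|w\|_1 - t \bigr),
\qquad \lambda \ge 0,
\qquad \lambda \bigl( \|w\|_1 - t \bigr) = 0
```

The last condition is complementary slackness: when the unconstrained minimum of J lies outside the region, \lambda > 0 and \|w\|_1 = t, so the solution sits on the boundary. Dropping the constant \lambda t gives the familiar penalized form J(w) + \lambda \|w\|_1. For L2, replace \|w\|_1 \le t with \|w\|_2^2 \le t.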
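So can L2 produce a sparse solution? It can, but the probability is low, because its constraint region is a circle, which has no corners on the axes the way the L1 diamond does.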
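The same contrast shows up in one dimension. A minimal sketch, assuming the loss near its unregularized minimum `a` is the quadratic 0.5*(w - a)**2 (a simplification made here, not the network loss), with a hypothetical strength `lam`. Both regularized minimizers then have closed forms.

```python
def l1_minimizer(a, lam):
    """Minimizer of 0.5*(w - a)**2 + lam*abs(w): soft-thresholding."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # the kink of |w| at 0 holds small values exactly at zero


def l2_minimizer(a, lam):
    """Minimizer of 0.5*(w - a)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return a / (1.0 + lam)


unregularized = [-2.0, -0.3, 0.05, 0.4, 1.5]  # hypothetical unregularized optima
lam = 0.5                                     # hypothetical strength

l1_solution = [l1_minimizer(a, lam) for a in unregularized]
l2_solution = [l2_minimizer(a, lam) for a in unregularized]

print("L1:", l1_solution)
print("L2:", l2_solution)
```

Under these assumptions L1 sends every value with |a| <= lam to exactly 0, while L2 only rescales, so a nonzero input never becomes exactly 0.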
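OK, enough talk. Where is the code?
You can use the third experiment in Chapter 4 of Deep Learning with Python (《Python深度学习》).
The network structure is 10000 x 16 x 16 x 1.
To get results quickly, set epochs=1.
With L1 regularization, the weights come out as follows: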
输出权重 [array([[-4.27828636e-05, -1.00246782e-03, -2.79264990e-04, ..., -3.85033316e-04, -3.40257306e-04, -3.55066732e-08], [ 2.15981118e-02, 3.79165774e-03, -6.72453083e-03, ..., 2.53116563e-02, 4.29332331e-02, -1.85270631e-03], [ 2.85863448e-02, 1.66764148e-02, -6.34254003e-03, ..., 1.40961567e-02, 2.53925007e-02, 1.04373496e-04], ..., [-3.83841514e-04, 5.83783374e-04, -6.23644795e-04, ..., -1.19826291e-04, 1.29003369e-04, -5.33740851e-04], [-6.40181359e-04, 6.27052214e-04, -6.12081552e-04, ..., -9.73617774e-04, -3.70911177e-04, 1.00261578e-03], [-6.49964553e-04, 2.80193461e-04, -4.07341809e-04, ..., 9.82345082e-04, -7.55024375e-04, 7.67573947e-05]], dtype=float32), array([ 0.01352753, 0.00530624, 0.01403685, -0.00621689, 0.00346374, 0.01224227, -0.01186973, 0.00608102, 0.00767745, 0.02525727, 0.00554247, 0.00680919, 0.00823556, 0.02523253, 0.02550968, -0.00733239], dtype=float32), array([[ 2.97359854e-01, 6.56183460e-04, -4.73043948e-01, -1.17567085e-01, -3.42536233e-02, -3.03927213e-01, 4.69646633e-01, -3.84592921e-01, 1.52946264e-01, -1.82628393e-01, 3.07190239e-01, 1.88732699e-01, -3.68719488e-01, -3.30251426e-01, -3.66007872e-02, 4.12766099e-01], [ 1.22636884e-01, 9.78616104e-02, -2.44927496e-01, -3.78500260e-02, -3.29815060e-01, 4.54631686e-01, 1.32869394e-03, -2.15873808e-01, 1.01626828e-01, -1.52611211e-01, -3.60170454e-01, -3.46550457e-02, 3.55746113e-02, 3.10409755e-01, 3.07094425e-01, -3.89622569e-01], [ 4.18785542e-01, 6.49755746e-02, 2.65271336e-01, -1.81596532e-01, -2.55371511e-01, -8.37184638e-02, 4.29974437e-01, -1.55283764e-01, -3.39388162e-01, -3.18841726e-01, -4.97105066e-03, -2.07916439e-01, -1.47543848e-01, -8.37940574e-02, 3.37905467e-01, 3.27208400e-01], [ 1.04914896e-01, -3.00677449e-01, 2.32164890e-01, 1.62189871e-01, -1.11904912e-01, -1.14806369e-02, -3.23227465e-01, -1.23150116e-02, 2.32810229e-01, 2.10369080e-01, 1.51899308e-01, 2.40044445e-01, 1.14793181e-01, 5.89926494e-04, -8.19776803e-02, 7.19810778e-04], [-3.87102336e-01, 
3.51326197e-01, 9.02353227e-02, -2.63564795e-01, -3.27613801e-01, -2.86400300e-02, -1.87998384e-01, 3.43739748e-01, 2.73346812e-01, 2.66616821e-01, 3.51429433e-02, 4.56109941e-01, 5.48761450e-02, -3.60661447e-01, -3.88115913e-01, 2.51187414e-01], [-2.37962544e-01, 1.66401789e-01, -3.98593396e-01, 1.65419161e-01, 3.33086133e-01, 4.77736555e-02, 2.00005323e-01, -2.52376407e-01, -2.90598810e-01, -1.85996607e-01, -2.25491524e-02, 1.13793194e-01, 1.65100321e-01, -6.65912463e-04, 5.77541031e-02, -3.25353086e-01], [-1.40810832e-01, -2.48465851e-01, -1.19345643e-01, -8.56471481e-04, -2.67849237e-01, -1.44852057e-01, -9.15314704e-02, 1.34784952e-01, -1.29481718e-01, -1.04500920e-01, -1.77888229e-01, -1.47721738e-01, -2.19401658e-01, -2.23744530e-02, -2.98361719e-01, 1.45486742e-01], [ 1.69071302e-01, -3.72374713e-01, 2.83467352e-01, -1.03206985e-01, 3.67821902e-01, -1.43115878e-01, 1.25592351e-01, -3.89090292e-02, -2.01085940e-01, 1.77833766e-01, -2.91119248e-01, 3.61348659e-01, -3.43382619e-02, -3.96245480e-01, 3.98543626e-01, 4.63600516e-01], [ 4.63620007e-01, -2.45612651e-01, 3.48520666e-01, 1.46613419e-01, 1.65358827e-01, -2.95230269e-01, 4.20761257e-01, -2.00932339e-01, -1.33652672e-01, 2.92670336e-02, -1.22803524e-01, 2.40687251e-01, 3.18130404e-01, -6.91166497e-04, -3.25402856e-01, -1.30906135e-01], [ 1.62128076e-01, -1.19411573e-01, 3.45981359e-01, -3.86496191e-04, 4.05329019e-01, 1.49058387e-01, 4.43916738e-01, 5.18011861e-02, -3.05147499e-01, -3.65549386e-01, -2.54479855e-01, -1.22571457e-02, 1.56393483e-01, 5.07648513e-02, 2.26654470e-01, -3.36109191e-01], [-3.12535584e-01, -2.30290424e-02, 6.98565692e-02, -1.50468856e-01, -2.78825819e-01, 9.92865711e-02, -3.34635884e-01, 3.57187033e-01, 2.54794866e-01, 1.91722021e-01, 5.36262877e-02, 1.83799900e-02, -5.85136586e-04, 3.57504547e-01, -2.61918098e-01, 2.01858550e-01], [ 1.80302829e-01, 3.65201116e-01, 2.03263357e-01, 1.17282532e-01, 1.65266767e-01, -4.04994518e-01, -3.51655126e-01, 3.97830069e-01, -7.66607746e-02, 
-9.62971300e-02, 1.73393369e-01, -2.00297937e-01, 7.74533255e-04, -1.40481442e-01, 2.14320533e-02, 3.77951324e-01], [-2.46189404e-02, 2.93494880e-01, 3.59376967e-01, 3.20476014e-04, 3.01101089e-01, 3.21090758e-01, -3.75274122e-01, 9.95393726e-04, 2.46108666e-01, 2.64105260e-01, -1.19236402e-01, 3.77319247e-01, 6.48521120e-04, 3.39984924e-01, 2.55425870e-01, 2.54246205e-01], [ 5.12674786e-02, -1.02096912e-03, -3.70046735e-01, -3.52790147e-01, -1.98903963e-01, -1.82327494e-01, 3.54469061e-01, 1.71051875e-01, 3.73468578e-01, 1.66834593e-01, 2.45054252e-07, 4.23564501e-02, 9.42573650e-04, -1.89804733e-02, 4.39227995e-04, 9.95820481e-03], [ 2.57087111e-01, -2.79899389e-01, 2.73097128e-01, -3.69274199e-01, 1.06317475e-01, 3.90571177e-01, 1.57478735e-01, 2.42957503e-01, 4.03050303e-01, -3.74355882e-01, -2.04208896e-01, 1.89841297e-02, -3.78889889e-01, 2.43642956e-01, 2.69247919e-01, 1.17503397e-01], [ 1.01470791e-01, -2.11673021e-01, -1.81737795e-01, 3.25044870e-01, -1.60212040e-01, 2.00224802e-01, 5.87655418e-03, -3.10205370e-01, 2.09311340e-02, -2.71605730e-01, -3.22293550e-01, 7.38748312e-02, 6.16738871e-02, -8.31133649e-02, -1.60038099e-02, -7.09989516e-04]], dtype=float32), array([ 1.9303737e-02, -2.9504906e-02, -2.6506849e-02, 1.6427160e-03, 2.9263936e-02, 5.9134695e-03, 2.8158128e-02, 2.4705507e-02, 1.2207930e-02, -4.5786786e-05, 4.7801528e-05, -2.5754545e-02, 6.4422688e-03, 2.7185101e-02, 2.4097716e-02, 3.2040365e-02], dtype=float32), array([[ 0.33899024], [ 0.09035949], [-0.40873846], [ 0.44341677], [ 0.55033386], [-0.48756945], [ 0.35685787], [-0.34343976], [-0.5151725 ], [-0.15856893], [ 0.01221188], [-0.31893036], [-0.2622632 ], [-0.29004556], [ 0.08608972], [ 0.29274988]], dtype=float32), array([0.02345355], dtype=float32)]
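We can see that many of the weights are on the order of e-4, that is, far below 0.1.
So what does the sparsity of L1 mean?
Not, as is often said online, that many weights are exactly 0,
but that many weights are close to 0.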
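One plausible reason the weights end up near 0 rather than exactly 0 is that the penalty is simply added to the loss and minimized with gradient steps. A minimal one-dimensional sketch, with hypothetical values for `a`, `lam` and the step size `lr`, and plain subgradient descent standing in for the real optimizer:

```python
def sign(v):
    """Return -1, 0 or 1; used as the subgradient of abs(v)."""
    return (v > 0) - (v < 0)


# Objective: 0.5*(w - a)**2 + lam*abs(w). Since |a| < lam, its exact minimizer is w = 0.
a, lam, lr = 0.3, 0.5, 0.1   # hypothetical values chosen for illustration
w = 1.0
trajectory = []
for _ in range(200):
    grad = (w - a) + lam * sign(w)   # subgradient of the penalized objective
    w -= lr * grad
    trajectory.append(w)

tail = trajectory[-50:]
largest = max(abs(v) for v in tail)
print("last few iterates:", tail[-5:])
print("largest |w| over the last 50 steps:", largest)
```

With a fixed step size the iterate keeps stepping back and forth across 0 instead of settling on it, which matches the pattern of many small but nonzero weights in the output above.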
Then, using the same code, switch to L2 regularization and look at the output weights:
输出权重 [array([[ 5.4510499e-19, -1.1375406e-14, 1.2810104e-09, ..., 6.6694220e-13, 1.7195195e-21, 1.1844387e-18], [ 2.2681307e-02, 2.8639721e-02, -5.0795679e-03, ..., 3.2248314e-02, 3.5097659e-02, 1.9943751e-02], [ 2.9682485e-02, 3.8339857e-02, -3.8643787e-03, ..., -7.3956484e-03, 1.5394368e-02, 6.4378373e-02], ..., [-2.2290610e-03, -5.2631828e-03, -1.1894613e-02, ..., -3.2023021e-03, 9.8316949e-03, 4.5698625e-03], [-6.3545518e-03, -2.8249058e-03, -1.4715969e-02, ..., 3.9712125e-03, -2.1713993e-03, 6.4099208e-05], [ 4.3839724e-03, 7.1036047e-04, -6.1749844e-03, ..., 3.3779065e-03, 3.6998792e-04, -2.5457949e-03]], dtype=float32), array([0.01308027, 0.02083343, 0.01824512, 0.01143504, 0.00499259, 0.01486909, 0.01095271, 0.00404395, 0.02463059, 0.00872665, 0.00801992, 0.00815683, 0.01039271, 0.01561781, 0.01411563, 0.04340347], dtype=float32), array([[ 2.93407321e-01, -1.98400989e-02, -4.40114766e-01, -1.11424305e-01, -7.31087476e-02, -3.04403722e-01, 4.64353442e-01, -3.29900682e-01, 1.63630053e-01, -1.84131727e-01, 3.08276862e-01, 1.94891691e-01, -4.41589683e-01, -3.05707157e-01, -4.41319868e-02, 4.10211772e-01], [ 1.44314080e-01, 1.14107765e-01, -3.18847924e-01, -8.83864984e-02, -2.84857243e-01, 4.43122834e-01, 3.62090170e-02, -2.15172529e-01, 9.76731330e-02, -1.61776304e-01, -3.61122280e-01, -5.60393780e-02, 8.91952366e-02, 3.17961156e-01, 3.24503183e-01, -3.71593475e-01], [ 4.25948858e-01, 7.04302564e-02, 2.81940609e-01, -1.77070782e-01, -2.74864286e-01, -1.06579565e-01, 4.36641574e-01, -1.12034686e-01, -3.45022917e-01, -3.19837213e-01, -1.43970661e-02, -2.16923535e-01, -2.31076464e-01, -8.27731341e-02, 3.60185146e-01, 3.36787492e-01], [ 1.15281835e-01, -3.01896662e-01, 2.39346668e-01, 1.83167055e-01, -1.16130240e-01, -2.12356411e-02, -3.89987141e-01, -2.43074540e-02, 3.24033946e-01, 2.12604478e-01, 1.53906882e-01, 3.26046437e-01, 2.10126624e-01, 8.62302035e-02, -1.64832115e-01, 1.51150580e-02], [-3.90144706e-01, 3.52188319e-01, 2.51630321e-02, -3.43495667e-01, 
-2.54216045e-01, -3.87258083e-02, -1.94808662e-01, 2.56020427e-01, 2.74487942e-01, 2.68538356e-01, 4.05583121e-02, 4.54750240e-01, 1.47770867e-01, -3.62259477e-01, -3.83709610e-01, 2.63715029e-01], [-2.85299212e-01, 1.69331729e-01, -3.38647544e-01, 2.34549761e-01, 2.80789793e-01, 8.91473368e-02, 1.77124396e-01, -2.13072211e-01, -2.61840492e-01, -1.87434465e-01, -2.91305147e-02, 1.61795199e-01, 2.20224589e-01, 2.58004293e-02, 5.37811071e-02, -3.69709074e-01], [-1.43994346e-01, -2.49894559e-01, -2.04025045e-01, 9.80285257e-02, -2.77742773e-01, -2.26973757e-01, -8.67364928e-02, 1.37876272e-01, -2.02594623e-01, -1.06818587e-01, -1.79687336e-01, -2.55178958e-01, -2.99385250e-01, -5.59285395e-02, -2.94983864e-01, 2.53982246e-01], [ 1.84192955e-01, -3.73058200e-01, 2.97143936e-01, -1.03991538e-01, 2.90101379e-01, -1.68928549e-01, 1.39721408e-01, -1.06037809e-02, -2.18328491e-01, 1.80070952e-01, -2.92274624e-01, 3.65887821e-01, -1.29578754e-01, -3.81524444e-01, 3.92549694e-01, 4.74777371e-01], [ 4.72008854e-01, -2.47345746e-01, 3.59616429e-01, 2.52750129e-01, 1.19450204e-01, -3.10099781e-01, 4.30306435e-01, -1.62968546e-01, -1.40832901e-01, 3.73812504e-02, -1.25185639e-01, 2.44938642e-01, 3.32485586e-01, 3.53299305e-02, -3.30613941e-01, -1.35950983e-01], [ 1.85852900e-01, -9.83352214e-02, 3.41978699e-01, 8.67165811e-03, 3.46845627e-01, 1.29153579e-01, 4.64443177e-01, 1.15139224e-02, -3.25556934e-01, -3.66379201e-01, -2.55769938e-01, -2.08554789e-02, 1.51786834e-01, 5.86780272e-02, 2.55553305e-01, -3.25718910e-01], [-3.11689675e-01, -2.17617527e-02, 7.72571983e-03, -2.31704265e-01, -2.14245647e-01, 9.61495414e-02, -3.27444166e-01, 2.67990142e-01, 2.50948817e-01, 1.95652723e-01, 5.79224415e-02, 9.13931709e-03, 2.56390348e-02, 3.59594494e-01, -2.66508758e-01, 2.20254958e-01], [ 1.94646284e-01, 3.66057843e-01, 2.14457333e-01, 2.24085152e-01, 1.05342574e-01, -4.27612185e-01, -3.49920005e-01, 3.75871032e-01, -1.05346240e-01, -9.78173241e-02, 1.75176308e-01, -2.12537095e-01, 
-8.19739625e-02, -1.40039310e-01, 7.61785209e-02, 3.92508149e-01], [-4.72818352e-02, 2.94093102e-01, 2.90557832e-01, 1.40494900e-04, 2.79832035e-01, 3.35276634e-01, -3.79788160e-01, -7.71822548e-03, 2.58355170e-01, 2.66037405e-01, -1.21681616e-01, 3.90908629e-01, 6.70772269e-02, 3.55733871e-01, 2.57470012e-01, 2.55136728e-01], [ 3.27138193e-02, -2.80077597e-06, -3.42442542e-01, -4.03933018e-01, -1.91840082e-01, -1.61959320e-01, 3.39495540e-01, 1.39288634e-01, 4.23164964e-01, 1.70850694e-01, -1.47289789e-08, 9.98789370e-02, 7.75708482e-02, -7.56203895e-03, -3.06240357e-02, -6.50095474e-03], [ 2.32675731e-01, -2.35386729e-01, 2.33843103e-01, -4.32860196e-01, 7.58191794e-02, 4.14948165e-01, 1.41167402e-01, 1.70569643e-01, 4.25751954e-01, -3.75422835e-01, -2.05775797e-01, 6.12163655e-02, -3.75813574e-01, 2.67257035e-01, 2.52756864e-01, 1.00201644e-01], [ 1.61049694e-01, -2.18448177e-01, -2.63321877e-01, 4.01718348e-01, -1.76443890e-01, 2.42350683e-01, 7.09695518e-02, -3.05766135e-01, 6.77920505e-02, -2.73409277e-01, -3.23337317e-01, 1.05839022e-01, 1.21519454e-01, -8.19710717e-02, -6.41441718e-02, 1.70101207e-02]], dtype=float32), array([ 0.01596233, -0.00421972, -0.0345025 , 0.03105383, 0.00776252,
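We can see that many values here are also on the order of e-2, but with L2 regularization the 16 x 16 second-layer matrix contains very few values at e-4 or below. So,
L1 leads to sparse weights "more easily" than L2.
Note:
L1 is not the only thing that can produce sparsity.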
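To make the comparison less dependent on eyeballing, one can count the share of weights below a small threshold. The sketch below uses the fourth row of the 16 x 16 matrix from each printed output; the function name and the threshold 1e-3 are choices made here for illustration.

```python
def near_zero_fraction(weights, threshold=1e-3):
    """Fraction of weights whose absolute value is below `threshold`."""
    values = list(weights)
    return sum(1 for w in values if abs(w) < threshold) / len(values)


# Fourth row of the second-layer matrix, copied from the L1 output above.
l1_row = [1.04914896e-01, -3.00677449e-01, 2.32164890e-01, 1.62189871e-01,
          -1.11904912e-01, -1.14806369e-02, -3.23227465e-01, -1.23150116e-02,
          2.32810229e-01, 2.10369080e-01, 1.51899308e-01, 2.40044445e-01,
          1.14793181e-01, 5.89926494e-04, -8.19776803e-02, 7.19810778e-04]

# Fourth row of the second-layer matrix, copied from the L2 output above.
l2_row = [1.15281835e-01, -3.01896662e-01, 2.39346668e-01, 1.83167055e-01,
          -1.16130240e-01, -2.12356411e-02, -3.89987141e-01, -2.43074540e-02,
          3.24033946e-01, 2.12604478e-01, 1.53906882e-01, 3.26046437e-01,
          2.10126624e-01, 8.62302035e-02, -1.64832115e-01, 1.51150580e-02]

l1_fraction = near_zero_fraction(l1_row)
l2_fraction = near_zero_fraction(l2_row)
print("L1 row:", l1_fraction, "L2 row:", l2_fraction)
```

On this single row, 2 of the 16 L1 weights fall below 1e-3 and none of the L2 weights do. For a full comparison the same function could be applied to each flattened weight array of the model (for NumPy arrays, `w.ravel()`).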