
Comparing StratifiedKFold and KFold

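K-fold cross-validation (KFold) works as follows:

  • Split the full training set S into k disjoint subsets. If S contains m training examples, each subset holds about m/k of them; call the subsets {s1, s2, ..., sk}.

  • In each round, take one of the subsets as the test set and use the other k-1 subsets as the training set.

  • Train the model on the k-1 training subsets and evaluate it on the held-out test subset. After all k rounds, average the k accuracy scores and use that mean as the estimate of the true accuracy of the model (or hypothesis function).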
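The three steps above can be sketched in plain NumPy. This is a minimal illustration, not scikit-learn's implementation: the helper names `k_fold_indices`, `cross_validate`, and `majority_class` are hypothetical, and the toy learner simply predicts the most frequent training label.

```python
import numpy as np

def k_fold_indices(m, k):
    """Split the indices 0..m-1 into k disjoint, nearly equal consecutive folds."""
    return np.array_split(np.arange(m), k)

def cross_validate(fit_predict, X, y, k):
    """Run k train/test rounds and return the mean accuracy."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, test_idx in enumerate(folds):
        # The other k-1 folds together form the training set for this round.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(np.mean(y_pred == y[test_idx]))
    return float(np.mean(scores))

def majority_class(X_train, y_train, X_test):
    """Toy learner (hypothetical): always predict the most frequent training label."""
    values, counts = np.unique(y_train, return_counts=True)
    return np.full(len(X_test), values[np.argmax(counts)])

X = np.arange(16).reshape(8, 2)
y = np.array([1, 1, 0, 0, 1, 1, 0, 0])
mean_accuracy = cross_validate(majority_class, X, y, k=4)
print(mean_accuracy)
```

With these labels every consecutive test fold contains a single class, so the training data in each round is skewed toward the other class and the toy learner gets every test sample wrong. That is the kind of distortion stratified splitting is meant to avoid.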


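StratifiedKFold is used in much the same way as KFold, but it performs stratified sampling, so the proportion of each class in every training set and test set matches that of the original dataset (as closely as the fold sizes allow).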
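A quick way to check this property is to compare the class ratio inside each fold with the ratio in the full label vector. The sketch below assumes a made-up dataset with a 2:1 class imbalance; the feature matrix is a dummy, since only the labels affect the split.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 8 samples of class 0 and 4 of class 1 (ratio 2:1).
y = np.array([0] * 8 + [1] * 4)
X = np.zeros((len(y), 1))  # dummy features

overall_ratio = float(np.mean(y))  # fraction of class 1 in the whole dataset

skf = StratifiedKFold(n_splits=4)
train_ratios, test_ratios = [], []
for train_idx, test_idx in skf.split(X, y):
    # Fraction of class 1 inside this round's training and test sets.
    train_ratios.append(float(np.mean(y[train_idx])))
    test_ratios.append(float(np.mean(y[test_idx])))

print(overall_ratio, train_ratios, test_ratios)
```

Here the class counts divide evenly by the number of folds, so every training and test set reproduces the overall ratio exactly. When they do not divide evenly, the ratios are only approximately equal.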

Parameters

  • n_splits : int, default=3 (changed to 5 in scikit-learn 0.22)

Number of folds. Must be at least 2.

  • shuffle : boolean, optional

Whether to shuffle each stratification of the data before splitting into batches.

  • random_state : int, RandomState instance or None, optional, default=None

If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by `np.random`. Used only when ``shuffle`` == True.

import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# 8 samples with 4 features each
X = np.array([
    [1, 2, 3, 4],
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
    [41, 42, 43, 44],
    [51, 52, 53, 54],
    [61, 62, 63, 64],
    [71, 72, 73, 74]
])
# Class labels: four samples of class 1 and four of class 0
y = np.array([1, 1, 0, 0, 1, 1, 0, 0])

# random_state only matters when shuffle=True; recent scikit-learn versions
# raise an error if it is set while shuffle=False, so it is omitted here.
sfolder = StratifiedKFold(n_splits=4, shuffle=False)
kfolder = KFold(n_splits=4, shuffle=False)

# StratifiedKFold uses y to keep the class ratio in every fold
for train, test in sfolder.split(X, y):
    print('Train: %s | test: %s' % (train, test))
print(" ")

# KFold ignores y and cuts the data into consecutive blocks
for train, test in kfolder.split(X, y):
    print('Train: %s | test: %s' % (train, test))
print(" ")

[Figure: printed train/test indices from the code above, StratifiedKFold output (left) and KFold output (right)]
