Because the final outcome has only two values, yes and no, deciding whether to play golf takes at most 1 bit of information (entropy, or uncertainty). With 9 yes and 5 no among the 14 records below, the entropy is about 0.940 bits. Building a decision tree is the process of using the weather attributes to remove that uncertainty, that is, to reduce the entropy.

@relation weather.symbolic

@attribute outlook {sunny, overcast, rainy}
@attribute temperature {hot, mild, cool}
@attribute humidity {high, normal}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
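To make the entropy figure concrete, here is a minimal sketch that computes the Shannon entropy of the play column. The function name `entropy` and the variable names are illustrative choices, not part of the original post; the labels are copied from the 14 records in the order they appear.

```python
from collections import Counter
from math import log2

# Class labels (the "play" column) of the 14 records, in their original order.
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]

def entropy(values):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(values)
    counts = Counter(values)
    return -sum((n / total) * log2(n / total) for n in counts.values())

root_entropy = entropy(labels)
print(f"{root_entropy:.3f}")  # 0.940
```

A perfectly balanced yes/no split would give exactly 1 bit; the 9-to-5 split here is slightly more predictable, so the entropy is a little lower.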
After splitting, some subsets become purer. For example, when outlook = overcast every record is yes, so that subset has entropy 0, which lowers the overall entropy (the average of the subset entropies, weighted by subset size). The records grouped by outlook:

sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
sunny,mild,normal,TRUE,yes

overcast,hot,high,FALSE,yes
overcast,cool,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes

rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
rainy,mild,normal,FALSE,yes
rainy,mild,high,TRUE,no
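The effect of the split can be checked with a short sketch. It groups the records by outlook, computes each subset's entropy, and takes the size-weighted average; the difference from the entropy before the split is the information gain. The names `entropy`, `groups`, `weighted`, and `gain` are illustrative, and only the outlook and play columns are kept.

```python
from collections import Counter, defaultdict
from math import log2

# (outlook, play) pairs from the 14 records; the other attributes are omitted.
rows = [
    ("sunny", "no"), ("sunny", "no"), ("sunny", "no"),
    ("sunny", "yes"), ("sunny", "yes"),
    ("overcast", "yes"), ("overcast", "yes"),
    ("overcast", "yes"), ("overcast", "yes"),
    ("rainy", "yes"), ("rainy", "yes"), ("rainy", "no"),
    ("rainy", "yes"), ("rainy", "no"),
]

def entropy(values):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(values)
    counts = Counter(values)
    return -sum((n / total) * log2(n / total) for n in counts.values())

# Collect the play labels of each outlook subset.
groups = defaultdict(list)
for outlook, play in rows:
    groups[outlook].append(play)

# Average of the subset entropies, weighted by subset size.
weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())

# Information gain: entropy before the split minus entropy after it.
gain = entropy([play for _, play in rows]) - weighted

print(f"{weighted:.3f}")  # 0.694
print(f"{gain:.3f}")      # 0.247
```

The sunny and rainy subsets each keep about 0.971 bits of uncertainty, while the pure overcast subset contributes 0, which is why the weighted average drops below the 0.940 bits of the full dataset.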