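The implementation is located in the ml/tree/impl/ directory. The random forest algorithm under the mllib directory also calls the RandomForest under ml. ml is the newer, DataFrame-based implementation and is intended to replace the RDD-based mllib library in the future.
RandomForest core code
The train method
Nodes that still need to be computed are pushed onto a stack. Each iteration takes a group of nodes from the stack, selects the sampled data (and, when subsampling, the features) used to compute them, computes the splits for those nodes, and repeats until the stack is empty.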
while (nodeStack.nonEmpty) {
  // Collect some nodes to split, and choose features for each node (if subsampling).
  // Each group of nodes may come from one or multiple trees, and at multiple levels.
  val (nodesForGroup, treeToNodeToIndexInfo) =
    RandomForest.selectNodesToSplit(nodeStack, maxMemoryUsage, metadata, rng)
  // Sanity check (should never occur):
  assert(nodesForGroup.nonEmpty,
    s"RandomForest selected empty nodesForGroup. Error for unknown reason.")
  // Only send trees to worker if they contain nodes being split this iteration.
  val topNodesForGroup: Map[Int, LearningNode] =
    nodesForGroup.keys.map(treeIdx => treeIdx ->
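The control flow of this loop can be modeled without Spark. The following is only a minimal sketch under stated assumptions, not Spark's implementation: SketchNode, StackLoopSketch, CostPerNode and the two-argument selectNodesToSplit are hypothetical names invented for illustration, the memory cost per node is assumed to be a fixed constant, and the actual split computation is replaced by a placeholder that gives every node above maxDepth two children.

```scala
// Hypothetical stand-in for Spark's LearningNode: just enough state for the sketch.
final case class SketchNode(treeIndex: Int, id: Int, depth: Int)

object StackLoopSketch {
  // Assumed fixed memory cost per node; the real code estimates this from metadata.
  val CostPerNode = 4L

  // Takes nodes off the top of the stack until the memory budget is used up,
  // grouping them by tree index. At least one node is always taken so that
  // the outer loop makes progress. Returns the group and the remaining stack.
  def selectNodesToSplit(
      stack: List[SketchNode],
      maxMemoryUsage: Long): (Map[Int, List[SketchNode]], List[SketchNode]) = {
    var remaining = stack
    var group = Map.empty[Int, List[SketchNode]]
    var used = 0L
    while (remaining.nonEmpty && (group.isEmpty || used + CostPerNode <= maxMemoryUsage)) {
      val node = remaining.head
      remaining = remaining.tail
      group = group.updated(node.treeIndex, node :: group.getOrElse(node.treeIndex, Nil))
      used += CostPerNode
    }
    (group, remaining)
  }

  // Runs the stack-driven loop and returns the number of nodes handled per iteration.
  def train(numTrees: Int, maxDepth: Int, maxMemoryUsage: Long): List[Int] = {
    // The stack starts with one root node (id 1, depth 0) per tree.
    var nodeStack: List[SketchNode] = List.tabulate(numTrees)(t => SketchNode(t, 1, 0))
    var groupSizes = List.empty[Int]
    while (nodeStack.nonEmpty) {
      val (nodesForGroup, rest) = selectNodesToSplit(nodeStack, maxMemoryUsage)
      assert(nodesForGroup.nonEmpty, "selected an empty group of nodes")
      nodeStack = rest
      val nodes = nodesForGroup.values.flatten.toList
      groupSizes = nodes.size :: groupSizes
      // Placeholder for the split computation: push two children for every
      // node that has not yet reached maxDepth (child ids are 2*id and 2*id + 1).
      for (n <- nodes if n.depth < maxDepth) {
        nodeStack = SketchNode(n.treeIndex, 2 * n.id, n.depth + 1) ::
          SketchNode(n.treeIndex, 2 * n.id + 1, n.depth + 1) :: nodeStack
      }
    }
    groupSizes.reverse
  }
}
```

In this sketch a larger maxMemoryUsage means more nodes are handled per iteration, and a group may contain nodes from several trees, which matches the comment in the excerpt that each group of nodes may come from one or multiple trees.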