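When I was at LeTV (乐视网), a colleague on my team used logistic regression in their data-mining work, and recently I used some spare time to look into how logistic regression works. I mainly followed the article at https://www.cnblogs.com/pinard/p/6029432.html, which I found quite clearly written.
That article does not derive K-class (multinomial) logistic regression in detail, so I worked through the derivation myself; it turned out to be fairly simple. (It has been so long since I wrote anything by hand that I feel I have forgotten how to hold a pen...)
The derivation is shown in the figure: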
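For readers who want the steps in text form, here is a sketch of the standard K-class derivation. It is not a transcription of the handwritten figure, and the notation is an assumption: class K is taken as the reference class, \theta_k is the parameter vector for class k, and (x_i, y_i), i = 1..m, are the training samples.

```latex
% Assumption: class K is the reference (pivot) class; the log-odds of every
% other class against class K are linear in x.
\begin{aligned}
\ln\frac{P(y=k\mid x)}{P(y=K\mid x)} &= \theta_k^{T}x, \qquad k = 1,\dots,K-1 \\
% Requiring the K probabilities to sum to 1 gives:
P(y=k\mid x) &= \frac{e^{\theta_k^{T}x}}{1+\sum_{t=1}^{K-1}e^{\theta_t^{T}x}}, \qquad k = 1,\dots,K-1 \\
P(y=K\mid x) &= \frac{1}{1+\sum_{t=1}^{K-1}e^{\theta_t^{T}x}} \\
% Log-likelihood over the m samples:
\ell(\theta) &= \sum_{i=1}^{m}\left[\sum_{k=1}^{K-1}\mathbf{1}\{y_i=k\}\,\theta_k^{T}x_i
  -\ln\left(1+\sum_{t=1}^{K-1}e^{\theta_t^{T}x_i}\right)\right] \\
% Gradient with respect to one class's parameters:
\frac{\partial \ell}{\partial \theta_k} &= \sum_{i=1}^{m}\left(\mathbf{1}\{y_i=k\}-P(y=k\mid x_i)\right)x_i
\end{aligned}
```

With K = 2 this reduces to the familiar binary case, P(y=1|x) = e^{\theta^T x} / (1 + e^{\theta^T x}).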
Then I ran the LogisticRegressionWithLBFGSExample example that ships with Spark.
The source code is as follows:
import org.apache.spark.{SparkConf, SparkContext}
// $example on$
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithLBFGS}
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
// $example off$

object LogisticRegressionWithLBFGSExample {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LogisticRegressionWithLBFGSExample")
    val sc = new SparkContext(conf)

    // $example on$
    // Load training data in LIBSVM format.
    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")

    // Split data into training (60%) and test (40%).
    val splits = data.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0).cache()
    val test = splits(1)

    // Run training algorithm to build the model
    val model = new LogisticRegressionWithLBFGS()
      .setNumClasses(10)
      .run(training)

    // Compute raw scores on the test set.
    val predictionAndLabels = test.map { case LabeledPoint(label, features) =>
      val prediction = model.predict(features)
      (prediction, label)
    }

    // Get evaluation metrics.
    val metrics = new MulticlassMetrics(predictionAndLabels)
    val accuracy = metrics.accuracy
    println(s"Accuracy = $accuracy")

    // Save and load model
    model.save(sc, "target/tmp/scalaLogisticRegressionWithLBFGSModel")
    val sameModel = LogisticRegressionModel.load(sc,
      "target/tmp/scalaLogisticRegressionWithLBFGSModel")
    // $example off$

    sc.stop()
  }
}
// scalastyle:on println
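To connect the K-class formula with what a call like `model.predict(features)` has to compute, here is a minimal Spark-free sketch of prediction with a reference class. It is only an illustration, not Spark's implementation: the object `PivotSoftmaxSketch` and its methods are hypothetical names, and class 0 is assumed to be the reference class (as I understand it, MLlib's multinomial model is also described as K-1 models regressed against the first class).

```scala
// Illustrative only: none of these names are Spark API.
object PivotSoftmaxSketch {

  // weights(k) is the parameter vector for class k + 1.
  // Class 0 is the reference class, so its margin is fixed at 0.
  def margins(weights: Array[Array[Double]], x: Array[Double]): Array[Double] =
    weights.map(w => w.zip(x).map { case (wi, xi) => wi * xi }.sum)

  // Probabilities for classes 0..K-1; index 0 is the reference class.
  def probabilities(weights: Array[Array[Double]], x: Array[Double]): Array[Double] = {
    val m = 0.0 +: margins(weights, x)
    val maxM = m.max                        // subtract the max for numerical stability
    val exps = m.map(v => math.exp(v - maxM))
    val total = exps.sum
    exps.map(_ / total)
  }

  // The predicted class is the index with the largest probability.
  def predict(weights: Array[Array[Double]], x: Array[Double]): Int = {
    val p = probabilities(weights, x)
    p.indexOf(p.max)
  }

  def main(args: Array[String]): Unit = {
    // Made-up weights: 3 classes, 2 features.
    val weights = Array(Array(1.0, -1.0), Array(-1.0, 1.0))
    val x = Array(2.0, 0.5)
    println(probabilities(weights, x).mkString(", "))
    println(predict(weights, x))
  }
}
```

For the made-up input above, the margins are 0 (class 0), 1.5 (class 1), and -1.5 (class 2), so the sketch picks class 1. Because the exponential is monotonic, taking the largest margin gives the same answer as taking the largest probability; the probabilities are only needed when the scores themselves matter.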