spark-shell error: not found: value spark when calling spark.read.json
scala> val df3 = spark.read.json("/usr/local/tmp_files/people.json")
<console>:17: error: not found: value spark
       val df3 = spark.read.json("/usr/local/tmp_files/people.json")
                 ^

scala> import org.apache.spark
import org.apache.spark

scala> var df5=spark.read.json("/usr/local/tmp_files/people.json")
<console>:18: error: object read is not a member of package org.apache.spark
       var df5=spark.read.json("/usr/local/tmp_files/people.json")
                     ^
Searching for this and referring to "spark.read is not a member of package org.apache.spark", the answers suggest the problem is with the SparkSession (after import org.apache.spark, the name spark resolves to that package, which is why the second error message changes). They also suggest checking whether spark-shell started up without errors.
You need an instance of SparkSession, which is usually called spark (including in spark-shell). See this tutorial. So read is not a method in a package object but a method in the class SparkSession.
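In other words, the spark value in spark-shell is just a SparkSession that the shell creates for you at startup; if it was not created, you can build one yourself. A minimal sketch, assuming a Spark 2.x installation (the app name and master URL below are placeholders, not values from the original session):

import org.apache.spark.sql.SparkSession

// spark-shell normally creates a SparkSession and binds it to the name "spark";
// this builds (or reuses) one explicitly.
val spark = SparkSession.builder()
  .appName("people-json-demo")          // placeholder app name
  .master("spark://bigdata111:7077")    // or "local[*]" for a quick local test
  .getOrCreate()

// read is a method on SparkSession (returning a DataFrameReader), not on a package.
val df3 = spark.read.json("/usr/local/tmp_files/people.json")
df3.show()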
[root@bigdata111 bin]# ./spark-shell --master spark://bigdata111:7077
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
        at scala.Predef$.require(Predef.scala:224)
        at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:524)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
        at org.apache.spark.sql.SparkSession$Builder
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
        at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
        at $line3.$read
        at $line3.$read
        at scala.tools.nsc.interpreter.IMain$WrappedRequest
        at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
        at org.apache.spark.repl.SparkILoop
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
        at scala.tools.nsc.interpreter.ILoop
        at scala.tools.nsc.interpreter.ILoop
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
        ... 47 elided
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
              ^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_144)
Type in expressions to have them evaluated.
Type :help for more information.
Referring to "Can only call getServletHandlers on a running MetricsSystem", I found that all the workers in the Spark cluster had died, so I restarted the Spark cluster.
But after restarting the workers, although jps on bigdata112 showed a running Worker process, no worker appeared on the Spark cluster's web UI.
20/03/11 04:53:54 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
20/03/11 04:53:54 INFO ClientCnxn: Opening socket connection to server bigdata112/192.168.32.112:2181. Will not attempt to authenticate using SASL (unknown error)
After searching for this error and referring to "spark集群启动后WorkerUI界面看不到Workers解决" (a post on workers not showing up on the web UI after the Spark cluster starts), I found it comes down to two factors: ZooKeeper and the spark/conf/spark-env.sh configuration.
My spark-env.sh is identical on bigdata111 and bigdata112, as follows.
It is configured this way because I had previously set up Spark HA, so SPARK_MASTER_HOST and SPARK_MASTER_PORT are commented out and SPARK_DAEMON_JAVA_OPTS is added instead:
export JAVA_HOME=/opt/module/jdk1.8.0_144
#export SPARK_MASTER_HOST=bigdata111
#export SPARK_MASTER_PORT=7077
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=bigdata111:2181,bigdata112:2181,bigdata113:2181 -Dspark.deploy.zookeeper.dir=/spark"
In my case, the cause was exactly this spark-env.sh configuration: I had set up Spark HA earlier and never changed the configuration back, so with spark.deploy.recoveryMode=ZOOKEEPER the master and workers register through ZooKeeper, which was not running.
Solution: start the ZooKeeper cluster first, then start the Spark cluster; after that, spark-shell starts normally. A rough sketch of the startup order follows.
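This is a sketch assuming a typical ZooKeeper + Spark standalone layout; the ZooKeeper install path and the use of SPARK_HOME are assumptions, so adjust them to your own directories:

# 1. Start ZooKeeper on every quorum node first (assumed install path)
[root@bigdata111 ~]# /opt/module/zookeeper/bin/zkServer.sh start
[root@bigdata112 ~]# /opt/module/zookeeper/bin/zkServer.sh start
[root@bigdata113 ~]# /opt/module/zookeeper/bin/zkServer.sh start

# 2. Then start the Spark standalone master and workers from the master node
[root@bigdata111 ~]# $SPARK_HOME/sbin/start-all.sh

# 3. Only now start spark-shell against the cluster
[root@bigdata111 ~]# $SPARK_HOME/bin/spark-shell --master spark://bigdata111:7077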
One more note: if you have not configured Spark HA as I did, i.e. SPARK_MASTER_HOST and SPARK_MASTER_PORT are not commented out in your spark-env.sh, then check whether their values are correct; a sketch of such a non-HA configuration follows.
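The host name and port below are just the values used on this cluster, not required defaults:

export JAVA_HOME=/opt/module/jdk1.8.0_144
# In a non-HA standalone setup these must point at the real master host and port
export SPARK_MASTER_HOST=bigdata111
export SPARK_MASTER_PORT=7077
# ...and the ZooKeeper recovery options in SPARK_DAEMON_JAVA_OPTS are not needed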