Reading data in Spark with the jdbc format
The JDBC driver jar has to be on Spark's classpath, e.g. copied into Spark's jars directory.
Mind whether Spark is running as a single node or as a cluster:
on a cluster, the jar has to be copied to every node.
There are a few ways to load the jar:
1. When starting spark-shell, pass --jars
[root@hadoop01 spark-2.2.0-bin-hadoop2.7]#
bin/spark-shell --jars mysql-connector-java-5.1.7-bin.jar --driver-class-path mysql-connector-java-5.1.7-bin.jar  (the full path must be used, as below)
bin/spark-shell --jars /usr/local/spark-2.2.0-bin-hadoop2.7/mysql-connector-java-5.1.7-bin.jar --driver-class-path /usr/local/spark-2.2.0-bin-hadoop2.7/mysql-connector-java-5.1.7-bin.jar
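Once the shell is up, a quick sanity check is to load the driver class by hand (a minimal sketch; com.mysql.jdbc.Driver is the driver class name for Connector/J 5.1):

// Paste into spark-shell: returns the Class object if the jar is on the
// driver classpath, throws ClassNotFoundException otherwise.
Class.forName("com.mysql.jdbc.Driver")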
2. Configure everything through option()
val jdbcDF = spark.read.format("jdbc")
.option("driver","com.mysql.jdbc.Driver")
.option("url", "jdbc:mysql//hadoop01:3306/test")
.option("dbtable", "u")
.option("user","root")
.option("password","root").load()
But in the end it made no difference:
scala> val jdbcDF = spark.read.format("jdbc").option("url", "jdbc:mysql//hadoop01:3306/test").option("dbtable", "u").option("user","root").option("password","root").load()
java.sql.SQLException: No suitable driver
at java.sql.DriverManager.getDriver(DriverManager.java:315)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:83)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
... 48 elided
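What this exception means: when no "driver" option is given, Spark asks java.sql.DriverManager for a driver that accepts the URL, and DriverManager throws "No suitable driver" when no registered driver recognizes it. A minimal sketch of that lookup (simplified from what JDBCOptions.scala:84 does under the hood):

import java.sql.DriverManager
// getDriver asks each registered driver whether it accepts the URL.
// It throws SQLException("No suitable driver") both when the driver jar
// is missing AND when the URL itself is malformed, e.g. "jdbc:mysql//..."
// with the colon after "mysql" missing.
val drv = DriverManager.getDriver("jdbc:mysql://hadoop01:3306/test")
println(drv.getClass.getName)  // com.mysql.jdbc.Driver once Connector/J is loaded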
Then I cp'd the MySQL jar into Spark's jars directory, but after restarting, the same snippet failed with the same error. Relaunching the shell:
[root@hadoop01 spark-2.2.0-bin-hadoop2.7]#
bin/spark-shell --jars mysql-connector-java-5.1.7-bin.jar
Error:
java.io.FileNotFoundException: Jar /usr/local/spark-2.2.0-bin-hadoop2.7/mysql-connector-java-5.1.7-bin.jar not found
So spark-shell resolves a relative --jars path against the Spark root directory, not the jars directory. After cp'ing the MySQL driver jar into the Spark root as well, the shell started successfully.
Note: the error below is about node 112, while I started spark-shell on node 111.
scala> 19/11/19 09:33:58 ERROR TaskSchedulerImpl: Lost executor 0 on 192.168.37.112: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
So: the Spark cluster was running, and the spark-shell I started was not necessarily talking to the local Spark on 111 but to executors on other nodes, and those nodes had no MySQL driver. That is why the error persisted no matter what I changed on node 111. A nasty trap.
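A quick way to test this theory from the shell is to ask an executor to load the driver class (a sketch; it runs a single task on whichever executor the scheduler picks):

// Returns "driver found" only if the executor that ran the task
// can see the MySQL driver jar on its classpath.
spark.sparkContext.parallelize(Seq(1), 1).map { _ =>
  try { Class.forName("com.mysql.jdbc.Driver"); "driver found" }
  catch { case _: ClassNotFoundException => "driver missing" }
}.collect().foreach(println)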
Even after that fix it still errored. Unresolved at the time.
Spark SQL has two ways to read over JDBC, and the first one failed no matter what I changed:
val jdbcDF = spark.read.format("jdbc").option("driver","com.mysql.jdbc.Driver").option("url", "jdbc:mysql//hadoop01:3306/test").option("dbtable", "u").option("user","root").option("password","root").load()
val jdbcDF = spark.read.format("jdbc").option("url", "jdbc:mysql//hadoop01:3306/test").option("dbtable", "u").option("user","root").option("password","root").load()
val jdbcDF = spark.read.format("jdbc")
.option("url", "jdbc:mysql//hadoop01:3306/test")
.option("dbtable", "u")
.option("user","root")
.option("password","root")
.load()
val connectionProperties = new java.util.Properties()
connectionProperties.put("user", "root")
connectionProperties.put("password", "root")
val jdbcDF2 = spark.read.jdbc("jdbc:mysql://hadoop01:3306/test", "u", connectionProperties)
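Comparing the two, the URL in the failing format("jdbc") calls is jdbc:mysql//hadoop01:3306/test, missing the colon after mysql, while the spark.read.jdbc call that works uses jdbc:mysql://hadoop01:3306/test. Since DriverManager matches drivers by URL prefix, the malformed URL alone would produce "No suitable driver" even with the jar in place, so it is very likely the real culprit here. A corrected sketch of the first form (assuming the driver jar is on both the driver and executor classpaths):

val jdbcDF = spark.read.format("jdbc")
  .option("driver", "com.mysql.jdbc.Driver")        // optional once the jar is on the classpath
  .option("url", "jdbc:mysql://hadoop01:3306/test") // note the "://"
  .option("dbtable", "u")
  .option("user", "root")
  .option("password", "root")
  .load()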
The full spark-shell session:

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val jdbcDF = spark.read.format("jdbc").option("url", "jdbc:mysql//hadoop01:3306/test").option("dbtable", "u").option("user","root").option("password","root").load()
java.sql.SQLException: No suitable driver
  at java.sql.DriverManager.getDriver(DriverManager.java:315)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:83)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
  ... 48 elided

scala> val jdbcDF = spark.read.format("jdbc").option("driver","com.mysql.jdbc.Driver").option("url", "jdbc:mysql//hadoop01:3306/test").option("dbtable", "u").option("user","root").option("password","root").load()
java.lang.NullPointerException
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:72)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
  ... 48 elided

scala> val connectionProperties = new java.util.Properties()
connectionProperties: java.util.Properties = {}

scala> connectionProperties.put("user", "root")
res0: Object = null

scala> connectionProperties.put("password", "root")
res1: Object = null

scala> val jdbcDF2 = spark.read.jdbc("jdbc:mysql://hadoop01:3306/test", "u", connectionProperties)
jdbcDF2: org.apache.spark.sql.DataFrame = [id: int, name: string]

scala> val jdbcDF = spark.read.format("jdbc")
jdbcDF: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@399ac1a3

scala> .option("url", "jdbc:mysql//hadoop01:3306/test")
res2: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@399ac1a3

scala> .option("dbtable", "u")
res3: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@399ac1a3

scala> .option("user","root")
res4: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@399ac1a3

scala> .option("password","root")
res5: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@399ac1a3

scala> .load()
java.sql.SQLException: No suitable driver
  at java.sql.DriverManager.getDriver(DriverManager.java:315)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:83)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
  ... 48 elided

scala>
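Once a read succeeds, as jdbcDF2 does above, a quick check that the DataFrame really maps the table (a sketch; the id/name columns come from the transcript):

jdbcDF2.printSchema()  // per the transcript: id int, name string
jdbcDF2.show()         // fetches rows from MySQL over JDBC and prints them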