I am facing a problem while running a Spark job using Python, i.e. PySpark.
Please see the code snippet below:
from pyspark.sql import SparkSession
from os.path import abspath
from pyspark.sql.functions import max, min, sum, col
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("test")
         .config("spark.driver.extraClassPath", "/usr/dt/mssql-jdbc-6.4.0.jre8.jar")
         .getOrCreate())
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
spark.conf.set("spark.sql.session.timeZone", "Etc/UTC")

warehouse_loc = abspath('spark-warehouse')

# Loading data from MS SQL Server 2017
df = spark.read.format("jdbc").options(
    url="jdbc:sqlserver://10.90.3.22;DATABASE=TransTrak_V_1.0;user=sa;password=m2m@ipcl1234",
    properties={"driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"},
    dbtable="Current_Voltage"
).load()
When I run this code, I get the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling o38.load.
: java.sql.SQLException: No suitable driver
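(For reference: the properties= keyword above is only understood by spark.read.jdbc; passed to format("jdbc").options() it just becomes a literal option named "properties", so the JDBC source never sees the driver class. A minimal sketch, using the same connection details, that passes the driver as its own option:)

# Sketch: supply the driver class name explicitly via the "driver" option,
# so the JDBC data source does not have to guess it from the URL.
df = (spark.read.format("jdbc")
      .option("url", "jdbc:sqlserver://10.90.3.22;DATABASE=TransTrak_V_1.0")
      .option("dbtable", "Current_Voltage")
      .option("user", "sa")
      .option("password", "m2m@ipcl1234")
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())

(Alternatively, spark.read.jdbc(url, table, properties={...}) accepts the properties dict directly.)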
The same code used to run fine earlier. However, for various reasons I had to reinstall CentOS 7 and then Python 3.6, and I have set Python 3.6 as the default Python for Spark, i.e. when I start pyspark the default Python is 3.6.
Just to mention, the system default Python is Python 2.7, and I am using CentOS 7.
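(One common way to pin the Python interpreter that PySpark uses, shown here as a sketch; the path /usr/bin/python3.6 is an assumption about this machine and must be set before the SparkSession is created:)

import os
# Assumption: Python 3.6 lives at /usr/bin/python3.6 on this CentOS 7 box.
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3.6"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/bin/python3.6"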
What is going wrong here? Can anybody please help with this?
Solution
OK, so after a long search, it appears that Spark probably does not work properly with OpenJDK, i.e. java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64. When I check the default Java, I see the following:
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
I then tried to install Oracle JDK 8 from the official site; however, I ran into separate issues there.
So, in a nutshell, I am still not able to run the Spark jobs as before.
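(A sketch that can help tell a classpath problem apart from a JDK problem: try loading the driver class in the same pyspark session. This goes through PySpark's internal py4j gateway (_jvm), so treat it as a debugging aid rather than a public API.)

# If this raises a ClassNotFoundException, the jar given in
# spark.driver.extraClassPath was not picked up by the driver JVM
# (e.g. the path changed after the OS reinstall); if it succeeds,
# the "No suitable driver" error has some other cause.
spark.sparkContext._jvm.java.lang.Class.forName(
    "com.microsoft.sqlserver.jdbc.SQLServerDriver")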