Addendum: compiling the connector
hbase-connectors-master
Download the source (fetching the repository page itself with wget returns HTML, so pull the master branch archive instead):
wget https://github.com/apache/hbase-connectors/archive/refs/heads/master.zip
unzip master.zip && cd hbase-connectors-master
mvn --settings /Users/admin/Documents/softwares/repository-zi/settings-aliyun.xml -Dspark.version=3.3.1 -Dscala.version=2.12.10 -Dscala.binary.version=2.12 -Dhbase.version=2.4.9 -Dhadoop-three.version=3.2.0 -DskipTests clean package
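If the build succeeds, the connector jar should appear under spark/hbase-spark/target/hbase-spark-1.0.1-SNAPSHOT.jar in the source tree; this is the jar copied in step 1 below.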
Add the following repository configuration to pom.xml:
<repositories>
  <repository>
    <id>central</id>
    <name>Maven Repository</name>
    <url>https://repo.maven.apache.org/maven2</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
1. Copy the required jars onto the Spark 3 classpath
cp /opt/cloudera/parcels/CDH/lib/hbase/hbase-shaded-netty-2.2.1.jar /opt/cloudera/parcels/CDH/lib/spark3/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/hbase-shaded-protobuf-2.2.1.jar /opt/cloudera/parcels/CDH/lib/spark3/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol-shaded-2.1.0-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark3/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/hbase-shaded-miscellaneous-2.2.1.jar /opt/cloudera/parcels/CDH/lib/spark3/jars/
cp hbase-spark-1.0.1-SNAPSHOT.jar /opt/cloudera/parcels/CDH/lib/spark3/jars/
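The shaded HBase jars supply client classes the connector needs at runtime but that are not on the Spark 3 parcel's classpath by default.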
2. Test with a Scala script (spark-shell)
spark-shell -c spark.ui.port=11111
import org.apache.hadoop.hbase.spark.HBaseContext
import org.apache.hadoop.hbase.HBaseConfiguration

// Register an HBaseContext so the data source can reuse its connection
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "hadoop103:2181")
new HBaseContext(spark.sparkContext, conf)

// Map the HBase row key and columns to DataFrame fields
val hbaseDF = spark.read.format("org.apache.hadoop.hbase.spark")
  .option("hbase.columns.mapping",
    "empno STRING :key, ename STRING info:ename, job STRING info:job")
  .option("hbase.table", "hive_hbase_emp_table")
  .load()
hbaseDF.show()
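Writes go through the same data source. Below is a minimal write-back sketch, run in the same spark-shell session and assuming the table and column mapping above; the row (7999, TEST, CLERK) is made-up test data:

import spark.implicits._

// Made-up test row; empno becomes the HBase row key per the mapping
val outDF = Seq(("7999", "TEST", "CLERK")).toDF("empno", "ename", "job")
outDF.write.format("org.apache.hadoop.hbase.spark")
  .option("hbase.columns.mapping",
    "empno STRING :key, ename STRING info:ename, job STRING info:job")
  .option("hbase.table", "hive_hbase_emp_table")
  .save()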
3. Test hbase-connectors from a PySpark script
import findspark
# Point findspark at the CDH Spark 3 parcel and the Anaconda interpreter
findspark.init(spark_home='/opt/cloudera/parcels/CDH/lib/spark3',
               python_path='/opt/cloudera/anaconda3/bin/python')
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Demo_spark_conn").getOrCreate()

df = (spark.read.format("org.apache.hadoop.hbase.spark")
      .option("hbase.zookeeper.quorum", "hadoop103:2181")
      .option("hbase.columns.mapping",
              "empno STRING :key, ename STRING info:ename, job STRING info:job")
      .option("hbase.table", "hive_hbase_emp_table")
      .option("hbase.spark.use.hbasecontext", False)
      .load())
df.show(10)
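Note that, unlike the Scala test, no HBaseContext is registered first: with hbase.spark.use.hbasecontext set to False, the connector builds its own connection from the hbase.zookeeper.quorum option.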