
Spark 3.5.1 + Hadoop 3.4.0 + Hive 4.0 Distributed Cluster Installation and Configuration

For the Hadoop installation, see:

Hadoop 3.4.0 + HBase 2.5.8 + ZooKeeper 3.8.4 + Hive 4.0 + Sqoop distributed high-availability cluster deployment (Big Data Series, Part 2) - CSDN Blog

I Download:

Downloads | Apache Spark
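From the downloads page, pick the package prebuilt for Hadoop 3. One way to fetch it directly is from the Apache archive (any Apache mirror works just as well):

  # download the prebuilt Spark 3.5.1 package for Hadoop 3
  wget https://archive.apache.org/dist/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3.tgz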

1 Download Maven: Maven – Welcome to Apache Maven

  # Maven installation and configuration
  wget https://dlcdn.apache.org/maven/maven-3/3.8.8/binaries/apache-maven-3.8.8-bin.tar.gz
  # extract and move into place
  tar zxvf apache-maven-3.8.8-bin.tar.gz
  mv apache-maven-3.8.8/ /usr/local/maven
  # append the following to /etc/profile
  vi /etc/profile
  export MAVEN_HOME=/usr/local/maven
  export PATH=$PATH:$MAVEN_HOME/bin
  # reload the environment
  source /etc/profile
  # check the version
  [root@slave13 soft]# mvn --version
  Apache Maven 3.8.8 (4c87b05d9aedce574290d1acc98575ed5eb6cd39)
  Maven home: /usr/local/maven
  Java version: 1.8.0_191, vendor: Oracle Corporation, runtime: /usr/local/jdk/jre
  Default locale: en_US, platform encoding: UTF-8
  OS name: "linux", version: "4.18.0-348.el8.x86_64", arch: "amd64", family: "unix"
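Note that Maven is only needed if you intend to build Spark from source; the steps below use the prebuilt spark-3.5.1-bin-hadoop3 tarball, which does not require it. For reference, a minimal sketch of a source build, with profile flags taken from the upstream "Building Spark" docs (adjust profiles to your Hadoop/Hive setup):

  # optional: build Spark from source with Hive and YARN support
  cd spark-3.5.1
  ./build/mvn -DskipTests clean package -Phive -Phive-thriftserver -Pyarn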

2 Download Scala: Scala 2.13.14 | The Scala Programming Language

  # extract
  tar zxvf scala-2.13.14.tgz
  sudo mv scala-2.13.14/ /usr/local/scala
  # append the following to /etc/profile
  sudo vi /etc/profile
  export SCALA_HOME=/usr/local/scala
  export PATH=$PATH:$SCALA_HOME/bin
  source /etc/profile
  # check the version
  scala -version
  Scala code runner version 2.13.14 -- Copyright 2002-2024, LAMP/EPFL and Lightbend, Inc.
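Note that the prebuilt spark-3.5.1-bin-hadoop3 tarball used below ships its own bundled Scala runtime (2.12 by default), so this system-wide Scala mainly serves local tooling. As an optional smoke test of the installation, a one-liner sketch (the hostname lookup just confirms which node you are on):

  # run a one-line Scala expression to confirm the interpreter works
  scala -e 'println("Scala OK on " + java.net.InetAddress.getLocalHost.getHostName)'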

3 Install Spark

  # extract
  tar zxvf spark-3.5.1-bin-hadoop3.tgz
  sudo mv spark-3.5.1-bin-hadoop3/ /usr/local/spark/
  # configure environment variables (repeat on slave12 and slave13)
  sudo vi /etc/profile
  export SPARK_HOME=/usr/local/spark
  export PATH=$PATH:$SPARK_HOME/bin
  export PATH=$PATH:$SPARK_HOME/sbin
  source /etc/profile
  # configure spark-env.sh
  cd /usr/local/spark/conf/
  cp spark-env.sh.template spark-env.sh
  vim spark-env.sh
  # add the following lines:
  export JAVA_HOME=/usr/local/jdk
  export SCALA_HOME=/usr/local/scala
  export HADOOP_CONF_DIR=/data/hadoop/etc/hadoop/
  export SPARK_MASTER_HOST=master11
  export SPARK_LIBRARY_PATH=/usr/local/spark/jars
  export SPARK_WORKER_MEMORY=2048m
  export SPARK_WORKER_CORES=2
  export SPARK_MASTER_PORT=7077
  export SPARK_MASTER_WEBUI_PORT=8082
  export SPARK_DIST_CLASSPATH=$(/data/hadoop/bin/hadoop classpath)
  # edit the workers file
  cp workers.template workers
  vim workers
  slave12
  slave13
  # distribute the files to slave12 and slave13
  scp -r /usr/local/spark/ slave12:/usr/local/
  scp -r /usr/local/spark/ slave13:/usr/local/
  scp -r /usr/local/scala/ slave12:/usr/local/
  scp -r /usr/local/scala/ slave13:/usr/local/
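After distributing, it is worth confirming that both directories and the profile entries landed on each worker. A minimal sketch, assuming passwordless ssh is already set up (the scp commands above rely on it too):

  # verify the Spark and Scala directories and profile entries on each worker
  for h in slave12 slave13; do
    echo "== $h =="
    ssh "$h" 'ls -d /usr/local/spark /usr/local/scala && grep SPARK_HOME /etc/profile'
  done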

II Startup

  # start from master11
  [root@master11 ~]# /usr/local/spark/sbin/start-all.sh
  # error
  Error: A JNI error has occurred, please check your installation and try again
  Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
      at java.lang.Class.getDeclaredMethods0(Native Method)
      at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
      at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
      at java.lang.Class.getMethod0(Class.java:3018)
      at java.lang.Class.getMethod(Class.java:1784)
      at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
      at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
  Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
      at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      ... 7 more
  # fix: download the missing slf4j jars into Spark's jars directory
  cd /usr/local/spark/jars/
  wget https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.9/slf4j-api-1.7.9.jar
  wget https://repo1.maven.org/maven2/org/slf4j/slf4j-nop/1.7.9/slf4j-nop-1.7.9.jar
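Since the workers were synced via scp before this fix, they will be missing the two jars as well. A small sketch to push them out (hostnames as configured above):

  # copy the slf4j jars to each worker's jars directory
  for h in slave12 slave13; do
    scp /usr/local/spark/jars/slf4j-*.jar "$h":/usr/local/spark/jars/
  done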
  # start again
  [root@master11 ~]# /usr/local/spark/sbin/start-all.sh
  starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-master11.out
  slave12: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave12.out
  slave13: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave13.out
  # check the master web UI on port 8082 (screenshot omitted)
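Beyond the web UI, a standard end-to-end check is to submit the bundled SparkPi example to the standalone master. A sketch; the examples jar name may differ slightly in your distribution:

  # submit the SparkPi example job to the standalone cluster
  spark-submit --master spark://master11:7077 \
    --class org.apache.spark.examples.SparkPi \
    /usr/local/spark/examples/jars/spark-examples_2.12-3.5.1.jar 10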

III Integrating Spark with Hive

1 Copy the configuration files and the MySQL driver

  cp /data/hive/conf/hive-site.xml /usr/local/spark/conf/
  cp /data/hadoop/etc/hadoop/hdfs-site.xml /usr/local/spark/conf/
  cp /data/hadoop/etc/hadoop/core-site.xml /usr/local/spark/conf/
  cp /data/hive/lib/mysql-connector-java-8.0.29.jar /usr/local/spark/jars/
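Spark discovers the Hive metastore through the copied hive-site.xml. A quick sanity check that the relevant property made it across (the property name is standard Hive; whether it is set depends on whether your metastore is remote or accessed directly over JDBC):

  # confirm the metastore URI is visible to Spark
  grep -A 1 'hive.metastore.uris' /usr/local/spark/conf/hive-site.xml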

2 Log in to Hive and create a test table

  hive
  create database testdb;
  use testdb;
  create table test(id int,name string) row format delimited fields terminated by ',';
  # create a test data file
  cat /root/test.csv
  1,lucy
  2,lili
  # load the data
  load data local inpath '/root/test.csv' overwrite into table test;
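As a quick check from the shell that the rows landed, a one-off query can be run non-interactively; a sketch, assuming your hive launcher accepts -e as in earlier releases:

  # verify the loaded rows from the command line
  hive -e "select * from testdb.test;"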

3 Start spark-sql

  spark-sql --master spark://master11:7077 --executor-memory 512m --total-executor-cores 2 --driver-class-path /usr/local/spark/jars/mysql-connector-java-8.0.29.jar
  spark-sql (default)> show databases;
  namespace
  default
  testdb
  Time taken: 2.918 seconds, Fetched 2 row(s)
  spark-sql (default)> use testdb;
  Response code
  Time taken: 0.478 seconds
  spark-sql (testdb)> show tables;
  namespace tableName isTemporary
  test
  Time taken: 0.454 seconds, Fetched 1 row(s)
  spark-sql (testdb)> select * from test;
  id name
  1 lucy
  2 lili
  Time taken: 4.126 seconds, Fetched 2 row(s)
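The same queries can also be run non-interactively, which is handy for scripting. A sketch using spark-sql -e with the same master and driver classpath as above:

  # run a single query against the Hive-backed table and exit
  spark-sql --master spark://master11:7077 \
    --driver-class-path /usr/local/spark/jars/mysql-connector-java-8.0.29.jar \
    -e "select count(*) from testdb.test;"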
