
Deploying Single-Node Spark with Ambari

Single-Node Ambari (HDP) Deployment

Reference: "Building a HAWQ Data Warehouse with CentOS 7 + Ambari 2.7.0 + HDP 3.0, Part 01: preparing the environment, building a local repository, installing Ambari" (柒零壹, cnblogs)

First, set the server's hostname to master01.ambari.com.
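A minimal sketch of the hostname change (the IP address below is a placeholder; substitute the server's real address):

hostnamectl set-hostname master01.ambari.com
# Ambari expects the FQDN to resolve, so map it in /etc/hosts as well:
echo "192.168.1.10  master01.ambari.com master01" >> /etc/hosts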

I. Environment Preparation

OS: CentOS 7 x86_64 (1804)

Ambari version: 2.7.0

HDP version: 3.0.0

HAWQ version: 2.3.0

II. Install Prerequisites (the yum priorities plugin)

yum install yum-plugin-priorities -y

III. Set Up a Local Repository

1. Download the packages:

mkdir -p /home/downloads;cd /home/downloads

wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.0.0.0/HDP-GPL-3.0.0.0-centos7-gpl.tar.gz

wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/hdp.repo

wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/HDP-3.0.0.0-1634.xml

wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.0.0/ambari.repo

wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz

wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/HDP-3.0.0.0-centos7-rpm.tar.gz

wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.0.0/ambari-2.7.0.0-centos7.tar.gz

2. Install and start the Apache HTTP server:

        yum install httpd -y

        systemctl enable httpd

        systemctl start httpd

           mkdir -p /var/www/html

3. Create the hdp and hdf subdirectories:

        cd /var/www/html/;mkdir hdp  hdf

4. Extract the downloaded packages:

cd /var/www/html

tar -zxvf /home/downloads/ambari-2.7.0.0-centos7.tar.gz -C .

tar -zxvf /home/downloads/HDP-3.0.0.0-centos7-rpm.tar.gz -C ./hdp

tar -zxvf /home/downloads/HDP-GPL-3.0.0.0-centos7-gpl.tar.gz -C ./hdp

tar -zxvf /home/downloads/HDP-UTILS-1.1.0.22-centos7.tar.gz -C ./hdp

5. Copy the downloaded ambari.repo into /etc/yum.repos.d/ and point it at the local repository:

cp /home/downloads/ambari.repo /etc/yum.repos.d/; cd /etc/yum.repos.d/; vim ambari.repo

#VERSION_NUMBER=2.7.0.0-897

[ambari-2.7.0.0]
#json.url = http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
name=ambari Version - ambari-2.7.0.0
baseurl=http://master01.ambari.com/ambari/centos7/2.7.0.0-897
gpgcheck=1
gpgkey=http://master01.ambari.com/ambari/centos7/2.7.0.0-897/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

6. Copy the downloaded hdp.repo into /etc/yum.repos.d/ and edit it the same way:

cp /home/downloads/hdp.repo /etc/yum.repos.d/; cd /etc/yum.repos.d/; vim hdp.repo

#VERSION_NUMBER=3.0.0.0-1634
[HDP-3.0]
name=HDP Version - HDP-3.0.0.0
baseurl=http://master01.ambari.com/hdp/HDP/centos7/3.0.0.0-1634
gpgcheck=1
gpgkey=http://master01.ambari.com/hdp/HDP/centos7/3.0.0.0-1634/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-3.0-GPL]
name=HDP GPL Version - HDP-GPL-3.0.0.0
baseurl=http://master01.ambari.com/hdp/HDP-GPL/centos7/3.0.0.0-1634
gpgcheck=1
gpgkey=http://master01.ambari.com/hdp/HDP-GPL/centos7/3.0.0.0-1634/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
baseurl=http://master01.ambari.com/hdp/HDP-UTILS/centos7/1.1.0.22
gpgcheck=1
gpgkey=http://master01.ambari.com/hdp/HDP-UTILS/centos7/1.1.0.22/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
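With both repo files in place, a quick sanity check (not part of the original steps) confirms that yum resolves the local repositories:

yum clean all
yum repolist
# ambari-2.7.0.0, HDP-3.0, HDP-3.0-GPL and HDP-UTILS-1.1.0.22 should all be listed.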

IV. Install Ambari Server on the Master Node

1. Install directly with yum from the local repository configured above:

        yum install ambari-server -y

V. Install MySQL

1. Installation steps omitted.

2. Create the databases and users:

use mysql;

grant all privileges on *.* to 'ambari'@'%' identified by 'ambari';

grant all privileges on *.* to 'ambari'@'localhost' identified by 'ambari';

flush privileges;

create database ambari;

use ambari;

source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql;

create database hive;

grant all privileges on *.* to 'hive'@'%' identified by 'root';

grant all privileges on *.* to 'hive'@'localhost' identified by 'root';

3. Place mysql-connector-java.jar in /usr/share/java/.
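One way to get the driver there, assuming the stock CentOS 7 package is acceptable (alternatively, copy a jar downloaded from MySQL by hand):

yum install mysql-connector-java -y
# The package installs the jar at /usr/share/java/mysql-connector-java.jar
ls /usr/share/java/mysql-connector-java.jar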

VI. Configure Ambari Server

1. Run ambari-server setup and follow the prompts; answer yes to advanced database configuration and select the existing MySQL database, supplying the ambari database, user, and password created above.

2. Register the JDBC driver:

ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar

VII. Start Ambari Server

systemctl enable ambari-server

systemctl start ambari-server

VIII. Install ambari-agent on All Host Nodes and Enable Autostart

yum install ambari-agent -y 

systemctl enable ambari-agent 

systemctl restart ambari-agent && systemctl status ambari-agent
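If the agent fails to register with the server, point it at the Ambari host explicitly (the default is localhost, which happens to work on a single-node install):

sed -i 's/^hostname=.*/hostname=master01.ambari.com/' /etc/ambari-agent/conf/ambari-agent.ini
systemctl restart ambari-agent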

IX. Log In to the Web UI and Configure the Cluster

    master01.ambari.com:8080  (default login: admin / admin)

X. Troubleshooting

1. Spark 2.3 cannot read Hive data

In HDP 3.0, Spark and Hive keep separate metastore catalogs, so Spark 2.3 does not see Hive 3.x tables out of the box. Reference: spark 无法读取hive 3.x的表数据 (往溪涧, CSDN blog).
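One commonly cited workaround, stated here as an assumption to verify against the linked post and your cluster, is to point Spark at the Hive catalog instead of its default 'spark' catalog:

spark-sql --conf spark.hadoop.metastore.catalog.default=hive
# This exposes Hive external tables only; managed ACID tables still need the Hive Warehouse Connector.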

Scheduled Big-Data Processing Workflow (structured data)

  • Import the data into HDFS with Kettle or Sqoop (see the Sqoop sketch below)
  • Load the data into Hive
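A minimal Sqoop import sketch; the connection string, credentials, table, and target directory are placeholders for illustration:

sqoop import \
  --connect jdbc:mysql://127.0.0.1:3306/bigDataTest \
  --username root --password root \
  --table SPARKSQL_TEST \
  --target-dir /user/Administrator/sparksql_test \
  -m 1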

e.g. 1. Write the LOAD DATA statement into /home/shell_file/sql.txt:

vim /home/shell_file/sql.txt

load data inpath "/user/Administrator/zhao.txt" overwrite into table bb;

2. Add a cron job (specifying the Hive username, password, and script file) so the import runs on a schedule (or use Azkaban for job scheduling):

0 0 * * *  hive -n hive -p hive -f /home/shell_file/sql.txt

# or:

0 0 * * *  spark-sql -f /home/shell_file/sql.txt

  • Use spark-sql to query the data and write the result to MySQL

e.g. 1. Create /home/shell_file/sparksession_mysql.py:

from pyspark.sql import SparkSession

# SparkSession with Hive support so Hive tables are visible to Spark SQL
spark = SparkSession.builder.appName("DataFrame").enableHiveSupport().getOrCreate()

url = "jdbc:mysql://127.0.0.1:3306/bigDataTest?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=UTF-8"
# url = 'jdbc:oracle:thin:@10.12.4.136:1521:dzjg'
tablename = 'SPARKSQL_TEST'
properties = {"user": "root", "password": "root"}

# Read the existing MySQL table over JDBC and expose it to Spark SQL
df = spark.read.jdbc(url=url, table=tablename, properties=properties)
df.createOrReplaceTempView("test")
spark.sql("select * from test").show()

# Aggregate the Hive table, then overwrite the MySQL table with the result
# (a temp view cannot be the target of INSERT OVERWRITE, so write via JDBC)
df2 = spark.sql("select xfsh, xfmc, sum(hsje) as hsje, from_unixtime(unix_timestamp()) as ts from osp_invoice_detail group by xfsh, xfmc")
df2.write.jdbc(url=url, table=tablename, mode="overwrite", properties=properties)

spark.stop()

2. Add a cron job so the script runs on a schedule (or use Azkaban for job scheduling); note that * * * * * fires every minute, so use 0 0 * * * for a daily run:

* * * * * /usr/hdp/3.0.0.0-1634/spark2/bin/spark-submit --master yarn --name spark01 /home/shell_file/sparksession_mysql.py
