Ambari (HDP) Single-Node Deployment
Reference: 《基于【CentOS-7+ Ambari 2.7.0 + HDP 3.0】搭建HAWQ数据仓库01 —— 准备环境,搭建本地仓库,安装ambari》 by 柒零壹, 博客园 (cnblogs).
First, set the server hostname to master01.ambari.com; every repository URL below uses this name. A minimal example follows.
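A minimal sketch for CentOS 7, assuming systemd's hostnamectl; the IP below is a placeholder for this host's real address:
hostnamectl set-hostname master01.ambari.com
# placeholder IP -- replace with the server's actual address
echo "192.168.1.10  master01.ambari.com" >> /etc/hosts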
I. Environment Preparation
OS: CentOS 7 x86_64 (build 1804)
Ambari version: 2.7.0
HDP version: 3.0.0
HAWQ version: 2.3.0
II. Install Prerequisites (the yum priorities plugin)
yum install yum-plugin-priorities -y
III. Build the Local Repository
1. Download the packages:
mkdir -p /home/downloads;cd /home/downloads
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.0.0.0/HDP-GPL-3.0.0.0-centos7-gpl.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/hdp.repo
wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/HDP-3.0.0.0-1634.xml
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.0.0/ambari.repo
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/HDP-3.0.0.0-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.0.0/ambari-2.7.0.0-centos7.tar.gz
2. Install and start the Apache HTTP server:
yum install httpd -y
systemctl enable httpd
systemctl start httpd
mkdir -p /var/www/html
3. Create the hdp and hdf subdirectories:
cd /var/www/html/;mkdir hdp hdf
4. Extract the downloaded packages:
cd /var/www/html
tar -zxvf /home/downloads/ambari-2.7.0.0-centos7.tar.gz -C .
tar -zxvf /home/downloads/HDP-3.0.0.0-centos7-rpm.tar.gz -C ./hdp
tar -zxvf /home/downloads/HDP-GPL-3.0.0.0-centos7-gpl.tar.gz -C ./hdp
tar -zxvf /home/downloads/HDP-UTILS-1.1.0.22-centos7.tar.gz -C ./hdp
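A quick sanity check that httpd is now serving the extracted trees (hostname from the first step; directory listings rely on Apache's default autoindex):
curl -s http://master01.ambari.com/ambari/centos7/ | head
curl -s http://master01.ambari.com/hdp/HDP/centos7/ | head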
5. Copy the downloaded ambari.repo into /etc/yum.repos.d/ and edit it to point at the local server:
cp /home/downloads/ambari.repo /etc/yum.repos.d/; vim /etc/yum.repos.d/ambari.repo
#VERSION_NUMBER=2.7.0.0-897
[ambari-2.7.0.0]
#json.url = http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
name=ambari Version - ambari-2.7.0.0
baseurl=http://master01.ambari.com/ambari/centos7/2.7.0.0-897
gpgcheck=1
gpgkey=http://master01.ambari.com/ambari/centos7/2.7.0.0-897/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
6. Copy the downloaded hdp.repo into /etc/yum.repos.d/ and edit it the same way:
cp /home/downloads/hdp.repo /etc/yum.repos.d/; vim /etc/yum.repos.d/hdp.repo
#VERSION_NUMBER=3.0.0.0-1634
[HDP-3.0]
name=HDP Version - HDP-3.0.0.0
baseurl=http://master01.ambari.com/hdp/HDP/centos7/3.0.0.0-1634
gpgcheck=1
gpgkey=http://master01.ambari.com/hdp/HDP/centos7/3.0.0.0-1634/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-3.0-GPL]
name=HDP GPL Version - HDP-GPL-3.0.0.0
baseurl=http://master01.ambari.com/hdp/HDP-GPL/centos7/3.0.0.0-1634
gpgcheck=1
gpgkey=http://master01.ambari.com/hdp/HDP-GPL/centos7/3.0.0.0-1634/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
baseurl=http://master01.ambari.com/hdp/HDP-UTILS/centos7/1.1.0.22
gpgcheck=1
gpgkey=http://master01.ambari.com/hdp/HDP-UTILS/centos7/1.1.0.22/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
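With both repo files in place, refresh the yum metadata and confirm all four local repositories resolve:
yum clean all
yum repolist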
IV. Install Ambari Server on the Master Node
1. With the local repository configured above, install directly via yum:
yum install ambari-server -y
V. Install the MySQL Database
1. Installation steps are omitted here; one minimal option is sketched below.
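A minimal sketch, assuming CentOS 7's bundled MariaDB (MySQL-compatible and sufficient for Ambari) rather than upstream MySQL:
yum install mariadb-server -y
systemctl enable mariadb
systemctl start mariadb
mysql_secure_installation   # interactive; sets the root password used when running the SQL below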
2. Create the Ambari and Hive databases and users:
use mysql;
-- Ambari user (password 'ambari') for the Ambari server
grant all privileges on *.* to 'ambari'@'%' identified by 'ambari';
grant all privileges on *.* to 'ambari'@'localhost' identified by 'ambari';
flush privileges;
create database ambari;
use ambari;
-- load the schema shipped with ambari-server
source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql;
-- Hive metastore database and user (password 'root')
create database hive;
grant all privileges on *.* to 'hive'@'%' identified by 'root';
grant all privileges on *.* to 'hive'@'localhost' identified by 'root';
3. Place mysql-connector-java.jar into /usr/share/java/, as shown below.
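For example, assuming the connector jar was downloaded to /home/downloads (a hypothetical location; adjust to wherever the jar actually is):
mkdir -p /usr/share/java
cp /home/downloads/mysql-connector-java.jar /usr/share/java/
chmod 644 /usr/share/java/mysql-connector-java.jar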
VI. Configure Ambari Server
1. Run ambari-server setup and follow the interactive prompts (JDK selection, database configuration).
2. Register the MySQL JDBC driver:
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
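The interactive choices can also be passed as flags for an unattended run; a sketch assuming the database created above (verify the flag names against ambari-server setup --help for this release):
ambari-server setup -s \
  --database=mysql --databasehost=localhost --databaseport=3306 \
  --databasename=ambari --databaseusername=ambari --databasepassword=ambari \
  --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar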
VII. Start ambari-server
systemctl enable ambari-server
systemctl start ambari-server
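Before moving on, confirm the server process came up:
ambari-server status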
VIII. Install ambari-agent on All Cluster Nodes and Enable It at Boot
yum install ambari-agent -y
systemctl enable ambari-agent
systemctl restart ambari-agent && systemctl status ambari-agent
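On any agent host other than the server itself, point the agent at master01.ambari.com before restarting it; /etc/ambari-agent/conf/ambari-agent.ini is the stock config path:
sed -i 's/^hostname=.*/hostname=master01.ambari.com/' /etc/ambari-agent/conf/ambari-agent.ini
systemctl restart ambari-agent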
IX. Log In to the Web UI to Finish Configuration
Browse to http://master01.ambari.com:8080 and log in (default credentials: admin/admin) to launch the cluster install wizard.
X. Troubleshooting
1. Spark 2.3 cannot read Hive data. In HDP 3.0, Hive 3 managed tables are transactional (ACID) and live in a catalog that Spark does not read by default, so Spark sees the table as empty or missing.
Reference: 《spark 无法读取hive 3.x的表数据》 by 往溪涧, CSDN blog.
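One commonly reported workaround for external (non-ACID) tables, stated here as an assumption about HDP 3.0 behavior rather than a summary of the linked post, is to point Spark's metastore client at the Hive catalog; ACID managed tables additionally require the Hive Warehouse Connector:
# Ambari > Spark2 > Configs > Advanced spark2-hive-site-override
metastore.catalog.default=hive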
Scheduled Big-Data Processing Workflow (Structured Data)
Example 1: Write the load statement into /home/shell_file/sql.txt:
vim /home/shell_file/sql.txt
load data inpath "/user/Administrator/zhao.txt" overwrite into table bb;
2. Add a cron job (passing the Hive username, password, and script file) so the load runs on a schedule, or use a scheduler such as Azkaban; installing the crontab entry is shown after these lines:
0 0 * * * hive -n hive -p hive -f /home/shell_file/sql.txt
# or, equivalently, via spark-sql:
0 0 * * * spark-sql -f /home/shell_file/sql.txt
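Either line goes into the crontab of a user allowed to run Hive jobs; a minimal way to install and verify it:
crontab -e    # paste one of the lines above; 0 0 * * * runs daily at midnight
crontab -l    # confirm the entry was saved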
Example 2: Create /home/shell_file/sparksession_mysql.py, which reads a MySQL table into Spark and aggregates a Hive table:
from pyspark.sql import SparkSession

# Hive support lets spark.sql() see tables in the Hive metastore
spark = SparkSession.builder.appName("DataFrame").enableHiveSupport().getOrCreate()

# JDBC connection details for the source MySQL database
url = "jdbc:mysql://127.0.0.1:3306/bigDataTest?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=UTF-8"
# url = 'jdbc:oracle:thin:@10.12.4.136:1521:dzjg'  # Oracle alternative
tablename = 'SPARKSQL_TEST'
properties = {"user": "root", "password": "root"}

# Read the MySQL table into a DataFrame and expose it as a temp view
df = spark.read.jdbc(url=url, table=tablename, properties=properties)
df.createOrReplaceTempView("test")
df2 = spark.sql("select * from test")

# Aggregate the Hive table osp_invoice_detail and write the result into table test
spark.sql("insert overwrite table test select xfsh,xfmc,sum(hsje),from_unixtime(unix_timestamp()) from osp_invoice_detail group by xfsh,xfmc")

spark.stop()
2. Add a cron job to run the script on a schedule (or use Azkaban for job scheduling):
* * * * * /usr/hdp/3.0.0.0-1634/spark2/bin/spark-submit --master yarn --name spark01 /home/shell_file/sparksession_mysql.py