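Kylin 4.0 is a major architectural overhaul: HBase is gone, and Parquet files on HDFS are now the underlying storage layer, so rowkey encoding of metrics is no longer needed.
The build and query engines are unified on Spark, with Spark 3.1 supported, so cube builds and queries run roughly twice as fast as on Kylin 3.0, and stability is much better too.
The stored cube data takes about half the space it did on HBase.
Simple queries perform about the same as on 3.x, but complex queries are several times faster (thanks to Parquet and directory-partition filtering).
In short, now that a stable Kylin 4.0 release is out, there is no reason not to upgrade.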
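Official documentation: https://kylin.apache.org/cn/docs/
See the supported versions there. Why Hadoop 2.10 and not 3.x? Because I tried it, and Kylin is not compatible with Hive 3.x! The official Kylin environment supports Hive only up to 2.3.9, and Hive 2.x pairs with Hadoop 2.x.
Download the current latest release, apache-kylin-4.0.0-bin-spark3.tar.gz
Extract it into the installation directory on the server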
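The extraction step can be sketched as a small shell helper. This is only an illustration: the function name extract_release is made up here, and /opt is assumed as the installation directory because that is where KYLIN_HOME points later in this post.

```shell
# Hypothetical helper: unpack a release tarball into an installation directory.
# $1 = release tarball, $2 = installation directory (created if missing)
extract_release() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

Usage, assuming the tarball sits in the current directory: extract_release apache-kylin-4.0.0-bin-spark3.tar.gz /opt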
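All of the software and services above must be installed and deployed beforehand; I won't go into that here.
Set the KYLIN_HOME environment variable; HBASE_HOME can be left out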
/etc/profile
export JAVA_HOME=/opt/jdk1.8.0_301
export MAVEN_HOME=/opt/apache-maven-3.8.2
export SCALA_HOME=/opt/scala-2.12.14
export HADOOP_HOME=/opt/hadoop-2.10.1
export HIVE_HOME=/opt/apache-hive-2.3.9-bin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/opt/spark-3.1.2-bin-hadoop2.7
export KYLIN_HOME=/opt/apache-kylin-4.0.0-bin-spark3
export HBASE_HOME=/opt/hbase-2.2.3
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin:$KYLIN_HOME/bin:$HBASE_HOME/bin
source /etc/profile
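After sourcing the profile, it can help to confirm that each variable is set and points at a real directory. The following is a minimal sketch in POSIX shell; the function name check_homes is hypothetical and not part of Kylin.

```shell
# Hypothetical helper: report variables that are unset or point at a missing
# directory. Prints one line per problem and returns 1 if any was found.
# Note: it uses the plain (global) shell variables status, name and value.
check_homes() {
  status=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      status=1
    fi
  done
  return "$status"
}
```

Usage: check_homes JAVA_HOME HADOOP_HOME HIVE_HOME SPARK_HOME KYLIN_HOME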
Start the services Kylin depends on
Configure Kylin
vim $KYLIN_HOME/conf/kylin.properties
h1 is the hostname of my virtual machine
# Kylin metadata store in MySQL
kylin.metadata.url=kylin_metadata@jdbc,url=jdbc:mysql://h1:3306/kylin,username=hive,password=hive,maxActive=10,maxIdle=10
# ZooKeeper
kylin.env.zookeeper-connect-string=h1
kylin.server.cluster-servers=h1:7070
# Default resources for the build engine
kylin.engine.spark-conf.spark.master=yarn
kylin.engine.spark-conf.spark.submit.deployMode=client
kylin.engine.spark-conf.spark.yarn.queue=default
kylin.engine.spark-conf.spark.executor.cores=1
kylin.engine.spark-conf.spark.executor.memory=512M
kylin.engine.spark-conf.spark.executor.instances=1
kylin.engine.spark-conf.spark.executor.memoryOverhead=256M
kylin.engine.spark-conf.spark.driver.cores=1
kylin.engine.spark-conf.spark.driver.memory=512M
kylin.engine.spark-conf.spark.driver.memoryOverhead=256M
# Default resources for the query engine
kylin.query.auto-sparder-context-enabled-enabled=true
kylin.query.sparder-context.app-name=kylin_query
kylin.query.spark-conf.spark.master=yarn
kylin.query.spark-conf.spark.submit.deployMode=client
kylin.query.spark-conf.spark.yarn.queue=default
kylin.query.spark-conf.spark.driver.cores=1
kylin.query.spark-conf.spark.driver.memory=512M
kylin.query.spark-conf.spark.driver.memoryOverhead=256M
kylin.query.spark-conf.spark.executor.cores=1
kylin.query.spark-conf.spark.executor.instances=1
kylin.query.spark-conf.spark.executor.memory=1G
kylin.query.spark-conf.spark.executor.memoryOverhead=256M
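To double-check what actually ended up in kylin.properties after editing, a small lookup helper can be handy. This is a sketch, not a Kylin tool: get_prop is a hypothetical name, and it assumes one key=value pair per line with # comment lines, where the last occurrence of a key wins.

```shell
# Hypothetical helper: print the value of a key from a properties file.
# $1 = properties file, $2 = key. Comment lines are skipped, the value is
# everything after the first "=", and the last matching line wins.
get_prop() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }
    {
      pos = index($0, "=")
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      if (k == key) value = substr($0, pos + 1)
    }
    END { if (value != "") print value }
  ' "$1"
}
```

Usage: get_prop $KYLIN_HOME/conf/kylin.properties kylin.metadata.url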
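Upload the mysql-connector-java-8.0.26.jar driver to the $KYLIN_HOME/ext/ directory (create the directory yourself if it does not exist); the jar can be downloaded from the Maven Central repository page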
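Placing the driver can be sketched like this. The function name install_jdbc_driver is hypothetical, and the sketch assumes the jar has already been downloaded to a local path and that KYLIN_HOME is set as in /etc/profile above.

```shell
# Hypothetical helper: copy an already-downloaded JDBC driver jar into
# $KYLIN_HOME/ext, creating the directory when it does not exist yet.
# $1 = path to the driver jar
install_jdbc_driver() {
  jar="$1"
  if [ -z "${KYLIN_HOME:-}" ]; then
    echo "KYLIN_HOME is not set" >&2
    return 1
  fi
  if [ ! -f "$jar" ]; then
    echo "driver jar not found: $jar" >&2
    return 1
  fi
  mkdir -p "$KYLIN_HOME/ext" && cp "$jar" "$KYLIN_HOME/ext/"
}
```

Usage: install_jdbc_driver ./mysql-connector-java-8.0.26.jar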
In MySQL, create the database named in kylin.metadata.url, then create the user and grant it privileges on that database; I won't go into the details here
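For reference, the MySQL side might look roughly like the statements generated below. The database name and credentials (kylin, hive, hive) come from the kylin.metadata.url used in this post; the '%' host pattern and the function name kylin_metadata_sql are assumptions made for illustration, so tighten the host to match your own setup.

```shell
# Hypothetical helper: print the SQL that creates the metadata database and
# a user with full privileges on it. Nothing is executed here.
# $1 = database name, $2 = user, $3 = password
kylin_metadata_sql() {
  cat <<SQL
CREATE DATABASE IF NOT EXISTS \`$1\`;
CREATE USER IF NOT EXISTS '$2'@'%' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON \`$1\`.* TO '$2'@'%';
SQL
}
```

Usage, assuming a MySQL account that is allowed to create users: kylin_metadata_sql kylin hive hive | mysql -h h1 -u root -p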
Environment check: run the script $KYLIN_HOME/bin/check-env.sh
rose@h1:/opt/apache-kylin-4.0.0-bin-spark3/ext