Log in to ZooKeeper and check whether the znode /atsv2-hbase-secure/meta-region-server exists.
su - zookeeper
kinit -kt /etc/security/keytabs/zk.service.keytab zookeeper/bg6.test.com.cn@HADOOP.COM
sh /usr/hdp/3.1.0.0-78/zookeeper/bin/zkCli.sh -server bg6.test.com.cn:2181
Listing the ZooKeeper root shows that there is no atsv2-hbase-secure znode at all, so the error is inevitable.
[zk: bg6.test.com.cn:2181(CONNECTED) 0] ls /
[hive, cluster, brokers, infra-solr, kafka-acl, kafka-acl-changes, admin, isr_change_notification, log_dir_event_notification, kafka-acl-extended, rmstore, kafka-acl-extended-changes, consumers, latest_producer_id_block, hbase, registry, controller, zookeeper, delegation_token, hiveserver2, controller_epoch, hiveserver2-leader, kafka-manager, ambari-metrics-cluster, apache_atlas, config, kylin]
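With this many children under the root, it is easy to misread the list. A minimal sketch of a filter for it, assuming bash and that the bracketed line printed by `ls /` has been copied or piped in as-is; `has_znode` is a hypothetical helper name, not part of zkCli:

```shell
# has_znode NAME
# Reads the bracketed child list printed by zkCli's `ls` on stdin and
# succeeds (exit status 0) only if NAME is one of the children.
has_znode() {
    local name="$1"
    # Drop the brackets, put one child per line, trim spaces,
    # then require an exact whole-line match.
    tr -d '[]' | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -qxF -- "$name"
}
```

For example, `echo '[hive, hbase, zookeeper]' | has_znode atsv2-hbase-secure || echo "znode missing"` prints `znode missing`.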
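What confused me about the article "Some problems encountered with an Ambari cluster" is its step hdfs dfs -mv /atsv2/hbase/tmp/
: that path does not exist on this cluster.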
su - yarn
kinit -kt /etc/security/keytabs/yarn.service.keytab yarn/bg3.test.com.cn@HADOOP.COM
yarn app -list
-bash-4.2$ yarn app -list
21/02/03 11:11:39 INFO client.RMProxy: Connecting to ResourceManager at bg4.test.com.cn/10.128.2.171:8050
21/02/03 11:11:39 INFO client.AHSProxy: Connecting to Application History server at bg3.test.com.cn/10.128.2.121:10200
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):2
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1611552872177_1446 SparkSQL::10.128.2.210 SPARK hbase default RUNNING UNDEFINED 10% http://bg1.test.com.cn:4041
application_1611552872177_0144 Thrift JDBC/ODBC Server SPARK spark default RUNNING UNDEFINED 10% http://bg1.test.com.cn:4040
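The list above contains no ats-hbase application. Switching to the yarn-ats user gives the same result, as shown below: still no ats-hbase.
So the procedure in "Remove ats-hbase before switching between clusters" cannot be applied either. The destroy step has nothing to act on, because there is no ats-hbase application.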
su - yarn-ats
kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab yarn-ats-test_data@HADOOP.COM
[yarn-ats@bg7 ~]$ yarn app -list
21/02/03 11:21:44 INFO client.RMProxy: Connecting to ResourceManager at bg4.test.com.cn/10.128.2.171:8050
21/02/03 11:21:44 INFO client.AHSProxy: Connecting to Application History server at bg3.test.com.cn/10.128.2.121:10200
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):2
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1611552872177_1446 SparkSQL::10.128.2.210 SPARK hbase default RUNNING UNDEFINED 10% http://bg1.test.com.cn:4041
application_1611552872177_0144 Thrift JDBC/ODBC Server SPARK spark default RUNNING UNDEFINED 10% http://bg1.test.com.cn:4040
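Both listings were checked by eye. A minimal sketch of the same check as a filter, assuming bash and awk; `has_yarn_app` is a hypothetical helper, and it only compares the second column, so it works for names without spaces such as ats-hbase:

```shell
# has_yarn_app NAME
# Reads `yarn app -list` output on stdin and succeeds only if some
# application row has NAME in the Application-Name column.
# Limitation: a name containing spaces is split into several fields,
# so only its first word would be compared.
has_yarn_app() {
    awk -v name="$1" '
        $1 ~ /^application_/ && $2 == name { found = 1 }
        END { exit !found }
    '
}
```

Typical use: `yarn app -list 2>/dev/null | has_yarn_app ats-hbase || echo "ats-hbase is not running"`.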
Next, run:
su - yarn-ats
kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab yarn-ats-test_data@HADOOP.COM
hdfs dfs -rm -R ./3.1.0.0-78/*
hdfs dfs -ls ./3.1.0.0-78/*
exit
su - hdfs
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-test_data@HADOOP.COM
hadoop fs -rm -R /services/sync/yarn-ats/hbase.yarnfile
Reference: "Ambari HDFS startup errors: notes on some problems encountered when starting an Ambari environment"
I changed a few settings here, switching the znode used by YARN from /atsv2-hbase-secure
to HBase's /hbase.
Making this change directly causes problems.
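According to "Yarn timeline service v2.0 starts successfully but querying logs fails with AbstractChannel$AnnotatedConnectException: Connection refused", the configuration below is wrong, because yarn-ats should use its internal HBase rather than an external one.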
use_external_hbase=true
is_hbase_system_service_launch=true
Run:
su - hdfs
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-test_data@HADOOP.COM
hdfs dfs -rm -r /atsv2
hdfs dfs -ls /atsv2
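The `-ls` call above confirms the removal by producing an error. For scripting, `hdfs dfs -test -e` reports the same thing through its exit status. A minimal sketch, assuming bash and a valid Kerberos ticket; `hdfs_path_absent` is a hypothetical helper name:

```shell
# hdfs_path_absent PATH
# Succeeds if PATH does not exist in HDFS, fails if it is still there.
# Caveat: `hdfs dfs -test -e` also returns non-zero on other failures
# (for example an expired ticket), which this sketch reports as "absent".
hdfs_path_absent() {
    if hdfs dfs -test -e "$1"; then
        echo "still present: $1"
        return 1
    fi
    echo "absent: $1"
}
```

Example: `hdfs_path_absent /atsv2`.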
Following "Configure External HBase for Timeline Service 2.0", set use_external_hbase
to true.
tail -fn100 /appdata/home/hadoop/var/log/hadoop-yarn/yarn/hadoop-yarn-resourcemanager-bg4.test.com.cn.log
It is also worth reading "Timeline Service 2.0 in HDP" and
"Timeline Service v.2 (HDP 3.1): parameter configuration and related environment".
su - hbase
kinit -kt /etc/security/keytabs/hbase.headless.keytab hbase-test_data@HADOOP.COM
hbase --config /etc/hadoop/3.1.0.0-78/0/embedded-yarn-ats-hbase shell
When everything is healthy, the shell shows:
TABLE
prod.timelineservice.app_flow
prod.timelineservice.application
prod.timelineservice.entity
prod.timelineservice.flowactivity
prod.timelineservice.flowrun
prod.timelineservice.subapplication
6 row(s)
Took 0.0257 seconds
=> ["prod.timelineservice.app_flow", "prod.timelineservice.application", "prod.timelineservice.entity", "prod.timelineservice.flowactivity", "prod.timelineservice.flowrun", "prod.timelineservice.subapplication"]
When something is wrong, it shows:
ERROR: KeeperErrorCode = NoNode for /atsv2-hbase-secure/master
Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The
default is 'summary'. Examples:
hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
hbase> status 'replication'
hbase> status 'replication', 'source'
hbase> status 'replication', 'sink'
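For the healthy case, the six table names can be checked mechanically instead of by reading the listing. A minimal sketch, assuming bash, that the HBase shell table listing has been saved or piped in, and that the tables use the `prod` prefix seen in this cluster; `missing_timeline_tables` is a hypothetical helper:

```shell
# missing_timeline_tables [PREFIX]
# Reads an HBase shell table listing on stdin, prints every expected
# Timeline Service table that is not in it, and returns 1 if any is missing.
# PREFIX defaults to "prod", the schema prefix seen in this cluster.
missing_timeline_tables() {
    local prefix="${1:-prod}" listing table missing=0
    # Trim surrounding whitespace so whole-line matching is reliable.
    listing="$(sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
    for table in app_flow application entity flowactivity flowrun subapplication; do
        if ! printf '%s\n' "$listing" | grep -qxF -- "${prefix}.timelineservice.${table}"; then
            echo "${prefix}.timelineservice.${table}"
            missing=1
        fi
    done
    return "$missing"
}
```

No output and exit status 0 means all six tables are present.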
With is_hbase_system_service_launch
and use_external_hbase
both set to false, the following exception is reported:
2021-02-03 14:54:21,077 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=36, started=38336 ms ago, cancelled=false, msg=org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-secure/meta-region-server, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null
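Every failure so far points at a missing znode, and the path is buried in a long log line. A minimal sketch that pulls the paths out of such messages, assuming bash and a grep that supports `-o`; `nonode_paths` is a hypothetical helper:

```shell
# nonode_paths
# Reads log text on stdin and prints each distinct znode path that
# appears in a "NoNode for <path>" message, one per line.
nonode_paths() {
    # The path ends at the first comma or space after it.
    grep -o 'NoNode for /[^, ]*' | sed 's/^NoNode for //' | sort -u
}
```

Fed the ResourceManager log, it reduces the repeated retries to the short list of znodes that the client could not find.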
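With that, the problem is solved. The main cause was that, in Advanced yarn-hbase-site,
hbase.master.info.port, HBase Master Port, hbase.regionserver.info.port and hbase.regionserver.port must not have the same values as in the HBase service, because from the point of view of yarn-ats
that HBase is external.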
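A port clash like this is easy to miss when comparing two Ambari configuration pages by eye. A minimal sketch of a comparison, assuming bash and awk, and assuming the port settings of each service have been copied by hand into a plain file of `property=port` lines (a made-up format for illustration, not something Ambari exports); `shared_ports` is a hypothetical helper:

```shell
# shared_ports FILE_A FILE_B
# Each file holds lines of the form property=port.
# Prints "port property_in_A property_in_B" for every port value
# that appears in both files.
shared_ports() {
    awk -F= '
        $2 == ""            { next }                    # skip lines without a value
        FILENAME == ARGV[1] { first[$2] = $1; next }    # remember ports from FILE_A
        ($2 in first)       { print $2, first[$2], $1 } # same port seen in FILE_B
    ' "$1" "$2"
}
```

Empty output means the two services do not share any port in the listed properties.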
In addition, the two settings end up as:
use_external_hbase=false
is_hbase_system_service_launch=false
[root@sh102 ~]# lsof -i:17010
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 28419 yarn-ats 570u IPv6 538518 0t0 TCP *:17010 (LISTEN)
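The point of the lsof output is the USER column: the listener belongs to yarn-ats, not to hbase. A minimal sketch that extracts just that field, assuming bash, awk, and the standard lsof column order shown above; `listener_owner` is a hypothetical helper:

```shell
# listener_owner
# Reads `lsof -i:PORT` output on stdin and prints the USER column of the
# first row that is in the LISTEN state. Prints nothing if none is found.
listener_owner() {
    awk '$1 != "COMMAND" && /\(LISTEN\)/ { print $3; exit }'
}
```

Example: `lsof -i:17010 | listener_owner` should print `yarn-ats` when the embedded HBase of the timeline service owns the port.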
Below is the yarn-hbase-site configuration in YARN,
followed by the port configuration of hbase-site in HBase.