Pitfall Notes on Deploying Flink 1.17 on Ubuntu 20.04 for Hive on Flink via the Flink SQL Gateway (Part 1)
In the blink of an eye, Flink is already at 1.17 before I have even fully figured out 1.14. The release cadence is certainly fast...
I previously wrote a post: https://lizhiyong.blog.csdn.net/article/details/128195438
about the Flink 1.16 feature showcased at FFA 2022: the Flink SQL Gateway. It is reportedly GA in the new 1.17 release, so it is worth trying out.
The mechanism is of course different from Hive on Tez; for details see these two earlier posts:
https://lizhiyong.blog.csdn.net/article/details/126634843
https://lizhiyong.blog.csdn.net/article/details/126688391
I already have the following virtual machines:
USDP dual clusters with mutual connectivity: https://lizhiyong.blog.csdn.net/article/details/123389208
zhiyong1 :192.168.88.100
zhiyong2 :192.168.88.101
zhiyong3 :192.168.88.102
zhiyong4 :192.168.88.103
zhiyong5 :192.168.88.104
zhiyong6 :192.168.88.105
zhiyong7 :192.168.88.106
Kubernetes all-in-one: https://lizhiyong.blog.csdn.net/article/details/126236516
zhiyong-ksp1 :192.168.88.20
Dev machine: zhiyong-vm-dev :192.168.88.50
Doris:https://lizhiyong.blog.csdn.net/article/details/126338539
zhiyong-doris :192.168.88.21
Clickhouse:https://lizhiyong.blog.csdn.net/article/details/126737711
zhiyong-ck1 :192.168.88.22
Docker host: https://lizhiyong.blog.csdn.net/article/details/126761470
zhiyong-docker :192.168.88.23
Win10 jump host: https://lizhiyong.blog.csdn.net/article/details/127641326
Jump host :192.168.88.25
So there is no need to build yet another cluster. A single node to play around with is enough for onboarding newcomers or letting the project's SQL-only folks practice HQL, and it can be suspended whenever it is not needed. For that kind of use a single node is very convenient; people who know nothing beyond SQL have never cared how distributed systems actually work anyway (and they do not need to).
So this VM's IP is planned as 192.168.88.24. The slot was actually reserved before the New Year; work has simply been too busy to get around to it until now.
For creating the VM, see: https://lizhiyong.blog.csdn.net/article/details/126338539
The process is basically the same as before... although Doris is already at 2.0.0: https://doris.apache.org/zh-CN/download/
One still has to keep looking forward.
Since this is an all-in-one deployment, the VM gets somewhat generous resources to avoid OOM:
Set up the network:
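The original shows this step as screenshots. As a rough sketch only (the netplan file name, the ens33 interface name, the gateway 192.168.88.2 and the DNS servers are my assumptions for a typical VMware NAT setup, not taken from the original), a static IP for this VM could be configured like this:
# minimal sketch, assuming a VMware NAT network and an ens33 interface; adjust names for your VM
sudo tee /etc/netplan/01-static-ip.yaml > /dev/null <<'EOF'
network:
  version: 2
  ethernets:
    ens33:
      dhcp4: no
      addresses: [192.168.88.24/24]
      gateway4: 192.168.88.2
      nameservers:
        addresses: [192.168.88.2, 114.114.114.114]
EOF
sudo netplan apply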
Install the necessary tools:
sudo apt install net-tools
sudo apt-get install openssh-server
sudo apt-get install openssh-client
sudo apt install vim
At this point MobaXterm can be used to connect to the VM.
Configure passwordless SSH:
zhiyong@zhiyong-hive-on-flink1:~$ sudo -su root
root@zhiyong-hive-on-flink1:/home/zhiyong# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa
Your public key has been saved in /root/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:sLPlghwaubeie5ddmWAOzWGJvA0+1nfqzunllZ1vXp4 root@zhiyong-hive-on-flink1
The key's randomart image is:
+---[RSA 3072]----+
|  .     . .      |
|   +     +       |
|  . O..          |
|  .* Bo. .       |
|  o..=ooS=       |
|   = o.== o .    |
|  o +ooo. . o o .|
|   o.o..o.+ .  o+|
|o+  o.  o= .   E+|
+----[SHA256]-----+
root@zhiyong-hive-on-flink1:/home/zhiyong# ssh-copy-id zhiyong-hive-on-flink1.17
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@zhiyong-hive-on-flink1.17's password:
Permission denied, please try again.
root@zhiyong-hive-on-flink1.17's password:
root@zhiyong-hive-on-flink1:/home/zhiyong# cat /etc/hosts
127.0.0.1       localhost
127.0.1.1       zhiyong-hive-on-flink1.17       zhiyong-hive-on-flink1

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
root@zhiyong-hive-on-flink1:/home/zhiyong# sudo vim /etc/ssh/sshd_config
:set nu
 34 #PermitRootLogin prohibit-password    # this line has to be changed before root can log in over SSH
 35 PermitRootLogin yes                   # inserted as line 35
esc
:wq
zhiyong@zhiyong-hive-on-flink1:~$ sudo su root
[sudo] zhiyong 的密码:
root@zhiyong-hive-on-flink1:/home/zhiyong# sudo passwd root
新的 密码:
重新输入新的 密码:
passwd:已成功更新密码
root@zhiyong-hive-on-flink1:/home/zhiyong# reboot
root@zhiyong-hive-on-flink1:/home/zhiyong# ssh-copy-id zhiyong-hive-on-flink1.17
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@zhiyong-hive-on-flink1.17's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'zhiyong-hive-on-flink1.17'"
and check to make sure that only the key(s) you wanted were added.

root@zhiyong-hive-on-flink1:/home/zhiyong#
Passwordless SSH is now in place.
According to the official documentation: https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/try-flink/local_installation/
Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). You need to have Java 11 installed
So JDK 1.8 is on its way out... Flink has been pushing JDK 11 since 1.15 (where Java 8 support was deprecated), mainly to get access to ZGC, which suits Flink better than older collectors: a throughput drop of roughly 15% can be compensated with about 20% more machines, and problems that a finite amount of money can solve are not really problems, but if an older GC hits a stop-the-world pause of several seconds, Flink's supposed sub-second latency is gone. ZGC targets pauses of no more than about 10 ms even on multi-terabyte heaps, which is a good fit for Flink. ZGC reached GA in JDK 15 (enabled with -XX:+UseZGC), and after 1.8 and 11 the next LTS that Oracle is pushing is 17... so JDK 17 is the future... Since no development or compilation will happen on this VM, a JRE alone would also be enough.
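As a side note on that flag: if Flink is later run on JDK 11+, ZGC can be switched on through the JVM options in flink-conf.yaml. A minimal sketch, not part of the original setup (on JDK 11 ZGC is still experimental and needs the unlock flag; on JDK 15+ only -XX:+UseZGC is required):
# sketch: enable ZGC for all Flink JVM processes via the long-standing env.java.opts key
echo 'env.java.opts: -XX:+UnlockExperimentalVMOptions -XX:+UseZGC' >> /export/server/flink-1.17.0/conf/flink-conf.yaml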
zhiyong@zhiyong-hive-on-flink1:/usr/lib/jvm/java-11-openjdk-amd64/bin$ cd
zhiyong@zhiyong-hive-on-flink1:~$ sudo apt remove openjdk-11-jre-headless
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
软件包 openjdk-11-jre-headless 未安装,所以不会被卸载
下列软件包是自动安装的并且现在不需要了:
  java-common
使用'sudo apt autoremove'来卸载它(它们)。
升级了 0 个软件包,新安装了 0 个软件包,要卸载 0 个软件包,有 355 个软件包未被升级。
zhiyong@zhiyong-hive-on-flink1:~$ java
Command 'java' not found, but can be installed with:
sudo apt install openjdk-11-jre-headless  # version 11.0.18+10-0ubuntu1~20.04.1, or
sudo apt install default-jre              # version 2:1.11-72
sudo apt install openjdk-13-jre-headless  # version 13.0.7+5-0ubuntu1~20.04
sudo apt install openjdk-16-jre-headless  # version 16.0.1+9-1~20.04
sudo apt install openjdk-17-jre-headless  # version 17.0.6+10-0ubuntu1~20.04.1
sudo apt install openjdk-8-jre-headless   # version 8u362-ga-0ubuntu1~20.04.1
With the distro JDK removed, the next step is to configure $JAVA_HOME:
zhiyong@zhiyong-hive-on-flink1:~$ sudo su root
root@zhiyong-hive-on-flink1:/home/zhiyong# mkdir -p /export/software
root@zhiyong-hive-on-flink1:/home/zhiyong# mkdir -p /export/server
root@zhiyong-hive-on-flink1:/home/zhiyong# chmod -R 777 /export/
root@zhiyong-hive-on-flink1:/home/zhiyong# cd /export/software/
root@zhiyong-hive-on-flink1:/export/software# ll
总用量 8
drwxrwxrwx 2 root root 4096 5月  14 15:55 ./
drwxrwxrwx 4 root root 4096 5月  14 15:55 ../
root@zhiyong-hive-on-flink1:/export/software# cp /home/zhiyong/jdk-17_linux-x64_bin.tar.gz /export/software/
root@zhiyong-hive-on-flink1:/export/software# ll
总用量 177472
drwxrwxrwx 2 root root      4096 5月  14 15:56 ./
drwxrwxrwx 4 root root      4096 5月  14 15:55 ../
-rw-r--r-- 1 root root 181719178 5月  14 15:56 jdk-17_linux-x64_bin.tar.gz
root@zhiyong-hive-on-flink1:/export/software# cd /export/server/
root@zhiyong-hive-on-flink1:/export/server# tar -zxvf jdk-17_linux-x64_bin.tar.gz -C /export/server/
root@zhiyong-hive-on-flink1:/export/server# ll
总用量 12
drwxrwxrwx 3 root root 4096 5月  14 15:57 ./
drwxrwxrwx 4 root root 4096 5月  14 15:55 ../
drwxr-xr-x 9 root root 4096 5月  14 15:57 jdk-17.0.7/
root@zhiyong-hive-on-flink1:/export/server# cat /etc/profile
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

if [ "${PS1-}" ]; then
  if [ "${BASH-}" ] && [ "$BASH" != "/bin/sh" ]; then
    # The file bash.bashrc already sets the default PS1.
    # PS1='\h:\w\$ '
    if [ -f /etc/bash.bashrc ]; then
      . /etc/bash.bashrc
    fi
  else
    if [ "`id -u`" -eq 0 ]; then
      PS1='# '
    else
      PS1='$ '
    fi
  fi
fi

if [ -d /etc/profile.d ]; then
  for i in /etc/profile.d/*.sh; do
    if [ -r $i ]; then
      . $i
    fi
  done
  unset i
fi

export JAVA_HOME=/export/server/jdk-17.0.7
export PATH=:$PATH:$JAVA_HOME/bin
root@zhiyong-hive-on-flink1:/export/server# java -version
java version "17.0.7" 2023-04-18 LTS
Java(TM) SE Runtime Environment (build 17.0.7+8-LTS-224)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.7+8-LTS-224, mixed mode, sharing)
root@zhiyong-hive-on-flink1:/export/server#
JDK 17 is now deployed. [However, JDK 17 still has plenty of problems here; I later switched to JDK 11.]
Grab the latest release from the official site: https://hadoop.apache.org/releases.html
Following the official documentation: https://hadoop.apache.org/docs/current/
Naturally, this will be the single-node setup: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
"Apache Hadoop 3.3 and upper supports Java 8 and Java 11 (runtime only)"...
Let's see whether it runs on JDK 17 anyway...
root@zhiyong-hive-on-flink1:/export/software# cp /home/zhiyong/hadoop-3.3.5.tar.gz /export/software/
root@zhiyong-hive-on-flink1:/export/software# tar -zxvf hadoop-3.3.5.tar.gz -C /export/server
root@zhiyong-hive-on-flink1:/export/software# cd /export/server
root@zhiyong-hive-on-flink1:/export/server# ll
总用量 16
drwxrwxrwx 4 root root 4096 5月 14 17:25 ./
drwxrwxrwx 4 root root 4096 5月 14 15:55 ../
drwxr-xr-x 10 2002 2002 4096 3月 16 00:58 hadoop-3.3.5/
drwxr-xr-x 9 root root 4096 5月 14 15:57 jdk-17.0.7/
root@zhiyong-hive-on-flink1:/export/server# cd hadoop-3.3.5/
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5/etc/hadoop# chmod 666 core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml
Edit the files directly in the Ubuntu GUI.
Modify core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.88.24:9000</value>
</property>
</configuration>
Modify hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>192.168.88.24:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.88.24:50090</value>
</property>
</configuration>
Format the NameNode:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# pwd
/export/server/hadoop-3.3.5
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hdfs namenode -format
2023-05-14 17:54:12,552 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
This log line shows the format succeeded.
Start HDFS:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/start-dfs.sh
Starting namenodes on [192.168.88.24]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [zhiyong-hive-on-flink1]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5#
As expected, it fails.
The script itself explains why:
## startup matrix:
#
# if $EUID != 0, then exec
# if $EUID =0 then
# if hdfs_subcmd_user is defined, su to that user, exec
# if hdfs_subcmd_user is not defined, error
#
# For secure daemons, this means both the secure and insecure env vars need to be
# defined. e.g., HDFS_DATANODE_USER=root HDFS_DATANODE_SECURE_USER=hdfs
#
So these variables need to be added to the script:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# pwd
/export/server/hadoop-3.3.5
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# vim ./sbin/start-dfs.sh
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [192.168.88.24]
192.168.88.24: ERROR: JAVA_HOME is not set and could not be found.
Starting datanodes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: ERROR: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [zhiyong-hive-on-flink1]
zhiyong-hive-on-flink1: Warning: Permanently added 'zhiyong-hive-on-flink1' (ECDSA) to the list of known hosts.
zhiyong-hive-on-flink1: ERROR: JAVA_HOME is not set and could not be found.
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# echo $JAVA_HOME
/export/server/jdk-17.0.7
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# vim ./etc/hadoop/hadoop-env.sh
So:
# The java implementation to use. By default, this environment
53 # variable is REQUIRED on ALL platforms except OS X!
54 # export JAVA_HOME=
55 export JAVA_HOME=$JAVA_HOME
Referencing the environment variable does not work; the path has to be hard-coded:
export JAVA_HOME=/export/server/jdk-17.0.7
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [192.168.88.24]
Starting datanodes
Starting secondary namenodes [zhiyong-hive-on-flink1]
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# jps
5232 DataNode
5668 Jps
5501 SecondaryNameNode
5069 NameNode
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5#
HDFS now starts successfully, but the stop command fails:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/stop-dfs.sh
Stopping namenodes on [192.168.88.24]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Stopping datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Stopping secondary namenodes [zhiyong-hive-on-flink1]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5#
So, likewise:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# vim ./sbin/stop-dfs.sh
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/stop-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Stopping namenodes on [192.168.88.24]
Stopping datanodes
Stopping secondary namenodes [zhiyong-hive-on-flink1]
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# jps
6507 Jps
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5#
Only after adding the same variables does the stop script work.
Then restart HDFS:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [192.168.88.24]
Starting datanodes
Starting secondary namenodes [zhiyong-hive-on-flink1]
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# jps
6756 NameNode
7348 Jps
6921 DataNode
7194 SecondaryNameNode
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5#
The web UI is now reachable:
http://192.168.88.24:50070/
But it runs into: Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error
This is, of course, because the JDK is too new and a package is missing: java.activation was deprecated in JDK 9
and removed entirely in JDK 11, so naturally it is gone in JDK 17... We'll make do for now...
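One workaround I have seen for this class of error, not verified in this post (the jar version and placement are illustrative assumptions), is to drop the standalone javax.activation-api jar onto Hadoop's classpath so the NameNode web UI can serve /webhdfs again:
# sketch: fetch the standalone activation API (removed from the JDK since 11) into Hadoop's common lib dir
wget https://repo1.maven.org/maven2/javax/activation/javax.activation-api/1.2.0/javax.activation-api-1.2.0.jar \
     -P /export/server/hadoop-3.3.5/share/hadoop/common/lib/
# then restart HDFS
/export/server/hadoop-3.3.5/sbin/stop-dfs.sh && /export/server/hadoop-3.3.5/sbin/start-dfs.sh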
Verify HDFS anyway:
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# vim /home/zhiyong/test1.txt
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# cat /home/zhiyong/test1.txt
用于验证文件是否发送成功 by:CSDN@虎鲸不是鱼
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -put /home/zhiyong/test1.txt hdfs://192.168.88.24:9000/test1
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -ls hdfs://192.168.88.24:9000/test1
Found 1 items
-rw-r--r-- 1 root supergroup 63 2023-05-14 19:11 hdfs://192.168.88.24:9000/test1/test1.txt
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -cat hdfs://192.168.88.24:9000/test1/test1.txt
用于验证文件是否发送成功 by:CSDN@虎鲸不是鱼
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5#
So under JDK 17, HDFS is at least usable...
root@zhiyong-hive-on-flink1:~# sudo apt-get install mysql-server
root@zhiyong-hive-on-flink1:~# cd /etc/mysql/
root@zhiyong-hive-on-flink1:/etc/mysql# ll
总用量 40
drwxr-xr-x   4 root root  4096 5月  14 19:23 ./
drwxr-xr-x 132 root root 12288 5月  14 19:23 ../
drwxr-xr-x   2 root root  4096 2月  23  2022 conf.d/
-rw-------   1 root root   317 5月  14 19:23 debian.cnf
-rwxr-xr-x   1 root root   120 4月  21 22:17 debian-start*
lrwxrwxrwx   1 root root    24 5月  14 14:59 my.cnf -> /etc/alternatives/my.cnf
-rw-r--r--   1 root root   839 8月   3  2016 my.cnf.fallback
-rw-r--r--   1 root root   682 11月 16 04:42 mysql.cnf
drwxr-xr-x   2 root root  4096 5月  14 19:23 mysql.conf.d/
root@zhiyong-hive-on-flink1:/etc/mysql# cat debian.cnf
# Automatically generated for Debian scripts. DO NOT TOUCH!
[client]
host     = localhost
user     = debian-sys-maint
password = PnqdmcrBnP2vLCE8
socket   = /var/run/mysqld/mysqld.sock
[mysql_upgrade]
host     = localhost
user     = debian-sys-maint
password = PnqdmcrBnP2vLCE8
socket   = /var/run/mysqld/mysqld.sock
root@zhiyong-hive-on-flink1:/etc/mysql# mysql
mysql> ALTER USER root@localhost IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> exit
Bye
root@zhiyong-hive-on-flink1:/etc/mysql# pwd
/etc/mysql
root@zhiyong-hive-on-flink1:/etc/mysql# cd mysql.conf.d/
root@zhiyong-hive-on-flink1:/etc/mysql/mysql.conf.d# ll
总用量 16
drwxr-xr-x 2 root root 4096 5月  14 19:23 ./
drwxr-xr-x 4 root root 4096 5月  14 19:23 ../
-rw-r--r-- 1 root root  132 11月 16 04:42 mysql.cnf
-rw-r--r-- 1 root root 2220 11月 16 04:42 mysqld.cnf
root@zhiyong-hive-on-flink1:/etc/mysql/mysql.conf.d# vim mysqld.cnf
#bind-address           = 127.0.0.1      # comment this line out to allow remote connections
root@zhiyong-hive-on-flink1:/etc/mysql/mysql.conf.d# mysql
create user 'root'@'%' identified by '123456';
grant all privileges on *.* to 'root'@'%' with grant option;
flush privileges;
After granting privileges, DataGrip can connect:
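(The DataGrip screenshot is not reproduced here.) Note that mysqld generally has to be restarted after commenting out bind-address, and the grant can be sanity-checked from the shell first; a quick check, assuming the password set above:
sudo systemctl restart mysql
# verify that root@'%' can log in over TCP
mysql -h 192.168.88.24 -P 3306 -uroot -p123456 -e "select version(), current_user();"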
With the MySQL metastore database ready, Hive can be installed.
root@zhiyong-hive-on-flink1:/export/software# cp /home/zhiyong/apache-hive-3.1.3-bin.tar.gz /export/software/
root@zhiyong-hive-on-flink1:/export/software# ll
总用量 1186736
drwxrwxrwx 2 root root      4096 5月  14 19:58 ./
drwxrwxrwx 4 root root      4096 5月  14 15:55 ../
-rw-r--r-- 1 root root 326940667 5月  14 19:57 apache-hive-3.1.3-bin.tar.gz
-rw-r--r-- 1 root root 706533213 5月  14 17:23 hadoop-3.3.5.tar.gz
-rw-r--r-- 1 root root 181719178 5月  14 15:56 jdk-17_linux-x64_bin.tar.gz
root@zhiyong-hive-on-flink1:/export/software# tar -zxvf apache-hive-3.1.3-bin.tar.gz -C /export/server/
root@zhiyong-hive-on-flink1:/export/software# cd /export/server/
root@zhiyong-hive-on-flink1:/export/server# ll
总用量 20
drwxrwxrwx  5 root root 4096 5月  14 20:00 ./
drwxrwxrwx  4 root root 4096 5月  14 15:55 ../
drwxr-xr-x 10 root root 4096 5月  14 20:00 apache-hive-3.1.3-bin/
drwxr-xr-x 11 2002 2002 4096 5月  14 17:54 hadoop-3.3.5/
drwxr-xr-x  9 root root 4096 5月  14 15:57 jdk-17.0.7/
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/lib# cp /home/zhiyong/mysql-connector-java-8.0.28.jar /export/server/apache-hive-3.1.3-bin/lib/
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/lib# ll | grep mysql
-rw-r--r-- 1 root root  2476480 5月  14 20:07 mysql-connector-java-8.0.28.jar
-rw-r--r-- 1 root staff   10476 12月 20  2019 mysql-metadata-storage-0.12.0.jar
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# pwd
/export/server/apache-hive-3.1.3-bin/conf
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# cp ./hive-env.sh.template hive-env.sh
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# vim hive-env.sh
# add:
HADOOP_HOME=/export/server/hadoop-3.3.5
export HIVE_CONF_DIR=/export/server/apache-hive-3.1.3-bin/conf
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# vim /etc/profile
# append at the end:
export HIVE_HOME=/export/server/apache-hive-3.1.3-bin
export PATH=:$PATH:$HIVE_HOME/bin
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# source /etc/profile
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# touch hive-site.xml
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# chmod 666 hive-site.xml
Write the configuration via the Ubuntu GUI:
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.88.24:3306/hivemetadata?createDatabaseIfNotExist=true&amp;useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>datanucleus.schema.autoCreateAll</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>192.168.88.24</value>
    </property>
</configuration>
Create Hive's directories on HDFS:
/export/server/hadoop-3.3.5/bin/hadoop fs -mkdir -p /user/hive/warehouse
/export/server/hadoop-3.3.5/bin/hadoop fs -mkdir -p /tmp
/export/server/hadoop-3.3.5/bin/hadoop fs -chmod g+w /tmp
/export/server/hadoop-3.3.5/bin/hadoop fs -chmod g+w /user/hive/warehouse
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# /export/server/hadoop-3.3.5/bin/hadoop fs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2023-05-14 19:11 /test1
drwxrwxr-x - root supergroup 0 2023-05-14 20:27 /tmp
drwxr-xr-x - root supergroup 0 2023-05-14 20:26 /user
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf#
Next, initialize Hive's metastore schema:
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# pwd
/export/server/apache-hive-3.1.3-bin/bin
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# schematool -dbType mysql -initSchema
Initialization script completed
schemaTool completed
Then start Hive:
hive --service metastore > /dev/null 2>&1 &
hiveserver2 > /dev/null 2>&1 &
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# jps
12513 RunJar
7698 NameNode
12694 RunJar
7302 SecondaryNameNode
7065 DataNode
12844 Jps
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# beeline -u jdbc:hive2://localhost:10000/ -n root
# failed
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/export/server/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/export/server/hadoop-3.3.5/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 957f3413-ca47-43da-a6a4-c5bbf9597de5
Exception in thread "main" java.lang.ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are in module java.base of loader 'bootstrap')
        at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:413)
        at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:389)
        at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:328)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:241)
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# netstat -atunlp | grep 9083
tcp6       0      0 :::9083                 :::*                    LISTEN      12513/java
But the MetaStore itself did start successfully!!!
Clearly this is the JDK again... Hive really only seems to be happy on JDK 1.8.
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# jps
12513 RunJar
7698 NameNode
12694 RunJar
7302 SecondaryNameNode
13334 Jps
7065 DataNode
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# kill -9 12513
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# kill -9 12694
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# hive --service metastore > /dev/null 2>&1 &
[1] 13397
Following: https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/table/sql-gateway/overview/
root@zhiyong-hive-on-flink1:/export/software# cp /home/zhiyong/flink-1.17.0-bin-scala_2.12.tgz /export/software/
root@zhiyong-hive-on-flink1:/export/software# tar -zxvf flink-1.17.0-bin-scala_2.12.tgz -C /export/server/
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0# ./bin/sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2
The startup script produces no visible output.
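In daemon mode nothing is printed to the console, so the only way to tell whether the gateway is alive is to check the process, the listening port and the log. A quick sketch (the log file name pattern is my assumption based on flink-daemon.sh naming; 10000 is the HiveServer2 endpoint's default port):
jps | grep SqlGateway
netstat -atunlp | grep 10000
tail -n 50 /export/server/flink-1.17.0/log/flink-*-sql-gateway-*.log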
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# sql-client.sh
Flink SQL> show databases;
+------------------+
| database name |
+------------------+
| default_database |
+------------------+
1 row in set
Flink SQL> select 1 as col1;
[ERROR] Could not execute SQL statement. Reason:
java.lang.reflect.InaccessibleObjectException: Unable to make field private static final int java.lang.Class.ANNOTATION accessible: module java.base does not "opens java.lang" to unnamed module @74582ff6
Flink SQL>
Clearly Flink 1.17 does not support JDK 17!!! Better to fall back to JDK 11 as recommended.
Since downloading Oracle JDK 11 requires registration, grab an OpenJDK build instead: http://jdk.java.net/archive/
root@zhiyong-hive-on-flink1:/export/software# ll
总用量 1645100
drwxrwxrwx 2 root root      4096 5月  14 20:50 ./
drwxrwxrwx 4 root root      4096 5月  14 15:55 ../
-rw-r--r-- 1 root root 326940667 5月  14 19:57 apache-hive-3.1.3-bin.tar.gz
-rw-r--r-- 1 root root 469363537 5月  14 20:50 flink-1.17.0-bin-scala_2.12.tgz
-rw-r--r-- 1 root root 706533213 5月  14 17:23 hadoop-3.3.5.tar.gz
-rw-r--r-- 1 root root 181719178 5月  14 15:56 jdk-17_linux-x64_bin.tar.gz
root@zhiyong-hive-on-flink1:/export/software# cp /home/zhiyong/openjdk-11_linux-x64_bin.tar.gz /export/software/
root@zhiyong-hive-on-flink1:/export/software# tar -zxvf openjdk-11_linux-x64_bin.tar.gz -C /export/server/
root@zhiyong-hive-on-flink1:/export/server/jdk-11# pwd
/export/server/jdk-11
root@zhiyong-hive-on-flink1:/export/server/jdk-11# vim /etc/profile
# change to: export JAVA_HOME=/export/server/jdk-11
root@zhiyong-hive-on-flink1:/export/server/jdk-11# source /etc/profile
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# pwd
/export/server/hadoop-3.3.5
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# vim ./etc/hadoop/hadoop-env.sh
# change to: export JAVA_HOME=/export/server/jdk-11
root@zhiyong-hive-on-flink1:/home/zhiyong# java -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [192.168.88.24]
Starting datanodes
Starting secondary namenodes [192.168.88.24]
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hdfs namenode -format
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./sbin/start-dfs.sh
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -put /home/zhiyong/test1.txt hdfs://192.168.88.24:9000/test1
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -ls hdfs://192.168.88.24:9000/test1
-rw-r--r--   1 root supergroup         63 2023-05-14 22:51 hdfs://192.168.88.24:9000/test1
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -cat hdfs://192.168.88.24:9000/test1
用于验证文件是否发送成功 by:CSDN@虎鲸不是鱼
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -mkdir -p /user/hive/warehouse
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -mkdir -p /tmp
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -chmod g+w /tmp
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -chmod g+w /user/hive/warehouse
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5# ./bin/hadoop fs -ls /
Found 3 items
-rw-r--r--   1 root supergroup         63 2023-05-14 22:51 /test1
drwxrwxr-x   - root supergroup          0 2023-05-14 22:55 /tmp
drwxr-xr-x   - root supergroup          0 2023-05-14 22:55 /user
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# hive --service metastore > /dev/null 2>&1 &
[1] 4797
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/export/server/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/export/server/hadoop-3.3.5/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = e5d7b83d-9cf6-4cde-a945-511b919da96a
Exception in thread "main" java.lang.ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are in module java.base of loader 'bootstrap')
        at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:413)
        at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:389)
        at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:328)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:241)
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin#
Clearly Hive still really wants JDK 1.8; its JDK 11 support is not friendly either. Hive is simply a very old project...
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# cp /home/zhiyong/flink-connector-hive_2.12-1.17.0.jar /export/server/flink-1.17.0/lib
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# cp /export/server/apache-hive-3.1.3-bin/lib/hive-exec-3.1.3.jar /export/server/flink-1.17.0/lib
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# cp /export/server/apache-hive-3.1.3-bin/lib/hive-metastore-3.1.3.jar /export/server/flink-1.17.0/lib
root@zhiyong-hive-on-flink1:/home/zhiyong# cp /home/zhiyong/antlr-runtime-3.5.2.jar /export/server/flink-1.17.0/lib
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# ll
总用量 261608
drwxr-xr-x  2 root root      4096 5月  14 23:41 ./
drwxr-xr-x 10 root root      4096 3月  17 20:22 ../
-rw-r--r--  1 root root    167761 5月  14 23:41 antlr-runtime-3.5.2.jar
-rw-r--r--  1 root root    196487 3月  17 20:07 flink-cep-1.17.0.jar
-rw-r--r--  1 root root    542616 3月  17 20:10 flink-connector-files-1.17.0.jar
-rw-r--r--  1 root root   8876209 5月  14 23:26 flink-connector-hive_2.12-1.17.0.jar
-rw-r--r--  1 root root    102468 3月  17 20:14 flink-csv-1.17.0.jar
-rw-r--r--  1 root root 135969953 3月  17 20:22 flink-dist-1.17.0.jar
-rw-r--r--  1 root root    180243 3月  17 20:13 flink-json-1.17.0.jar
-rw-r--r--  1 root root  21043313 3月  17 20:20 flink-scala_2.12-1.17.0.jar
-rw-r--r--  1 root root  15407474 3月  17 20:21 flink-table-api-java-uber-1.17.0.jar
-rw-r--r--  1 root root  37975208 3月  17 20:15 flink-table-planner-loader-1.17.0.jar
-rw-r--r--  1 root root   3146205 3月  17 20:07 flink-table-runtime-1.17.0.jar
-rw-r--r--  1 root root  41873153 5月  14 23:29 hive-exec-3.1.3.jar
-rw-r--r--  1 root root     36983 5月  14 23:29 hive-metastore-3.1.3.jar
-rw-r--r--  1 root root    208006 3月  17 17:31 log4j-1.2-api-2.17.1.jar
-rw-r--r--  1 root root    301872 3月  17 17:31 log4j-api-2.17.1.jar
-rw-r--r--  1 root root   1790452 3月  17 17:31 log4j-core-2.17.1.jar
-rw-r--r--  1 root root     24279 3月  17 17:31 log4j-slf4j-impl-2.17.1.jar
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# pwd
/export/server/flink-1.17.0/bin
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host zhiyong-hive-on-flink1.
Starting taskexecutor daemon on host zhiyong-hive-on-flink1.
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0# ./bin/flink run examples/streaming/WordCount.jar
Executing example with default input data.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/export/server/flink-1.17.0/lib/flink-dist-1.17.0.jar) to field java.lang.String.value
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Job has been submitted with JobID 2149aad493c0ab55386d31d1c1663be2
Program execution finished
Job with JobID 2149aad493c0ab55386d31d1c1663be2 has finished.
Job Runtime: 1014 ms
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0# tail log/flink-*-taskexecutor-*.out
(nymph,1)
(in,3)
(thy,1)
(orisons,1)
(be,4)
(all,2)
(my,1)
(sins,1)
(remember,1)
(d,4)
So Flink itself is working fine at this point.
Reference: https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/table/hive/hive_catalog/
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# sql-client.sh
Flink SQL> CREATE CATALOG zhiyonghive WITH (
> 'type' = 'hive',
> 'hive-conf-dir' = '/export/server/apache-hive-3.1.3-bin/conf/'
> );
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/hadoop-common-3.1.1.jar /export/server/flink-1.17.0/lib/
Flink SQL> CREATE CATALOG zhiyonghive WITH (
> 'type' = 'hive',
> 'hive-conf-dir' = '/export/server/apache-hive-3.1.3-bin/conf/'
> );
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper
Now yet another exception...
Use this Maven GAV:
<dependency>
<groupId>com.fasterxml.woodstox</groupId>
<artifactId>woodstox-core</artifactId>
<version>5.0.3</version>
</dependency>
Download the jar and drop it into lib...
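Instead of copying jars from arbitrary local paths, they can also be pulled straight from Maven Central by GAV; a sketch for the two Woodstox-related jars, assuming direct access to repo1.maven.org:
cd /export/server/flink-1.17.0/lib
wget https://repo1.maven.org/maven2/com/fasterxml/woodstox/woodstox-core/5.0.3/woodstox-core-5.0.3.jar
wget https://repo1.maven.org/maven2/org/codehaus/woodstox/stax2-api/3.1.4/stax2-api-3.1.4.jar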
root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/woodstox-core-5.0.3.jar /export/server/flink-1.17.0/lib/
root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/stax2-api-3.1.4.jar /export/server/flink-1.17.0/lib/

Flink SQL> CREATE CATALOG zhiyonghive WITH (
> 'type' = 'hive',
> 'hive-conf-dir' = '/export/server/apache-hive-3.1.3-bin/conf/'
> );
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory

root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/commons-logging-1.2.jar /export/server/flink-1.17.0/lib/
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobConf

root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/hadoop-mapreduce-client-core-3.1.1.jar /export/server/flink-1.17.0/lib/
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.commons.configuration2.Configuration

root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/commons-configuration2-2.1.1.jar /export/server/flink-1.17.0/lib/
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName

root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/hadoop-auth-3.1.0.jar /export/server/flink-1.17.0/lib/
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.htrace.core.Tracer$Builder

root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/htrace-core4-4.1.0-incubating.jar /export/server/flink-1.17.0/lib/
[ERROR] Could not execute SQL statement. Reason:
java.lang.IllegalArgumentException: Embedded metastore is not allowed. Make sure you have set a valid value for hive.metastore.uris
Hive's hive-site.xml also needs this addition:
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.88.24:9083</value>
</property>
Then:
root@zhiyong-hive-on-flink1:/export/server/jdk-11/bin# cd /export/server/apache-hive-3.1.3-bin/bin
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# hive --service metastore > /dev/null 2>&1 &
[1] 10835
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: com.facebook.fb303.FacebookService$Iface
root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/libfb303-0.9.3.jar /export/server/flink-1.17.0/lib/
[ERROR] Could not execute SQL statement. Reason:
java.net.ConnectException: 拒绝连接 (Connection refused)
Clearly the Hive MetaStore has died again:
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# ./hive --service metastore
Caused by: com.mysql.cj.exceptions.UnableToConnectException: Public Key Retrieval is not allowed
        at jdk.internal.reflect.GeneratedConstructorAccessor79.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
        at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
        at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:85)
        at com.mysql.cj.protocol.a.authentication.CachingSha2PasswordPlugin.nextAuthenticationStep(CachingSha2PasswordPlugin.java:130)
        at com.mysql.cj.protocol.a.authentication.CachingSha2PasswordPlugin.nextAuthenticationStep(CachingSha2PasswordPlugin.java:49)
        at com.mysql.cj.protocol.a.NativeAuthenticationProvider.proceedHandshakeWithPluggableAuthentication(NativeAuthenticationProvider.java:445)
        at com.mysql.cj.protocol.a.NativeAuthenticationProvider.connect(NativeAuthenticationProvider.java:211)
        at com.mysql.cj.protocol.a.NativeProtocol.connect(NativeProtocol.java:1369)
        at com.mysql.cj.NativeSession.connect(NativeSession.java:133)
        at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:949)
        at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:819)
        ... 74 more
This is the downside of using a newer MySQL version!!!
Modify Hive's configuration accordingly:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.88.24:3306/hivemetadata?createDatabaseIfNotExist=true&allowPublicKeyRetrieval=true&useSSL=false&serviceTimezone=UTC</value>
</property>
Still an error:
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '=' (code 61); expected a semi-colon after the reference for entity 'useSSL'
at [row,col,system-id]: [12,124,"file:/export/server/apache-hive-3.1.3-bin/conf/hive-site.xml"]
at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:666)
at com.ctc.wstx.sr.StreamScanner.parseEntityName(StreamScanner.java:2080)
at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1538)
at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4765)
at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3789)
at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3743)
... 17 more
It needs to be written with the ampersands escaped as XML entities:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.88.24:3306/hivemetadata?createDatabaseIfNotExist=true&amp;allowPublicKeyRetrieval=true&amp;useSSL=false&amp;serviceTimezone=UTC</value>
</property>
Now the Flink SQL Client can create the Catalog successfully:
Flink SQL> CREATE CATALOG zhiyonghive WITH (
> 'type' = 'hive',
> 'hive-conf-dir' = '/export/server/apache-hive-3.1.3-bin/conf/'
> );
[INFO] Execute statement succeed.
Keep the Hive MetaStore resident in the background:
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/bin# hive --service metastore > /dev/null 2>&1 &
[1] 12138

Flink SQL> use catalog zhiyonghive;
[INFO] Execute statement succeed.

Flink SQL> show databases;
+---------------+
| database name |
+---------------+
|       default |
+---------------+
1 row in set

Flink SQL> create database if not exists zhiyong_flink_db;
[INFO] Execute statement succeed.

Flink SQL> show databases;
+------------------+
|    database name |
+------------------+
|          default |
| zhiyong_flink_db |
+------------------+
2 rows in set
Connect to MySQL and look at the metadata:
select * from hivemetadata.DBS;
Clearly the Flink integration with Hive now works!!!
Flink SQL> CREATE TABLE test1 (id int,name string)
> with (
> 'connector'='hive',
> 'is_generic' = 'false'
> )
> ;
[INFO] Execute statement succeed.
Check the metadata again:
select * from hivemetadata.TBLS;
Sure enough, the metadata now contains one more Hive table [a managed table].
Flink SQL> insert into test1 values(1,'暴龙兽1');
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/export/server/flink-1.17.0/lib/flink-dist-1.17.0.jar) to field java.lang.Class.ANNOTATION
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[INFO] Submitting SQL update statement to the cluster...
[INFO] SQL update statement has been successfully submitted to the cluster:
Job ID: e16b471d5f1628d552a3393426cbb0bb
Flink SQL> select * from test1;
[ERROR] Could not execute SQL statement. Reason:
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "hdfs"
So more jars need to be copied. Along the way there were quite a few ClassNotFoundExceptions, all handled the same way: manually drop the missing jar into Flink's lib directory.
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5/share/hadoop/hdfs# pwd
/export/server/hadoop-3.3.5/share/hadoop/hdfs
root@zhiyong-hive-on-flink1:/export/server/hadoop-3.3.5/share/hadoop/hdfs# cp ./*.jar /export/server/flink-1.17.0/lib/

Flink SQL> select * from zhiyong_flink_db.test1;
[ERROR] Could not execute SQL statement. Reason:
java.lang.NoSuchMethodError: org.apache.hadoop.fs.FsTracer.get(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/Tracer;

Flink SQL> set table.sql-dialect=hive;
[INFO] Execute statement succeed.

Flink SQL> select * from zhiyong_flink_db.test1;
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.util.ExceptionUtils (file:/export/server/flink-1.17.0/lib/flink-dist-1.17.0.jar) to field java.lang.Throwable.detailMessage
WARNING: Please consider reporting this to the maintainers of org.apache.flink.util.ExceptionUtils
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.

Available factory identifiers are:

Note: if you want to use Hive dialect, please first move the jar `flink-table-planner_2.12` located in `FLINK_HOME/opt` to `FLINK_HOME/lib` and then move out the jar `flink-table-planner-loader` from `FLINK_HOME/lib`.
Pitfalls everywhere!!!
Per the official docs: https://nightlies.apache.org/flink/flink-docs-release-1.17/zh/docs/dev/table/sql-gateway/overview/
The content of this script:
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# cat sql-gateway.sh
#!/usr/bin/env bash

function usage() {
  echo "Usage: sql-gateway.sh [start|start-foreground|stop|stop-all] [args]"
  echo "  commands:"
  echo "    start               - Run a SQL Gateway as a daemon"
  echo "    start-foreground    - Run a SQL Gateway as a console application"
  echo "    stop                - Stop the SQL Gateway daemon"
  echo "    stop-all            - Stop all the SQL Gateway daemons"
  echo "    -h | --help         - Show this help message"
}

################################################################################
# Adopted from "flink" bash script
################################################################################

target="$0"
# For the case, the executable has been directly symlinked, figure out
# the correct bin path by following its symlink up to an upper bound.
# Note: we can't use the readlink utility here if we want to be POSIX
# compatible.
iteration=0
while [ -L "$target" ]; do
    if [ "$iteration" -gt 100 ]; then
        echo "Cannot resolve path: You have a cyclic symlink in $target."
        break
    fi
    ls=`ls -ld -- "$target"`
    target=`expr "$ls" : '.* -> \(.*\)$'`
    iteration=$((iteration + 1))
done

# Convert relative path to absolute path
bin=`dirname "$target"`

# get flink config
. "$bin"/config.sh

if [ "$FLINK_IDENT_STRING" = "" ]; then
    FLINK_IDENT_STRING="$USER"
fi

################################################################################
# SQL gateway specific logic
################################################################################

ENTRYPOINT=sql-gateway

if [[ "$1" = *--help ]] || [[ "$1" = *-h ]]; then
  usage
  exit 0
fi

STARTSTOP=$1

if [ -z "$STARTSTOP" ]; then
  STARTSTOP="start"
fi

if [[ $STARTSTOP != "start" ]] && [[ $STARTSTOP != "start-foreground" ]] && [[ $STARTSTOP != "stop" ]] && [[ $STARTSTOP != "stop-all" ]]; then
  usage
  exit 1
fi

# ./sql-gateway.sh start --help, print the message to the console
if [[ "$STARTSTOP" = start* ]] && ( [[ "$*" = *--help* ]] || [[ "$*" = *-h* ]] ); then
  FLINK_TM_CLASSPATH=`constructFlinkClassPath`
  SQL_GATEWAY_CLASSPATH=`findSqlGatewayJar`
  "$JAVA_RUN" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$SQL_GATEWAY_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" org.apache.flink.table.gateway.SqlGateway "${@:2}"
  exit 0
fi

if [[ $STARTSTOP == "start-foreground" ]]; then
    exec "${FLINK_BIN_DIR}"/flink-console.sh $ENTRYPOINT "${@:2}"
else
    "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP $ENTRYPOINT "${@:2}"
fi
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin#
To avoid passing a pile of parameters every time, hard-code them in Flink's configuration file:
root@zhiyong-hive-on-flink1:/export/server# vim ./flink-1.17.0/conf/flink-conf.yaml
# add these two key-value pairs
sql-gateway.endpoint.type: hiveserver2
sql-gateway.endpoint.hiveserver2.catalog.hive-conf-dir: /export/server/apache-hive-3.1.3-bin/conf
This trims the command line down a bit. Try starting it:
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# pwd
/export/server/flink-1.17.0/bin
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# ./sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2
Starting sql-gateway daemon on host zhiyong-hive-on-flink1.
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# jps
2496 RunJar
3794 TaskManagerRunner
4276 Jps
3499 StandaloneSessionClusterEntrypoint
4237 SqlGateway
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# ./sql-gateway.sh stop
Stopping sql-gateway daemon (pid: 4237) on host zhiyong-hive-on-flink1.
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/bin# ./sql-gateway.sh start-foreground \
> -Dsql-gateway.session.check-interval=10min \
> -Dsql-gateway.endpoint.type=hiveserver2 \
> -Dsql-gateway.endpoint.hiveserver2.catalog.hive-conf-dir=/export/server/apache-hive-3.1.3-bin/conf \
> -Dsql-gateway.endpoint.hiveserver2.catalog.default-database=zhiyong_flink_db \
> -Dsql-gateway.endpoint.hiveserver2.catalog.name=hive
01:39:22.132 [hiveserver2-endpoint-thread-pool-thread-1] ERROR org.apache.thrift.server.TThreadPoolServer - Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:228) ~[hive-exec-3.1.3.jar:3.1.3]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) ~[hive-exec-3.1.3.jar:3.1.3]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3.jar:3.1.3]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
The process appears to start, but it cannot actually be used.
The problems all center on Hive.
So JDK 1.8 has to be deployed after all. I will not repeat that process in detail here.
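For completeness, a minimal sketch of what that looks like (using the Ubuntu package rather than a vendor tarball is my choice here; the original does not say which JDK 8 build was used):
sudo apt install openjdk-8-jdk
# point JAVA_HOME at the new JDK in /etc/profile and in hadoop-env.sh, then reload
# export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
source /etc/profile
java -version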
sql-gateway.sh start-foreground \
-Dsql-gateway.session.check-interval=10min \
-Dsql-gateway.endpoint.type=hiveserver2 \
-Dsql-gateway.endpoint.hiveserver2.catalog.hive-conf-dir=/export/server/apache-hive-3.1.3-bin/conf \
-Dsql-gateway.endpoint.hiveserver2.catalog.default-database=zhiyong_flink_db \
-Dsql-gateway.endpoint.hiveserver2.catalog.name=hive \
-Dsql-gateway.endpoint.hiveserver2.module.name=hive
But:
02:44:47.970 [hiveserver2-endpoint-thread-pool-thread-1] ERROR org.apache.flink.table.endpoint.hive.HiveServer2Endpoint - Failed to GetInfo.
java.lang.UnsupportedOperationException: Unrecognized TGetInfoType value: CLI_ODBC_KEYWORDS.
        at org.apache.flink.table.endpoint.hive.HiveServer2Endpoint.GetInfo(HiveServer2Endpoint.java:379) [flink-connector-hive_2.12-1.17.0.jar:1.17.0]
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo.getResult(TCLIService.java:1537) [hive-exec-3.1.3.jar:3.1.3]
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo.getResult(TCLIService.java:1522) [hive-exec-3.1.3.jar:3.1.3]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.3.jar:3.1.3]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) [hive-exec-3.1.3.jar:3.1.3]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3.jar:3.1.3]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
02:45:22.027 [sql-gateway-operation-pool-thread-1] ERROR org.apache.flink.table.gateway.service.operation.OperationManager - Failed to execute the operation c6cf01d3-afe5-4da0-8619-1948c8353c1d.
org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.

Available factory identifiers are:

Note: if you want to use Hive dialect, please first move the jar `flink-table-planner_2.12` located in `FLINK_HOME/opt` to `FLINK_HOME/lib` and then move out the jar `flink-table-planner-loader` from `FLINK_HOME/lib`.
        at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:546) ~[flink-table-api-java-uber-1.17.0.jar:1.17.0]
        at org.apache.flink.table.planner.delegation.PlannerBase.getDialectFactory(PlannerBase.scala:161) ~[?:?]
        at org.apache.flink.table.planner.delegation.PlannerBase.getParser(PlannerBase.scala:171) ~[?:?]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.getParser(TableEnvironmentImpl.java:1764) ~[flink-table-api-java-uber-1.17.0.jar:1.17.0]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.<init>(TableEnvironmentImpl.java:240) ~[flink-table-api-java-uber-1.17.0.jar:1.17.0]
        at org.apache.flink.table.api.bridge.internal.AbstractStreamTableEnvironmentImpl.<init>(AbstractStreamTableEnvironmentImpl.java:89) ~[flink-table-api-java-uber-1.17.0.jar:1.17.0]
        at org.apache.flink.table.api.bridge.java.internal.StreamTableEnvironmentImpl.<init>(StreamTableEnvironmentImpl.java:84) ~[flink-table-api-java-uber-1.17.0.jar:1.17.0]
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.createStreamTableEnvironment(OperationExecutor.java:393) ~[flink-sql-gateway-1.17.0.jar:1.17.0]
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.getTableEnvironment(OperationExecutor.java:332) ~[flink-sql-gateway-1.17.0.jar:1.17.0]
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:190) ~[flink-sql-gateway-1.17.0.jar:1.17.0]
        at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212) ~[flink-sql-gateway-1.17.0.jar:1.17.0]
        at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119) ~[flink-sql-gateway-1.17.0.jar:1.17.0]
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:258) ~[flink-sql-gateway-1.17.0.jar:1.17.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_202]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_202]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_202]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_202]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
02:47:25.096 [sql-gateway-operation-pool-thread-2] ERROR org.apache.flink.table.gateway.service.operation.OperationManager - Failed to execute the operation 35155187-741f-4ea2-b6df-ee5f5b0f2dc8.
org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.
Beeline can connect:
root@zhiyong-hive-on-flink1:/home/zhiyong# beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/export/server/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/export/server/hadoop-3.3.5/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.3 by Apache Hive
beeline> !connect jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db;auth=noSasl
Connecting to jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db;auth=noSasl
Enter username for jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db: root
Enter password for jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db: ******
Connected to: Apache Flink (version 1.17)
Driver: Hive JDBC (version 3.1.3)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> show databases;
Error: org.apache.flink.table.gateway.service.utils.SqlExecutionException: Failed to execute the operation c6cf01d3-afe5-4da0-8619-1948c8353c1d.
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.processThrowable(OperationManager.java:414)
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:267)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.

Available factory identifiers are:

Note: if you want to use Hive dialect, please first move the jar `flink-table-planner_2.12` located in `FLINK_HOME/opt` to `FLINK_HOME/lib` and then move out the jar `flink-table-planner-loader` from `FLINK_HOME/lib`.
        at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:546)
        at org.apache.flink.table.planner.delegation.PlannerBase.getDialectFactory(PlannerBase.scala:161)
        at org.apache.flink.table.planner.delegation.PlannerBase.getParser(PlannerBase.scala:171)
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.getParser(TableEnvironmentImpl.java:1764)
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.<init>(TableEnvironmentImpl.java:240)
        at org.apache.flink.table.api.bridge.internal.AbstractStreamTableEnvironmentImpl.<init>(AbstractStreamTableEnvironmentImpl.java:89)
        at org.apache.flink.table.api.bridge.java.internal.StreamTableEnvironmentImpl.<init>(StreamTableEnvironmentImpl.java:84)
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.createStreamTableEnvironment(OperationExecutor.java:393)
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.getTableEnvironment(OperationExecutor.java:332)
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:190)
        at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212)
        at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119)
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:258)
        ... 7 more (state=,code=0)
0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> SET table.sql-dialect = default;
Error: org.apache.flink.table.gateway.service.utils.SqlExecutionException: Failed to execute the operation 35155187-741f-4ea2-b6df-ee5f5b0f2dc8.
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.processThrowable(OperationManager.java:414)
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:267)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.
But it fails with the same error again!
So follow the instructions given in the error message:
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/opt# pwd /export/server/flink-1.17.0/opt root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/opt# ll 总用量 277744 drwxr-xr-x 3 root root 4096 3月 17 20:22 ./ drwxr-xr-x 10 root root 4096 3月 17 20:22 ../ -rw-r--r-- 1 root root 28040881 3月 17 20:18 flink-azure-fs-hadoop-1.17.0.jar -rw-r--r-- 1 root root 48461 3月 17 20:21 flink-cep-scala_2.12-1.17.0.jar -rw-r--r-- 1 root root 46756459 3月 17 20:18 flink-gs-fs-hadoop-1.17.0.jar -rw-r--r-- 1 root root 26300214 3月 17 20:17 flink-oss-fs-hadoop-1.17.0.jar -rw-r--r-- 1 root root 32998666 3月 17 20:16 flink-python-1.17.0.jar -rw-r--r-- 1 root root 20400 3月 17 20:20 flink-queryable-state-runtime-1.17.0.jar -rw-r--r-- 1 root root 30938059 3月 17 20:17 flink-s3-fs-hadoop-1.17.0.jar -rw-r--r-- 1 root root 96609524 3月 17 20:17 flink-s3-fs-presto-1.17.0.jar -rw-r--r-- 1 root root 233709 3月 17 17:37 flink-shaded-netty-tcnative-dynamic-2.0.54.Final-16.1.jar -rw-r--r-- 1 root root 952711 3月 17 20:16 flink-sql-client-1.17.0.jar -rw-r--r-- 1 root root 210103 3月 17 20:14 flink-sql-gateway-1.17.0.jar -rw-r--r-- 1 root root 191815 3月 17 20:21 flink-state-processor-api-1.17.0.jar -rw-r--r-- 1 root root 21072371 3月 17 20:13 flink-table-planner_2.12-1.17.0.jar drwxr-xr-x 2 root root 4096 3月 17 20:16 python/ root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/opt# cp flink-table-planner_2.12-1.17.0.jar /export/server/flink-1.17.0/li lib/ licenses/ root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/opt# cp flink-table-planner_2.12-1.17.0.jar /export/server/flink-1.17.0/li lib/ licenses/ root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/opt# cp flink-table-planner_2.12-1.17.0.jar /export/server/flink-1.17.0/lib/ root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/opt# cd /export/server/flink-1.17.0/lib/ root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# ll 总用量 321488 drwxr-xr-x 2 root root 4096 5月 19 02:54 ./ drwxr-xr-x 10 root root 4096 3月 17 20:22 ../ -rw-r--r-- 1 root root 167761 5月 14 23:41 antlr-runtime-3.5.2.jar -rw-r--r-- 1 root root 616888 5月 15 00:33 commons-configuration2-2.1.1.jar -rw-r--r-- 1 root root 61829 5月 15 00:26 commons-logging-1.2.jar -rw-r--r-- 1 root root 196487 3月 17 20:07 flink-cep-1.17.0.jar -rw-r--r-- 1 root root 542616 3月 17 20:10 flink-connector-files-1.17.0.jar -rw-r--r-- 1 root root 8876209 5月 14 23:26 flink-connector-hive_2.12-1.17.0.jar -rw-r--r-- 1 root root 102468 3月 17 20:14 flink-csv-1.17.0.jar -rw-r--r-- 1 root root 135969953 3月 17 20:22 flink-dist-1.17.0.jar -rw-r--r-- 1 root root 180243 3月 17 20:13 flink-json-1.17.0.jar -rw-r--r-- 1 root root 21043313 3月 17 20:20 flink-scala_2.12-1.17.0.jar -rw-r--r-- 1 root root 15407474 3月 17 20:21 flink-table-api-java-uber-1.17.0.jar -rw-r--r-- 1 root root 21072371 5月 19 02:54 flink-table-planner_2.12-1.17.0.jar -rw-r--r-- 1 root root 37975208 3月 17 20:15 flink-table-planner-loader-1.17.0.jar -rw-r--r-- 1 root root 3146205 3月 17 20:07 flink-table-runtime-1.17.0.jar -rw-r--r-- 1 root root 138291 5月 15 00:35 hadoop-auth-3.1.0.jar -rw-r--r-- 1 root root 4034318 5月 15 00:15 hadoop-common-3.1.1.jar -rw-r--r-- 1 root root 4535144 5月 15 01:33 hadoop-common-3.3.5.jar -rw-r--r-- 1 root root 3474147 5月 15 01:33 hadoop-common-3.3.5-tests.jar -rw-r--r-- 1 root root 6296402 5月 15 01:29 hadoop-hdfs-3.3.5.jar -rw-r--r-- 1 root root 6137497 5月 15 01:29 hadoop-hdfs-3.3.5-tests.jar -rw-r--r-- 1 root root 5532342 5月 15 01:29 hadoop-hdfs-client-3.3.5.jar -rw-r--r-- 1 root root 129796 5月 15 01:29 hadoop-hdfs-client-3.3.5-tests.jar 
-rw-r--r-- 1 root root 251501 5月 15 01:29 hadoop-hdfs-httpfs-3.3.5.jar -rw-r--r-- 1 root root 9586 5月 15 01:29 hadoop-hdfs-native-client-3.3.5.jar -rw-r--r-- 1 root root 9586 5月 15 01:29 hadoop-hdfs-native-client-3.3.5-tests.jar -rw-r--r-- 1 root root 115593 5月 15 01:29 hadoop-hdfs-nfs-3.3.5.jar -rw-r--r-- 1 root root 1133476 5月 15 01:29 hadoop-hdfs-rbf-3.3.5.jar -rw-r--r-- 1 root root 450962 5月 15 01:29 hadoop-hdfs-rbf-3.3.5-tests.jar -rw-r--r-- 1 root root 96472 5月 15 01:33 hadoop-kms-3.3.5.jar -rw-r--r-- 1 root root 1654887 5月 15 00:30 hadoop-mapreduce-client-core-3.1.1.jar -rw-r--r-- 1 root root 170289 5月 15 01:33 hadoop-nfs-3.3.5.jar -rw-r--r-- 1 root root 189835 5月 15 01:33 hadoop-registry-3.3.5.jar -rw-r--r-- 1 root root 41873153 5月 14 23:29 hive-exec-3.1.3.jar -rw-r--r-- 1 root root 36983 5月 14 23:29 hive-metastore-3.1.3.jar -rw-r--r-- 1 root root 4101057 5月 15 00:42 htrace-core4-4.1.0-incubating.jar -rw-r--r-- 1 root root 56674 5月 15 01:33 javax.activation-api-1.2.0.jar -rw-r--r-- 1 root root 313702 5月 15 00:52 libfb303-0.9.3.jar -rw-r--r-- 1 root root 208006 3月 17 17:31 log4j-1.2-api-2.17.1.jar -rw-r--r-- 1 root root 301872 3月 17 17:31 log4j-api-2.17.1.jar -rw-r--r-- 1 root root 1790452 3月 17 17:31 log4j-core-2.17.1.jar -rw-r--r-- 1 root root 24279 3月 17 17:31 log4j-slf4j-impl-2.17.1.jar -rw-r--r-- 1 root root 161867 5月 15 00:24 stax2-api-3.1.4.jar -rw-r--r-- 1 root root 512742 5月 15 00:20 woodstox-core-5.0.3.jar root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# mv flink-table-planner-loader-1.17.0.jar flink-table-planner-loader-1.17.0.jar_bak root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib#
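For reference, the jar shuffle in the terminal dump above boils down to two steps (paths assume the /export/server/flink-1.17.0 layout used throughout this article; this is only a sketch of what was done, not an official procedure):

FLINK_HOME=/export/server/flink-1.17.0
# Enable the Hive dialect: the "real" planner must go into lib/ ...
cp "$FLINK_HOME/opt/flink-table-planner_2.12-1.17.0.jar" "$FLINK_HOME/lib/"
# ... and the planner loader must be taken out of lib/ at the same time.
mv "$FLINK_HOME/lib/flink-table-planner-loader-1.17.0.jar" "$FLINK_HOME/lib/flink-table-planner-loader-1.17.0.jar_bak"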
Quite a few dependency jars have already been handled by hand, yet it still fails:
beeline> !connect jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db;auth=noSasl Connecting to jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db;auth=noSasl Enter username for jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db: root Enter password for jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db: ****** Connected to: Apache Flink (version 1.17) Driver: Hive JDBC (version 3.1.3) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> show databases; Error: org.apache.flink.table.gateway.service.utils.SqlExecutionException: Failed to execute the operation 7c60e354-b2af-4c1b-a364-4a4d48a8ff8b. at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.processThrowable(OperationManager.java:414) at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:267) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ExceptionInInitializerError at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.table.catalog.hive.client.HiveShimV120.registerTemporaryFunction(HiveShimV120.java:262) at org.apache.flink.table.planner.delegation.hive.HiveParser.parse(HiveParser.java:212) at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:191) at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212) at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119) at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:258) ... 7 more Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:85) at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:203) ... 17 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:83) ... 20 more Caused by: java.lang.NoClassDefFoundError: org/apache/commons/codec/language/Soundex at org.apache.hadoop.hive.ql.udf.generic.GenericUDFSoundex.<init>(GenericUDFSoundex.java:49) ... 
25 more Caused by: java.lang.ClassNotFoundException: org.apache.commons.codec.language.Soundex at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 26 more (state=,code=0) 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> SET table.sql-dialect = default; +---------+ | result | +---------+ | OK | +---------+ 1 row selected (0.206 seconds) 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> show databases; +-------------------+ | database name | +-------------------+ | default | | zhiyong_flink_db | +-------------------+ 2 rows selected (0.383 seconds) 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> use zhiyong_flink_db; +---------+ | result | +---------+ | OK | +---------+ 1 row selected (0.03 seconds) 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> show tables; +-------------+ | table name | +-------------+ | test1 | +-------------+ 1 row selected (0.044 seconds) 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f> select * from test1; Error: org.apache.flink.table.gateway.service.utils.SqlExecutionException: Failed to execute the operation 627c0f0b-8c1d-4882-a0d4-085abf05ab75. at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.processThrowable(OperationManager.java:414) at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:267) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.fs.FsTracer.get(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/Tracer; at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:323) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:308) at org.apache.hadoop.hdfs.DistributedFileSystem.initDFSClient(DistributedFileSystem.java:202) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) at org.apache.flink.connectors.hive.HiveSourceFileEnumerator.getNumFiles(HiveSourceFileEnumerator.java:195) at org.apache.flink.connectors.hive.HiveTableSource.lambda$getDataStream$0(HiveTableSource.java:174) at org.apache.flink.connectors.hive.HiveParallelismInference.logRunningTime(HiveParallelismInference.java:107) at org.apache.flink.connectors.hive.HiveParallelismInference.infer(HiveParallelismInference.java:89) at org.apache.flink.connectors.hive.HiveTableSource.getDataStream(HiveTableSource.java:172) at org.apache.flink.connectors.hive.HiveTableSource$1.produceDataStream(HiveTableSource.java:138) at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecTableSourceScan.translateToPlanInternal(CommonExecTableSourceScan.java:140) at org.apache.flink.table.planner.plan.nodes.exec.batch.BatchExecTableSourceScan.translateToPlanInternal(BatchExecTableSourceScan.java:101) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:161) at org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:257) at org.apache.flink.table.planner.plan.nodes.exec.batch.BatchExecSink.translateToPlanInternal(BatchExecSink.java:65) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:161) at org.apache.flink.table.planner.delegation.BatchPlanner.$anonfun$translateToPlan$1(BatchPlanner.scala:93) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.Iterator.foreach(Iterator.scala:937) at scala.collection.Iterator.foreach$(Iterator.scala:937) at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) at scala.collection.IterableLike.foreach(IterableLike.scala:70) at scala.collection.IterableLike.foreach$(IterableLike.scala:69) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.delegation.BatchPlanner.translateToPlan(BatchPlanner.scala:92) at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:197) at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1803) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:945) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1422) at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeOperation(OperationExecutor.java:437) at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:200) at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212) at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119) at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:258) ... 7 more (state=,code=0) 0: jdbc:hive2://192.168.88.24:10000/zhiyong_f>
After a reboot, bring everything back up (a consolidated sketch of these steps follows the command list below):
cd /export/server/hadoop-3.3.5
./sbin/start-dfs.sh
cd /export/server/apache-hive-3.1.3-bin/bin
hive --service metastore > /dev/null 2>&1 &
cd /export/server/flink-1.17.0/bin
./start-cluster.sh
cd /export/server/flink-1.17.0/bin
./sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2
beeline
!connect jdbc:hive2://192.168.88.24:10000/zhiyong_flink_db;auth=noSasl
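These restart steps can be wrapped into a small helper script; the following is just a sketch (a hypothetical start-all.sh) assuming the same install paths as above, with the metastore backgrounded via nohup the way the rest of this article does:

#!/usr/bin/env bash
# start-all.sh -- hypothetical helper that replays the manual restart steps above.
set -e
HADOOP_HOME=/export/server/hadoop-3.3.5
HIVE_HOME=/export/server/apache-hive-3.1.3-bin
FLINK_HOME=/export/server/flink-1.17.0

"$HADOOP_HOME/sbin/start-dfs.sh"                                    # HDFS
nohup "$HIVE_HOME/bin/hive" --service metastore > /dev/null 2>&1 &  # Hive Metastore (thrift, port 9083)
sleep 10                                                            # give the metastore a moment to come up
"$FLINK_HOME/bin/start-cluster.sh"                                  # standalone Flink cluster
"$FLINK_HOME/bin/sql-gateway.sh" start -Dsql-gateway.endpoint.type=hiveserver2  # SQL Gateway, HiveServer2 endpoint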
This time another error shows up:
2023-05-20 18:55:34,127 INFO org.apache.flink.table.catalog.hive.HiveCatalog [] - Created HiveCatalog 'hive' 2023-05-20 18:55:34,248 INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient [] - Trying to connect to metastore with URI thrift://192.168.88.24:9083 2023-05-20 18:55:34,273 INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient [] - Opened a connection to metastore, current connections: 1 2023-05-20 18:55:34,332 INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient [] - Connected to metastore. 2023-05-20 18:55:34,333 INFO org.apache.hadoop.hive.metastore.RetryingMetaStoreClient [] - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.metastore.HiveMetaStoreClient ugi=root (auth:SIMPLE) retries=1 delay=1 lifetime=0 2023-05-20 18:55:34,516 INFO org.apache.flink.table.catalog.hive.HiveCatalog [] - Connected to Hive metastore 2023-05-20 18:55:34,691 INFO org.apache.flink.table.module.ModuleManager [] - Loaded module 'hive' from class org.apache.flink.table.module.hive.HiveModule 2023-05-20 18:55:34,714 INFO org.apache.flink.table.gateway.service.session.SessionManagerImpl [] - Session 5d1a631c-94da-4d6c-ab46-096e03fe8e5c is opened, and the number of current sessions is 1. 2023-05-20 18:55:35,025 ERROR org.apache.flink.table.endpoint.hive.HiveServer2Endpoint [] - Failed to GetInfo. java.lang.UnsupportedOperationException: Unrecognized TGetInfoType value: CLI_ODBC_KEYWORDS. at org.apache.flink.table.endpoint.hive.HiveServer2Endpoint.GetInfo(HiveServer2Endpoint.java:379) [flink-connector-hive_2.12-1.17.0.jar:1.17.0] at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo.getResult(TCLIService.java:1537) [hive-exec-3.1.3.jar:3.1.3] at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo.getResult(TCLIService.java:1522) [hive-exec-3.1.3.jar:3.1.3] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.3.jar:3.1.3] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) [hive-exec-3.1.3.jar:3.1.3] at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3.jar:3.1.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202] Hive Session ID = f22ad5ef-3225-4ce8-ab52-9d143d2d8aba 2023-05-20 18:55:43,658 INFO SessionState [] - Hive Session ID = f22ad5ef-3225-4ce8-ab52-9d143d2d8aba 2023-05-20 18:55:43,731 ERROR org.apache.flink.table.gateway.service.operation.OperationManager [] - Failed to execute the operation 55b109a8-0e30-4e44-8334-4a5cc8609ff5. 
java.lang.ExceptionInInitializerError: null at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202] at org.apache.flink.table.catalog.hive.client.HiveShimV120.registerTemporaryFunction(HiveShimV120.java:262) ~[flink-connector-hive_2.12-1.17.0.jar:1.17.0] at org.apache.flink.table.planner.delegation.hive.HiveParser.parse(HiveParser.java:212) ~[flink-connector-hive_2.12-1.17.0.jar:1.17.0] at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:191) ~[flink-sql-gateway-1.17.0.jar:1.17.0] at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212) ~[flink-sql-gateway-1.17.0.jar:1.17.0] at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119) ~[flink-sql-gateway-1.17.0.jar:1.17.0] at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:258) ~[flink-sql-gateway-1.17.0.jar:1.17.0] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_202] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_202] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_202] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_202] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202] Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:85) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:203) ~[hive-exec-3.1.3.jar:3.1.3] ... 17 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_202] at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_202] at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_202] at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_202] at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:83) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:203) ~[hive-exec-3.1.3.jar:3.1.3] ... 
17 more Caused by: java.lang.NoClassDefFoundError: org/apache/commons/codec/language/Soundex at org.apache.hadoop.hive.ql.udf.generic.GenericUDFSoundex.<init>(GenericUDFSoundex.java:49) ~[hive-exec-3.1.3.jar:3.1.3] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_202] at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_202] at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_202] at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_202] at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:83) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:203) ~[hive-exec-3.1.3.jar:3.1.3] ... 17 more Caused by: java.lang.ClassNotFoundException: org.apache.commons.codec.language.Soundex at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[?:1.8.0_202] at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_202] at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_202] at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_202] at org.apache.hadoop.hive.ql.udf.generic.GenericUDFSoundex.<init>(GenericUDFSoundex.java:49) ~[hive-exec-3.1.3.jar:3.1.3] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_202] at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_202] at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_202] at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_202] at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:83) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) ~[hive-exec-3.1.3.jar:3.1.3] at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:203) ~[hive-exec-3.1.3.jar:3.1.3] ... 17 more
But this org.apache.commons.codec.language.Soundex class...
Go to https://archive.apache.org/dist/commons/codec, grab a jar, and drop it into Flink's lib directory:
root@zhiyong-hive-on-flink1:~# cp /home/zhiyong/commons-codec-1.15.jar /export/server/flink-1.17.0/lib/
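Before restarting anything, it may be worth confirming that the missing class really is inside the jar that was just copied; a quick check (jar name exactly as copied above):

unzip -l /export/server/flink-1.17.0/lib/commons-codec-1.15.jar | grep 'org/apache/commons/codec/language/Soundex.class'
# Expected: a single matching entry if this is the right jar.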
After restarting the Flink SQL Gateway, it still fails:
root@zhiyong-hive-on-flink1:/export/server/apache-hive-3.1.3-bin/conf# beeline SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/export/server/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/export/server/hadoop-3.3.5/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Beeline version 3.1.3 by Apache Hive beeline> !connect jdbc:hive2://192.168.88.24:10000/default;auth=noSasl Connecting to jdbc:hive2://192.168.88.24:10000/default;auth=noSasl Enter username for jdbc:hive2://192.168.88.24:10000/default: Enter password for jdbc:hive2://192.168.88.24:10000/default: Connected to: Apache Flink (version 1.17) Driver: Hive JDBC (version 3.1.3) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://192.168.88.24:10000/default> show databases; +-------------------+ | database name | +-------------------+ | default | | zhiyong_flink_db | +-------------------+ 2 rows selected (3.057 seconds) 0: jdbc:hive2://192.168.88.24:10000/default> use zhiyong_flink_db; +---------+ | result | +---------+ | OK | +---------+ 1 row selected (0.048 seconds) 0: jdbc:hive2://192.168.88.24:10000/default> show tables; +-------------+ | table name | +-------------+ | test1 | +-------------+ 1 row selected (0.046 seconds) 0: jdbc:hive2://192.168.88.24:10000/default> select * from test1; Error: org.apache.flink.table.gateway.service.utils.SqlExecutionException: Failed to execute the operation 8baaefed-0ed7-4080-94a7-ea908c8a4898. at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.processThrowable(OperationManager.java:414) at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:267) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.fs.FsTracer.get(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/Tracer; at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:323) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:308) at org.apache.hadoop.hdfs.DistributedFileSystem.initDFSClient(DistributedFileSystem.java:202) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) at org.apache.flink.connectors.hive.HiveSourceFileEnumerator.getNumFiles(HiveSourceFileEnumerator.java:195) at org.apache.flink.connectors.hive.HiveTableSource.lambda$getDataStream$0(HiveTableSource.java:174) at 
org.apache.flink.connectors.hive.HiveParallelismInference.logRunningTime(HiveParallelismInference.java:107) at org.apache.flink.connectors.hive.HiveParallelismInference.infer(HiveParallelismInference.java:89) at org.apache.flink.connectors.hive.HiveTableSource.getDataStream(HiveTableSource.java:172) at org.apache.flink.connectors.hive.HiveTableSource$1.produceDataStream(HiveTableSource.java:138) at org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecTableSourceScan.translateToPlanInternal(CommonExecTableSourceScan.java:140) at org.apache.flink.table.planner.plan.nodes.exec.batch.BatchExecTableSourceScan.translateToPlanInternal(BatchExecTableSourceScan.java:101) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:161) at org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:257) at org.apache.flink.table.planner.plan.nodes.exec.batch.BatchExecSink.translateToPlanInternal(BatchExecSink.java:65) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:161) at org.apache.flink.table.planner.delegation.BatchPlanner.$anonfun$translateToPlan$1(BatchPlanner.scala:93) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) at scala.collection.Iterator.foreach(Iterator.scala:937) at scala.collection.Iterator.foreach$(Iterator.scala:937) at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) at scala.collection.IterableLike.foreach(IterableLike.scala:70) at scala.collection.IterableLike.foreach$(IterableLike.scala:69) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike.map(TraversableLike.scala:233) at scala.collection.TraversableLike.map$(TraversableLike.scala:226) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.delegation.BatchPlanner.translateToPlan(BatchPlanner.scala:92) at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:197) at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1803) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:945) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1422) at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeOperation(OperationExecutor.java:437) at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:200) at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212) at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:119) at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:258) ... 7 more (state=,code=0) 0: jdbc:hive2://192.168.88.24:10000/default>
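The NoSuchMethodError on org.apache.hadoop.fs.FsTracer.get(...) returning org.apache.hadoop.tracing.Tracer is the kind of signature mismatch that typically shows up when two Hadoop versions are mixed on one classpath; note that the lib/ listing above still contains both hadoop-common-3.1.1.jar and hadoop-common-3.3.5.jar. A crude heuristic (a sketch, assuming the same lib/ path) to spot artifacts present in more than one version:

cd /export/server/flink-1.17.0/lib
# Strip a trailing "-<version>.jar" and report base names that appear more than once.
ls *.jar | sed -E 's/-[0-9][0-9.]*\.jar$//' | sort | uniq -d

On this box it should flag hadoop-common, which matches the 3.1.1/3.3.5 mix visible in the earlier listing.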
Possibly the problem is caused by the package name being different, so try the old htrace jar:
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# cp /home/zhiyong/htrace-core-3.2.0-incubating.jar /export/server/flink-1.17.0/lib
With this jar in place, the Flink SQL Gateway fails to start altogether; clearly the major-version-4 jar (htrace-core4) is the one that should be used.
After switching to the 3.3.5 Hadoop dependencies, the error becomes:
2023-05-21 15:23:10,132 INFO org.apache.flink.table.gateway.service.session.SessionManagerImpl [] - SessionManager is stopped. Exception in thread "main" org.apache.flink.table.gateway.api.utils.SqlGatewayException: Failed to start the endpoints. at org.apache.flink.table.gateway.SqlGateway.start(SqlGateway.java:76) at org.apache.flink.table.gateway.SqlGateway.startSqlGateway(SqlGateway.java:123) at org.apache.flink.table.gateway.SqlGateway.main(SqlGateway.java:95) Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/thirdparty/com/google/common/collect/Interners at org.apache.hadoop.util.StringInterner.<clinit>(StringInterner.java:40) at org.apache.hadoop.conf.Configuration$Parser.handleEndElement(Configuration.java:3335) at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3417) at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3191) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3084) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3045) at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2923) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2905) at org.apache.hadoop.conf.Configuration.get(Configuration.java:1247) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1301) at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1527) at org.apache.hadoop.fs.FileSystem$Cache.<init>(FileSystem.java:3615) at org.apache.hadoop.fs.FileSystem.<clinit>(FileSystem.java:206) at org.apache.hadoop.hive.conf.valcoersion.JavaIOTmpdirVariableCoercion.<clinit>(JavaIOTmpdirVariableCoercion.java:37) at org.apache.hadoop.hive.conf.SystemVariables.<clinit>(SystemVariables.java:37) at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<init>(HiveConf.java:4492) at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<init>(HiveConf.java:4452) at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:428) at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:150) at org.apache.flink.table.catalog.hive.HiveCatalog.createHiveConf(HiveCatalog.java:258) at org.apache.flink.table.endpoint.hive.HiveServer2EndpointFactory.createSqlGatewayEndpoint(HiveServer2EndpointFactory.java:71) at org.apache.flink.table.gateway.api.endpoint.SqlGatewayEndpointFactoryUtils.createSqlGatewayEndpoint(SqlGatewayEndpointFactoryUtils.java:71) at org.apache.flink.table.gateway.SqlGateway.start(SqlGateway.java:69) ... 2 more Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.thirdparty.com.google.common.collect.Interners at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 25 more Shutting down the Flink SqlGateway... 2023-05-21 15:23:10,136 INFO org.apache.flink.table.gateway.SqlGateway [] - Shutting down the Flink SqlGateway... 2023-05-21 15:23:10,144 INFO org.apache.flink.table.gateway.service.session.SessionManagerImpl [] - SessionManager is stopped. 2023-05-21 15:23:10,145 INFO org.apache.flink.table.gateway.SqlGateway [] - Flink SqlGateway has been shutdown. Flink SqlGateway has been shutdown.
So:
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# cp /home/zhiyong/google-collect-1.0.jar /export/server/flink-1.17.0/lib/
But that class is still missing, and inspecting the jar confirms it really is not there!!! When there is no other way out, read the source code!!!
The jar that is actually needed should be this one.
root@zhiyong-hive-on-flink1:/export/server/flink-1.17.0/lib# cp /home/zhiyong/hadoop-shaded-guava-1.1.1.jar /export/server/flink-1.17.0/lib/
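Again it may be worth checking, before restarting the gateway, that the relocated class really lives in this jar (name exactly as copied above):

unzip -l /export/server/flink-1.17.0/lib/hadoop-shaded-guava-1.1.1.jar | grep 'org/apache/hadoop/thirdparty/com/google/common/collect/Interners.class'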
After that, yet another error:
2023-05-21 16:13:01,242 ERROR org.apache.flink.table.gateway.service.operation.OperationManager [] - Failed to execute the operation 12efdf00-b71f-4a38-8c0d-f59405d6aae1. java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.ipc.ProtobufRpcEngine2 at java.lang.Class.forName0(Native Method) ~[?:1.8.0_202] at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_202] at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2630) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2595) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2691) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.ipc.RPC.getProtocolEngine(RPC.java:224) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:712) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithAlignmentContext(NameNodeProxiesClient.java:365) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.hdfs.NameNodeProxiesClient.createNonHAProxyWithClientProtocol(NameNodeProxiesClient.java:343) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:135) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:374) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:308) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.hdfs.DistributedFileSystem.initDFSClient(DistributedFileSystem.java:202) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:187) ~[hadoop-hdfs-client-3.3.5.jar:?] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3572) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3673) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3624) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:557) ~[hadoop-common-3.3.5.jar:?] at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365) ~[hadoop-common-3.3.5.jar:?]
This method (in Hadoop's Configuration class):
/**
 * Load a class by name, returning null rather than throwing an exception
 * if it couldn't be loaded. This is to avoid the overhead of creating
 * an exception.
 *
 * @param name the class name
 * @return the class object, or null if it could not be found.
 */
public Class<?> getClassByNameOrNull(String name) {
  Map<String, WeakReference<Class<?>>> map;

  synchronized (CACHE_CLASSES) {
    map = CACHE_CLASSES.get(classLoader);
    if (map == null) {
      map = Collections.synchronizedMap(
        new WeakHashMap<String, WeakReference<Class<?>>>());
      CACHE_CLASSES.put(classLoader, map);
    }
  }

  Class<?> clazz = null;
  WeakReference<Class<?>> ref = map.get(name);
  if (ref != null) {
    clazz = ref.get();
  }

  if (clazz == null) {
    try {
      clazz = Class.forName(name, true, classLoader);
    } catch (ClassNotFoundException e) {
      // Leave a marker that the class isn't found
      map.put(name, new WeakReference<Class<?>>(NEGATIVE_CACHE_SENTINEL));
      return null;
    }
    // two putters can race here, but they'll put the same class
    map.put(name, new WeakReference<Class<?>>(clazz));
    return clazz;
  } else if (clazz == NEGATIVE_CACHE_SENTINEL) {
    return null; // not found
  } else {
    // cache hit
    return clazz;
  }
}
Clearly the failure happens while loading a class by name via reflection... yet the hadoop-common jar definitely contains this class, so a JDK problem cannot be ruled out:
root@zhiyong-hive-on-flink1:/home/zhiyong# java -version
java version "1.8.0_371"
Java(TM) SE Runtime Environment (build 1.8.0_371-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.371-b11, mixed mode)
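In JVM terms, "Could not initialize class" usually means the class was found but its static initializer had already failed once. So before blaming the JDK, it may help to confirm which jar under Flink's lib/ actually provides org.apache.hadoop.ipc.ProtobufRpcEngine2, and whether its own dependencies are there too (for Hadoop 3.3 this class relies on the relocated protobuf in hadoop-shaded-protobuf_3_7, which is likely the piece still missing here). A small scan sketch over the same lib/ path:

cd /export/server/flink-1.17.0/lib
CLASS='org/apache/hadoop/ipc/ProtobufRpcEngine2.class'
# Print every jar in lib/ that contains the class.
for jar in *.jar; do
  unzip -l "$jar" 2>/dev/null | grep -q "$CLASS" && echo "$CLASS found in $jar"
done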
root@zhiyong-hive-on-flink1:/home/zhiyong# /export/server/hadoop-3.3.5/sbin/start-dfs.sh WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER. Starting namenodes on [192.168.88.24] Starting datanodes Starting secondary namenodes [192.168.88.24] root@zhiyong-hive-on-flink1:/home/zhiyong# /export/server/hadoop-3.3.5/bin/hadoop fs -mkdir -p /user/hive/warehouse /export/server/hadoop-3.3.5/bin/hadoop fs -chmod g+w /tmp /export/server/hadoop-3.3.5/bin/hadoop fs -chmod g+w /user/hive/warehouse root@zhiyong-hive-on-flink1:/home/zhiyong# /export/server/hadoop-3.3.5/bin/hadoop fs -mkdir -p /tmp root@zhiyong-hive-on-flink1:/home/zhiyong# /export/server/hadoop-3.3.5/bin/hadoop fs -chmod g+w /tmp root@zhiyong-hive-on-flink1:/home/zhiyong# /export/server/hadoop-3.3.5/bin/hadoop fs -chmod g+w /user/hive/warehouse root@zhiyong-hive-on-flink1:/home/zhiyong# nohup hive --service metastore >/dev/null 2>&1 & [1] 5464 root@zhiyong-hive-on-flink1:/home/zhiyong# jps 4833 DataNode 4689 NameNode 5572 Jps 5464 RunJar 5051 SecondaryNameNode root@zhiyong-hive-on-flink1:/home/zhiyong# hive SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/export/server/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/export/server/hadoop-3.3.5/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Hive Session ID = 0acad22e-a1f2-4471-bbed-a6addc8ec877 Logging initialized using configuration in jar:file:/export/server/apache-hive-3.1.3-bin/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Hive Session ID = e709ed02-00dd-46fe-9678-a502c804acfb hive> show databases; OK default zhiyong_flink_db Time taken: 0.876 seconds, Fetched: 2 row(s) hive> use zhiyong_flink_db; OK Time taken: 0.087 seconds hive> show tables; OK test1 Time taken: 0.092 seconds, Fetched: 1 row(s) hive> select * from test1; OK Time taken: 3.775 seconds hive> insert into test1 values(1,'col1'); Query ID = root_20230521175148_b80cf8c6-be8e-4283-8881-3cb78e38d228 Total jobs = 3 Launching Job 1 out of 3 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Job running in-process (local Hadoop) 2023-05-21 17:51:52,222 Stage-1 map = 0%, reduce = 0% 2023-05-21 17:51:54,512 Stage-1 map = 100%, reduce = 100% Ended Job = job_local732219861_0001 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. 
Moving data to directory hdfs://192.168.88.24:9000/user/hive/warehouse/zhiyong_flink_db.db/test1/.hive-staging_hive_2023-05-21_17-51-48_816_2854283787924120213-1/-ext-10000 Loading data to table zhiyong_flink_db.test1 MapReduce Jobs Launched: Stage-Stage-1: HDFS Read: 0 HDFS Write: 170 SUCCESS Total MapReduce CPU Time Spent: 0 msec OK Time taken: 8.418 seconds hive> select * from test1; OK 1 col1 Time taken: 0.205 seconds, Fetched: 1 row(s)
At this point Hive itself works fine.
Using:
nohup hive --service metastore >/dev/null 2>&1 &
nohup hive --service hiveserver2 >/dev/null 2>&1 &
is basically enough for the SQL Boys to practice Hive's SQL syntax...
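A quick sanity check is to confirm both services are listening on their default ports (9083 for the metastore, 10000 for HiveServer2, as seen in the logs above); note that a native HiveServer2 on 10000 would collide with the Flink SQL Gateway's HiveServer2 endpoint if both were started with defaults on the same box. A sketch:

# Check that the Hive Metastore (9083) and HiveServer2 (10000) are listening.
ss -lntp | grep -E ':(9083|10000)\b'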
root@zhiyong-hive-on-flink1:/export/server# vim /etc/profile
root@zhiyong-hive-on-flink1:/export/server# source /etc/profile
root@zhiyong-hive-on-flink1:/export/server# /export/server/flink-1.17.0/bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host zhiyong-hive-on-flink1.
Starting taskexecutor daemon on host zhiyong-hive-on-flink1.
root@zhiyong-hive-on-flink1:/export/server# /export/server/flink-1.17.0/bin/sql-gateway.sh start-foreground
But it still blows up with the same exception, so clearly this is not a JDK problem; it is the classloader failing while reflectively loading a Java class by name. Problems like this basically come down to dependency conflicts, missing jars, and the like. I have run into the JVM loading the wrong jar at runtime before: https://lizhiyong.blog.csdn.net/article/details/124184528
Conflicting versions of the Hadoop dependencies and/or missing jars make DQL execution fail, while DDL that does not touch Hadoop works as long as the Hive Metastore is healthy. The follow-up is therefore to swap the Hadoop version for one that matches Flink 1.17 better, or to dig into whatever other oddity is going on.
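One option worth trying in that follow-up, instead of hand-copying individual Hadoop jars into Flink's lib/, is the approach the Flink documentation generally recommends for Hadoop integration: expose the full Hadoop client classpath via HADOOP_CLASSPATH so that every Hadoop class comes from one consistent 3.3.5 installation. A sketch, assuming the paths used in this article:

# Let Flink pick up Hadoop from the local 3.3.5 installation instead of jars copied into lib/.
export HADOOP_HOME=/export/server/hadoop-3.3.5
export HADOOP_CLASSPATH=$("$HADOOP_HOME/bin/hadoop" classpath)
# Restart the standalone cluster and the SQL Gateway so they inherit the classpath.
/export/server/flink-1.17.0/bin/start-cluster.sh
/export/server/flink-1.17.0/bin/sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2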
Although this round of investigation was not a complete success and only DDL ran correctly, it at least proves the approach is functionally feasible. Once the version conflicts are resolved, Hive tables can be worked with using Hive SQL syntax just as over JDBC; for the SQL Boys there is little difference from Hive on MR [multi-table joins at the tens- or hundreds-of-billions-of-rows scale can only be run that way], Hive on Tez [suited to smaller jobs from tens of millions up to a few billion rows], or Hive on Spark [an approach that has since been abandoned]. Since all of the computation is handed to Flink, how it performs in offline batch scenarios [HQL jobs will not run as streaming] is still unclear, and how it stacks up against Tez and Spark will have to wait for further investigation.
That said, this exercise also shows how serious the dependency-conflict problem is, and baking this many jar dependencies into an image would make it enormous... So even if this approach is adopted, it is better suited to standalone or YARN environments; a K8S environment is still a better fit for stream computing.
The versions picked for this investigation are quite new, so plenty of problems came up. Some were solved [for example, using MySQL 8 as the Hive Metastore database] and some required concessions [for example, downgrading the JDK from 17 to 11 and then to 1.8]. In production we value stability and generally leave whatever runs alone [otherwise breaking it means another 50,000-character self-criticism]; just recently an upgrade from CDP 7.1.5 to CDP 7.1.7, with the Tez and Hive versions unchanged, still broke a large number of HQL scripts, forcing a rollback and manual data backfill, which was hard on the SQL Boys. But when researching or learning, playing with newer versions does no harm: you hit the pits early, and once production eventually catches up you already know how to handle them quickly. Even for the problems that remain unsolved for now [such as the Flink/Hadoop version mismatch], at least you know this particular combination is a poor choice when doing version selection.
The downsides of this all-in-one setup are also obvious, for example having to standardize on a single JDK version: to keep Hive usable, the JDK had to be downgraded to 1.8, while Flink 1.17 really wants JDK 11 for ZGC... which is awkward... Fortunately Flink still manages to run, more or less, on JDK 1.8. For a mixed environment like this, running containerized on Docker or K8S has a clear advantage; then again, containerizing an offline Hive batch environment is not worth much, so splitting things into separate clusters remains the more sensible approach. Having clusters talk to each other through RESTful APIs, and clients and servers through Thrift, to achieve language and version independence is sound engineering.
The version and dependency problems will have to be tracked down later by slowly reading the source code. Version compatibility across Apache open-source components really is a sore subject. CDP and the various SaaS offerings do have their advantages here: far fewer dependency headaches, since the protection money has been paid. As the saying goes, building is worse than buying, and buying is worse than renting...
Please credit the source when reposting: https://lizhiyong.blog.csdn.net/article/details/130799342