
CDH 6.3.2 + Self-Built Flink 1.12.7, Step 1: Building the Parcel


The whole process takes two steps:

        Step 1: build the Flink 1.12.7 parcel for CDH (this article)

        Step 2: integrate Flink 1.12.7 into CDH

Preliminary notes

The Flink 1.12 build previously integrated into CDH 6.3.2 ships a Log4j version with a known vulnerability, so it needs to be rebuilt.

The Log4j bundled with those earlier Flink 1.12 builds (2.12.x) falls inside the affected range (Apache Log4j 2.x <= 2.15.0-rc1).

The Flink project published emergency releases that patch the Log4j vulnerability for its maintained branches, among them Flink 1.12.7, as well as 1.11.6, 1.13.5 and 1.14.2 (see the official announcement).

Download the latest Flink 1.12.7 source, open pom.xml and confirm that the Log4j version is 2.16 so you know you have the right release; then start the rebuild.
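
A quick way to do that check from the shell, sketched here on the assumption that the source tarball is unpacked under /root/flink-1.12.7 (the path is illustrative; the property sits in the root pom.xml):

    # Confirm which Log4j version the Flink source declares
    grep -m1 '<log4j.version>' /root/flink-1.12.7/pom.xml
    # Expected output for the patched release:
    #   <log4j.version>2.16.0</log4j.version>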

Environment

  1. A virtual machine running CentOS 7 (7.9 in this walkthrough)
  2. JDK 1.8
  3. Maven 3.6.3
  4. Parcel build tools: flink-parcel.zip
  5. flink-1.12.7-bin-scala_2.12.tgz

1. Prepare the build environment

Install JDK 1.8, Maven 3.6.3 and httpd (a minimal httpd install sketch follows the listing):

    [root@cdhuser ~]# cat /etc/redhat-release
    CentOS Linux release 7.9.2009 (Core)
    # Install JDK
    [root@cdhuser ~]# java -version
    java version "1.8.0_221"
    Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)
    # Install Maven
    [root@cdhuser src]# mvn -version
    Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
    Maven home: /usr/local/src/apache-maven-3.6.3
    Java version: 1.8.0_221, vendor: Oracle Corporation, runtime: /usr/local/src/jdk1.8.0_221/jre
    Default locale: zh_CN, platform encoding: UTF-8
    OS name: "linux", version: "3.10.0-1062.el7.x86_64", arch: "amd64", family: "unix"
    # Install httpd
    [root@cdhuser html]# systemctl status httpd
    ● httpd.service - The Apache HTTP Server
       Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
       Active: active (running) since Thu 2022-06-09 19:22:24 CST; 1h 1min ago
         Docs: man:httpd(8)
               man:apachectl(8)
      Process: 6634 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
     Main PID: 1318 (httpd)
       Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
        Tasks: 6
       Memory: 5.8M
       CGroup: /system.slice/httpd.service
               ├─1318 /usr/sbin/httpd -DFOREGROUND
               ├─6635 /usr/sbin/httpd -DFOREGROUND
               ├─6636 /usr/sbin/httpd -DFOREGROUND
               ├─6637 /usr/sbin/httpd -DFOREGROUND
               ├─6638 /usr/sbin/httpd -DFOREGROUND
               └─6639 /usr/sbin/httpd -DFOREGROUND
    Jun 09 19:22:24 cdhuser systemd[1]: Starting The Apache HTTP Server...
    Jun 09 19:22:24 cdhuser httpd[1318]: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.137.31. Set the 'ServerName' directive globally to suppress this message
    Jun 09 19:22:24 cdhuser systemd[1]: Started The Apache HTTP Server.
    Jun 09 20:16:01 cdhuser systemd[1]: Reloading The Apache HTTP Server.
    Jun 09 20:16:01 cdhuser httpd[6634]: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.137.31. Set the 'ServerName' directive globally to suppress this message
    Jun 09 20:16:01 cdhuser systemd[1]: Reloaded The Apache HTTP Server.
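
The listing above only verifies the environment. If httpd is not installed yet, a minimal setup on CentOS 7 looks roughly like this (a sketch; the ServerName value is the example host used throughout this article):

    # Install and start Apache httpd, which will serve the local parcel repository
    yum install -y httpd
    systemctl enable --now httpd
    # Optional: set ServerName to silence the AH00558 warning shown above
    echo "ServerName cdhuser" >> /etc/httpd/conf/httpd.conf
    systemctl reload httpd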

2. Download Flink 1.12.7 and the build tools

Flink 1.12.7 download:

https://dlcdn.apache.org/flink/flink-1.12.7/flink-1.12.7-bin-scala_2.12.tgz

CDH parcel build tools:

https://github.com/cloudera/cm_ext # tools for building Cloudera Manager extension packages

https://github.com/pkeropen/flink-parcel # tool for building the Flink parcel

flink-shaded source download:

https://archive.apache.org/dist/flink/flink-shaded-10.0/flink-shaded-10.0-src.tgz

I have packaged the build tools as well; feel free to grab them:

Link: Baidu Netdisk, extraction code: up8n
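
For reference, fetching everything from the command line might look like this (a sketch; note that older Flink releases eventually move from dlcdn.apache.org to archive.apache.org):

    # Download the Flink binary release and the flink-shaded source
    wget https://dlcdn.apache.org/flink/flink-1.12.7/flink-1.12.7-bin-scala_2.12.tgz
    wget https://archive.apache.org/dist/flink/flink-shaded-10.0/flink-shaded-10.0-src.tgz
    # Clone the parcel build tools
    git clone https://github.com/cloudera/cm_ext.git
    git clone https://github.com/pkeropen/flink-parcel.git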

3. Build the parcel

Create a local httpd repository and move the Flink 1.12.7 tarball into it; the build script downloads the tarball from here later.

    mkdir /var/www/html/flink-1.12.7 &&
    mv /root/flink-1.12.7-bin-scala_2.12.tgz /var/www/html/flink-1.12.7 &&
    curl http://cdhuser/flink-1.12.7/
    # Result of the curl:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
    <html>
    <head>
    <title>Index of /flink-1.12.7</title>
    </head>
    <body>
    <h1>Index of /flink-1.12.7</h1>
    <table>
    <tr><th valign="top"><img src="/icons/blank.gif" alt="[ICO]"></th><th><a href="?C=N;O=D">Name</a></th><th><a href="?C=M;O=A">Last modified</a></th><th><a href="?C=S;O=A">Size</a></th><th><a href="?C=D;O=A">Description</a></th></tr>
    <tr><th colspan="5"><hr></th></tr>
    <tr><td valign="top"><img src="/icons/back.gif" alt="[PARENTDIR]"></td><td><a href="/">Parent Directory</a> </td><td>&nbsp;</td><td align="right"> - </td><td>&nbsp;</td></tr>
    <tr><td valign="top"><img src="/icons/compressed.gif" alt="[ ]"></td><td><a href="flink-1.12.7-bin-scala_2.12.tgz">flink-1.12.7-bin-sca..&gt;</a></td><td align="right">2022-06-09 20:09 </td><td align="right">309M</td><td>&nbsp;</td></tr>
    <tr><th colspan="5"><hr></th></tr>
    </table>
    </body></html>

Adjust the tool configuration

    tar -xvf /root/flink-parcel-master.tar &&
    tar -xvf /root/cm_ext.tar &&
    cp -r /root/cm_ext /root/flink-parcel-master &&
    cd /root/flink-parcel-master &&
    mv flink-parcel.properties flink-parcel.properties.bak &&
    (cat <<EOF
    # Flink download URL: point at the local httpd repository
    FLINK_URL=http://cdhuser/flink-1.12.7/flink-1.12.7-bin-scala_2.12.tgz
    # Flink version
    FLINK_VERSION=1.12.7
    # Extension version
    EXTENS_VERSION=BIN-SCALA_2.12
    # OS version (CentOS here)
    OS_VERSION=7
    # CDH minor version bounds
    CDH_MIN_FULL=5.2
    CDH_MAX_FULL=6.3.3
    # CDH major version bounds
    CDH_MIN=5
    CDH_MAX=6
    EOF
    ) >> flink-parcel.properties &&
    chmod u+x build.sh && # make the build script executable
    sed -i "s/https:/git:/g" build.sh && # switch the clone URLs inside build.sh from https:// to git://
    ./build.sh parcel && # generates the FLINK-1.12.7-BIN-SCALA_2.12_build directory
    ./build.sh csd_on_yarn && # generates FLINK_ON_YARN-1.12.7.jar
    tar -cvf ./flink-1.12.7-bin-scala_2.12.tar ./FLINK-1.12.7-BIN-SCALA_2.12_build/ # package FLINK-1.12.7-BIN-SCALA_2.12_build/

The resulting flink-1.12.7-bin-scala_2.12.tar and FLINK_ON_YARN-1.12.7.jar are the target artifacts.
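
Optionally, you can verify that the parcel matches the hash the build wrote next to it (a sketch assuming the build directory shown above; the .parcel.sha file holds a SHA-1 hash):

    cd /root/flink-parcel-master/FLINK-1.12.7-BIN-SCALA_2.12_build
    sha1sum FLINK-1.12.7-BIN-SCALA_2.12-el7.parcel
    cat FLINK-1.12.7-BIN-SCALA_2.12-el7.parcel.sha   # the two hashes should match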

4. Build flink-shaded

If Flink later fails to start with a missing Hadoop dependency (see the error list at the end), build the following jar:

    tar -zxvf /root/flink-shaded-10.0-src.tgz &&
    cd /root/flink-shaded-10.0 &&
    vim pom.xml
    # Add the following profile inside the <profiles> section:
    <profile>
      <id>vendor-repos</id>
      <activation>
        <property>
          <name>vendor-repos</name>
        </property>
      </activation>
      <!-- Add vendor maven repositories -->
      <repositories>
        <!-- Cloudera -->
        <repository>
          <id>cloudera-releases</id>
          <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>false</enabled>
          </snapshots>
        </repository>
        <!-- Hortonworks -->
        <repository>
          <id>HDPReleases</id>
          <name>HDP Releases</name>
          <url>https://repo.hortonworks.com/content/repositories/releases/</url>
          <snapshots><enabled>false</enabled></snapshots>
          <releases><enabled>true</enabled></releases>
        </repository>
        <repository>
          <id>HortonworksJettyHadoop</id>
          <name>HDP Jetty</name>
          <url>https://repo.hortonworks.com/content/repositories/jetty-hadoop</url>
          <snapshots><enabled>false</enabled></snapshots>
          <releases><enabled>true</enabled></releases>
        </repository>
        <!-- MapR -->
        <repository>
          <id>mapr-releases</id>
          <url>https://repository.mapr.com/maven/</url>
          <snapshots><enabled>false</enabled></snapshots>
          <releases><enabled>true</enabled></releases>
        </repository>
      </repositories>
    </profile>
Run the build from the flink-shaded-10.0 directory:

    # Start the build
    mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=3.0.0-cdh6.3.2 -Dscala-2.12 -Drat.skip=true -T10C
    # On success, the target jar is written to:
    /root/flink-shaded-10.0/flink-shaded-hadoop-2-parent/flink-shaded-hadoop-2-uber/target/flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar

Be patient: the build may fail several times (typically on dependency downloads); just re-run the same command until it succeeds.
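
If you prefer not to re-run it by hand, a small retry loop does the same thing (a sketch; the mvn flags are identical to the command above):

    # Retry the flink-shaded build up to 10 times, stopping at the first success
    for i in $(seq 1 10); do
      mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=3.0.0-cdh6.3.2 -Dscala-2.12 -Drat.skip=true -T10C && break
      echo "Build attempt $i failed, retrying..."
    done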

5. Build complete

Copy the final target files out of the build machine (a copy sketch follows the listing below).

I have also uploaded the compiled target files; feel free to grab them:

Link: Baidu Netdisk, extraction code: cct5

    [root@cdhuser cdh_parcel]# ll
    total 374648
    -rw-r--r-- 1 root root 324198400 Jun 10 09:37 flink-1.12.7-bin-scala_2.12.tar
    -rw-r--r-- 1 root root      8259 Jun 10 09:37 FLINK_ON_YARN-1.12.7.jar
    -rw-r--r-- 1 root root  59427729 Jun 10 09:36 flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar
    # Contents of flink-1.12.7-bin-scala_2.12.tar:
    total 316600
    -rw-r--r-- 1 root root 324189375 Jun  9 22:10 FLINK-1.12.7-BIN-SCALA_2.12-el7.parcel
    -rw-r--r-- 1 root root        41 Jun  9 22:10 FLINK-1.12.7-BIN-SCALA_2.12-el7.parcel.sha
    -rw-r--r-- 1 root root       583 Jun  9 22:10 manifest.json
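
To move these to the host where Cloudera Manager runs, plain scp is enough (a sketch; the destination hostname is a placeholder):

    # Copy the parcel tarball, the CSD jar and the shaded Hadoop jar to the Cloudera Manager host
    scp flink-1.12.7-bin-scala_2.12.tar FLINK_ON_YARN-1.12.7.jar \
        flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar root@cm-host:/root/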

Error list

1. Missing Hadoop dependency

If the following error appears at startup, the Hadoop dependency is missing:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/exceptions/YarnException
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
        at java.lang.Class.getMethod0(Class.java:3018)
        at java.lang.Class.getMethod(Class.java:1784)
        at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
        at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.exceptions.YarnException
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 7 more

Upload flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar to every node where Flink on YARN is deployed, placing it in the /opt/cloudera/parcels/FLINK/lib/flink/lib directory.
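
A minimal way to push it out, assuming passwordless SSH to the nodes (hostnames are placeholders):

    # Distribute the shaded Hadoop uber jar to every Flink-on-YARN node
    for host in node01 node02 node03; do
      scp flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar \
          root@${host}:/opt/cloudera/parcels/FLINK/lib/flink/lib/
    done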

