Flink 1.8 changed its packaging and jar layout compared to 1.7, which affects users in a few ways. Below are two small pitfalls I hit while running flink-1.8 on HDP YARN.
Download the binary package flink-1.8.1-bin-scala_2.12.tgz from the official site, extract it, and submit a job with the following command:
./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar
The job throws an exception:
Could not identify hostname and port in 'yarn-cluster'
In Flink 1.8, FLINK-11266 removed the bundled Hadoop dependencies from the Flink distribution. As a result, the Hadoop client classes are missing from Flink itself, and the `yarn-cluster` argument cannot be resolved.
Solutions (refer to the official documentation):
1. Export the Hadoop classpath in the current shell:
export HADOOP_CLASSPATH=`hadoop classpath`
2. Or add the same line to bin/config.sh so the Hadoop classpath is picked up automatically:
export HADOOP_CLASSPATH=`hadoop classpath`
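Adding the export line to bin/config.sh can be scripted so it is only appended once. This is a minimal sketch; the function works on any file you point it at, and the real target would be `$FLINK_HOME/bin/config.sh`:

```shell
#!/bin/sh
# Idempotently append the HADOOP_CLASSPATH export to a config.sh file.
add_hadoop_classpath() {
    config_sh="$1"
    line='export HADOOP_CLASSPATH=`hadoop classpath`'
    # grep -F matches the fixed string, so the backticks are not interpreted.
    if ! grep -qF "$line" "$config_sh" 2>/dev/null; then
        printf '%s\n' "$line" >> "$config_sh"
    fi
}

# Demo against a scratch file; for real use, pass "$FLINK_HOME/bin/config.sh".
demo=$(mktemp)
add_hadoop_classpath "$demo"
add_hadoop_classpath "$demo"          # second call is a no-op
grep -c 'HADOOP_CLASSPATH' "$demo"    # prints 1
rm -f "$demo"
```

The guard matters because config.sh is sourced by every Flink launcher script, and appending the line on each deploy would otherwise accumulate duplicates.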
3. Download the pre-bundled shaded Hadoop jar that matches your Hadoop version from the Flink download page; I used flink-shaded-hadoop-2-uber-2.8.3-7.0.jar. Copy it into the lib directory of your Flink installation. My lib directory then contains these jars:
[root@sltuenym6bh lib]# ll
total 128280
-rw-r--r-- 1 502 games 87382386 Jun 25 16:00 flink-dist_2.12-1.8.1.jar
-rw-r--r-- 1 root root 43467085 Aug 25 02:06 flink-shaded-hadoop-2-uber-2.8.3-7.0.jar
-rw-r--r-- 1 502 games 489884 May 30 10:26 log4j-1.2.17.jar
-rw-r--r-- 1 502 games 9931 May 30 10:26 slf4j-log4j12-1.7.15.jar
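A quick sanity check before submitting again is to confirm that exactly one shaded Hadoop jar is present in lib: zero means YARN support is still missing, and more than one risks classpath conflicts. A small sketch (the demo uses a scratch directory; pass your real lib path in practice):

```shell
#!/bin/sh
# Count the flink-shaded-hadoop jars in a Flink lib directory.
count_shaded_hadoop() {
    count=0
    for jar in "$1"/flink-shaded-hadoop-2-uber-*.jar; do
        # The glob stays literal when nothing matches, so test for existence.
        [ -e "$jar" ] && count=$((count + 1))
    done
    echo "$count"
}

# Demo with a scratch directory; point at your Flink lib directory for real use.
libdir=$(mktemp -d)
touch "$libdir/flink-shaded-hadoop-2-uber-2.8.3-7.0.jar"
count_shaded_hadoop "$libdir"   # prints 1
rm -rf "$libdir"
```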
4. Then run the job again:
# ./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar
The output looks like this:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/tanghb2/flink/flink-1.8.1/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-08-25 02:20:52,898 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-08-25 02:20:52,898 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-08-25 02:20:53,023 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=1, slotsPerTaskManager=1}
2019-08-25 02:20:53,425 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration directory ('/home/tanghb2/flink/flink-1.8.1/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2019-08-25 02:20:54,486 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1565019835797_0515
2019-08-25 02:20:54,712 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1565019835797_0515
2019-08-25 02:20:54,713 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2019-08-25 02:20:54,714 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2019-08-25 02:20:58,751 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
(a,5)
(action,1)
(after,1)
(against,1)
(all,2)
Done!
The second pitfall is one I did not hit myself: submitting a job via the third method above throws the following exception:
java.lang.NoClassDefFoundError: com/sun/jersey/core/util/FeaturesAndProperties
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:55)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createTimelineClient(YarnClientImpl.java:181)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:168)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.getClusterDescriptor(FlinkYarnSessionCli.java:1012)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.createDescriptor(FlinkYarnSessionCli.java:274)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.createClusterDescriptor(FlinkYarnSessionCli.java:454)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.createClusterDescriptor(FlinkYarnSessionCli.java:97)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:224)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: java.lang.ClassNotFoundException: com.sun.jersey.core.util.FeaturesAndProperties
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 29 more
When the YARN client submits a job, it calls createTimelineClient, which throws this exception because the required jersey jars are missing. If HADOOP_CLASSPATH was exported beforehand, HDP's own Hadoop distribution supplies those jars and the submission runs normally.
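One rough way to check in advance whether the jersey classes would be available is to look for jersey jars by name on the exported classpath. This is only a sketch based on jar file names, which is an assumption (the class could live in a differently named jar, and `hadoop classpath` may also emit wildcard entries rather than individual jars):

```shell
#!/bin/sh
# Scan a colon-separated classpath for a jersey-core jar, which is where
# com.sun.jersey.core.util.FeaturesAndProperties normally lives.
has_jersey() {
    # Split on ':' and match a jersey-core jar by file name.
    echo "$1" | tr ':' '\n' | grep -q 'jersey-core.*\.jar'
}

# Illustrative paths only; in practice you would pass "$(hadoop classpath)".
if has_jersey "/usr/hdp/current/hadoop-client/lib/jersey-core-1.9.jar:/other.jar"; then
    echo "jersey found"
fi
```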
Workarounds:
1. In yarn-site.xml under HADOOP_CONF_DIR, set yarn.timeline-service.enabled to false.
2. Create a separate hadoop_conf directory for Flink, copy the configuration files from the original HADOOP_CONF_DIR into it, and set env.hadoop.conf.dir in flink-conf.yaml to point at the new hadoop_conf directory. The line added to flink-conf.yaml looks like:
env.hadoop.conf.dir: /etc/hadoop/conf
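The second workaround can be sketched as a small script. The demo below runs against a scratch directory; the real arguments (your HADOOP_CONF_DIR, the new hadoop_conf location, and the flink-conf.yaml path) depend on your installation:

```shell
#!/bin/sh
# Sketch of workaround 2: give Flink a private Hadoop conf directory and
# point flink-conf.yaml at it via env.hadoop.conf.dir, so yarn-site.xml
# can be edited without touching the cluster-wide configuration.
setup_private_conf() {
    src_conf="$1"     # existing HADOOP_CONF_DIR
    new_conf="$2"     # private copy for Flink
    flink_yaml="$3"   # path to flink-conf.yaml
    mkdir -p "$new_conf"
    cp "$src_conf"/*.xml "$new_conf"/ 2>/dev/null
    echo "env.hadoop.conf.dir: $new_conf" >> "$flink_yaml"
}

# Demo in a scratch directory; real paths would be e.g.
#   setup_private_conf /etc/hadoop/conf "$FLINK_HOME/hadoop_conf" "$FLINK_HOME/conf/flink-conf.yaml"
work=$(mktemp -d)
mkdir -p "$work/src"
touch "$work/src/yarn-site.xml"
setup_private_conf "$work/src" "$work/hadoop_conf" "$work/flink-conf.yaml"
grep 'env.hadoop.conf.dir' "$work/flink-conf.yaml"
rm -rf "$work"
```

After copying, you would edit yarn-site.xml in the private directory (e.g. set yarn.timeline-service.enabled to false there) so the cluster's shared configuration stays untouched.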