

flink-1.12.7 Cluster Deployment

1. Download

Official download: Downloads | Apache Flink

Alibaba Cloud Drive download (includes the dependency jars): Aliyun drive share

Extraction code: 9bl2

2. Extract

tar -zxvf flink-1.12.7-bin-scala_2.11.tgz -C /opt/module

3. Modify the configuration files

cd flink-1.12.7/conf/

Modify the flink-conf.yaml file

Modify the masters file

Create the workers file

3.1 Modify the flink-conf.yaml file

Adjust the values below to match your own cluster.

Manually create the HDFS directories referenced in the file first; a sketch follows.
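For example, assuming the NameNode address 192.168.233.130:8020 used in the configuration below (adjust to your own cluster), the referenced directories can be created ahead of time with hdfs dfs; a minimal sketch:

# Create the HDFS directories referenced in flink-conf.yaml below
hdfs dfs -mkdir -p hdfs://192.168.233.130:8020/flink/ha
hdfs dfs -mkdir -p hdfs://192.168.233.130:8020/flink-checkpoints
hdfs dfs -mkdir -p hdfs://192.168.233.130:8020/completed-jobs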

################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

#==============================================================================
# Common
#==============================================================================

# The external address of the host on which the JobManager runs and can be
# reached by the TaskManagers and any clients which want to connect. This setting
# is only used in Standalone mode and may be overwritten on the JobManager side
# by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
# In high availability mode, if you use the bin/start-cluster.sh script and setup
# the conf/masters file, this will be taken care of automatically. Yarn/Mesos
# automatically configure the host name based on the hostname of the node where the
# JobManager runs.
jobmanager.rpc.address: hadoop102

# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123

# The heap size for the JobManager JVM
jobmanager.heap.size: 800m

# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
taskmanager.memory.process.size: 1024m

# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
#
#taskmanager.memory.flink.size: 1280m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 1

# The parallelism used for programs that did not specify and other parallelism.
parallelism.default: 1

# The default file system scheme and authority.
#
# By default file paths without scheme are interpreted relative to the local
# root file system 'file:///'. Use this to override the default and interpret
# relative paths relative to a different file system,
# for example 'hdfs://mynamenode:12345'
#
# fs.default-scheme

#==============================================================================
# High Availability
#==============================================================================

# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
high-availability: zookeeper

# The path where metadata for master recovery is persisted. While ZooKeeper stores
# the small ground truth for checkpoint and leader election, this location stores
# the larger objects, like persisted dataflow graphs.
#
# Must be a durable file system that is accessible from all nodes
# (like HDFS, S3, Ceph, nfs, ...)
#
high-availability.storageDir: hdfs://192.168.233.130:8020/flink/ha/

# The list of ZooKeeper quorum peers that coordinate the high-availability
# setup. This must be a list of the form:
# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#
high-availability.zookeeper.quorum: 192.168.233.130:2181,192.168.233.131:2181,192.168.233.132:2181

# ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
# It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
# The default value is "open" and it can be changed to "creator" if ZK security is enabled
#
high-availability.zookeeper.client.acl: open

#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================

# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled.
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
state.backend: filesystem

# Directory for checkpoints filesystem, when using any of the default bundled
# state backends.
#
state.checkpoints.dir: hdfs://192.168.233.130:8020/flink-checkpoints

# Default target directory for savepoints, optional.
#
state.savepoints.dir: hdfs://192.168.233.130:8020/flink-checkpoints

# Flag to enable/disable incremental checkpoints for backends that
# support incremental checkpoints (like the RocksDB state backend).
#
# state.backend.incremental: false

# The failover strategy, i.e., how the job computation recovers from task failures.
# Only restart tasks that may have been affected by the task failure, which typically includes
# downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
jobmanager.execution.failover-strategy: region

#==============================================================================
# Rest & web frontend
#==============================================================================

# The port to which the REST client connects to. If rest.bind-port has
# not been specified, then the server will bind to this port as well.
#
rest.port: 8081

# The address to which the REST client will connect to
#
rest.address: 0.0.0.0

# Port range for the REST and web server to bind to.
#
rest.bind-port: 8080-8090

# The address that the REST & web server binds to
#
rest.bind-address: 0.0.0.0

# Flag to specify whether job submission is enabled from the web-based
# runtime monitor. Uncomment to disable.
web.submit.enable: true

#==============================================================================
# Advanced
#==============================================================================

# Override the directories for temporary files. If not specified, the
# system-specific Java temporary directory (java.io.tmpdir property) is taken.
#
# For framework setups on Yarn or Mesos, Flink will automatically pick up the
# containers' temp directories without any need for configuration.
#
# Add a delimited list for multiple directories, using the system directory
# delimiter (colon ':' on unix) or a comma, e.g.:
# /data1/tmp:/data2/tmp:/data3/tmp
#
# Note: Each directory entry is read from and written to by a different I/O
# thread. You can include the same directory multiple times in order to create
# multiple I/O threads against that directory. This is for example relevant for
# high-throughput RAIDs.
#
# io.tmp.dirs: /tmp

# The classloading resolve order. Possible values are 'child-first' (Flink's default)
# and 'parent-first' (Java's default).
#
# Child first classloading allows users to use different dependency/library
# versions in their application than those in the classpath. Switching back
# to 'parent-first' may help with debugging dependency issues.
#
# classloader.resolve-order: child-first

# The amount of memory going to the network stack. These numbers usually need
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
#
# taskmanager.memory.network.fraction: 0.1
#taskmanager.memory.network.min: 128mb
#taskmanager.memory.network.max: 1gb

#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================

# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL

# The below configure how Kerberos credentials are provided. A keytab will be used instead of
# a ticket cache if the keytab path and principal are set.

# security.kerberos.login.use-ticket-cache: true
# security.kerberos.login.keytab: /path/to/kerberos/keytab
# security.kerberos.login.principal: flink-user

# The configuration below defines which JAAS login contexts
# security.kerberos.login.contexts: Client,KafkaClient

#==============================================================================
# ZK Security Configuration
#==============================================================================

# Below configurations are applicable if ZK ensemble is configured for security

# Override below configuration to provide custom ZK service name if configured
# zookeeper.sasl.service-name: zookeeper

# The configuration below must match one of the values set in "security.kerberos.login.contexts"
# zookeeper.sasl.login-context-name: Client

#==============================================================================
# HistoryServer
#==============================================================================

# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)

# Directory to upload completed jobs to. Add this directory to the list of
# monitored directories of the HistoryServer as well (see below).
jobmanager.archive.fs.dir: hdfs://192.168.233.130:8020/completed-jobs/

# The address under which the web-based HistoryServer listens.
#historyserver.web.address: 0.0.0.0

# The port under which the web-based HistoryServer listens.
historyserver.web.port: 8082

# Comma separated list of directories to monitor for completed jobs.
historyserver.archive.fs.dir: hdfs://192.168.233.130:8020/completed-jobs/

# Interval in milliseconds for refreshing the monitored directories.
historyserver.archive.fs.refresh-interval: 10000
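With taskmanager.numberOfTaskSlots: 1 and the three workers configured in section 3.3 below, the cluster exposes three task slots in total, so a single job can run with a parallelism of at most 3. After editing, a trivial way to double-check which settings are active (a grep sketch, run from the flink-1.12.7 directory):

# Print only the active (non-comment, non-blank) lines of the config
grep -vE '^\s*(#|$)' conf/flink-conf.yaml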

3.2 Modify masters

192.168.233.130:8081
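Since high-availability: zookeeper is enabled in flink-conf.yaml, additional standby JobManagers may also be listed here, one host:webui-port per line; a hypothetical two-master sketch (the second entry is an assumption, not part of this setup):

192.168.233.130:8081
192.168.233.131:8081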

3.3 Create workers

List the IP addresses or hostnames of all machines in the cluster:

192.168.233.130
192.168.233.131
192.168.233.132

4. Distribute to the other cluster nodes

# Run from /opt/module on the first node
scp -r flink-1.12.7 hadoop103:/opt/module
scp -r flink-1.12.7 hadoop104:/opt/module

5. Configure environment variables

vim /etc/profile.d/my_env.sh

#FLINK_HOME
export FLINK_HOME=/opt/module/flink-1.12.7
export PATH=$FLINK_HOME/bin:$PATH

6. Distribute the environment variables

scp /etc/profile.d/my_env.sh hadoop103:/etc/profile.d/
scp /etc/profile.d/my_env.sh hadoop104:/etc/profile.d/

7. Source the environment variables

On each of the three cluster nodes:

source /etc/profile.d/my_env.sh

8. Add dependency jars

You can try skipping this step and starting the cluster directly with bin/start-cluster.sh, but you will most likely hit a series of errors, since the configuration above relies on HDFS.

Search https://mvnrepository.com/ for commons-cli-1.4.jar and flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar, download them, and place them in Flink's lib directory on every node (see the sketch below).

Downloading from Maven can be slow; the jars are also included in the Aliyun drive share above.
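A minimal sketch of putting the jars in place, assuming they were downloaded to the current directory (filenames as above, hosts as in this cluster; adjust as needed):

# Put the jars on Flink's classpath on this node
cp commons-cli-1.4.jar $FLINK_HOME/lib/
cp flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar $FLINK_HOME/lib/
# Copy them to the other nodes as well
scp $FLINK_HOME/lib/commons-cli-1.4.jar hadoop103:$FLINK_HOME/lib/
scp $FLINK_HOME/lib/flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar hadoop103:$FLINK_HOME/lib/
# ...and likewise for hadoop104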

9. Start the Flink cluster

bin/start-cluster.sh

Starting standalonesession daemon on host hadoop102.
Starting taskexecutor daemon on host hadoop102.
Starting taskexecutor daemon on host hadoop103.
Starting taskexecutor daemon on host hadoop104.
 

If you want to be extra sure, run jps on each machine to check that the corresponding processes are present; a sketch of the expected output follows.
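Roughly what to expect from jps (process names for a standalone session cluster; a sketch based on this three-node layout):

# On hadoop102 (runs the JobManager and a TaskManager)
jps
#   StandaloneSessionClusterEntrypoint
#   TaskManagerRunner
# On hadoop103 and hadoop104 (TaskManager only)
jps
#   TaskManagerRunner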

10. View the Flink Web UI

http://192.168.233.130:8081/
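The cluster status can also be checked from the command line through Flink's REST API; a quick sketch using the /overview endpoint:

# Returns JSON with taskmanagers, slots-total, jobs-running, etc.
curl http://192.168.233.130:8081/overview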

11. Test

bin/flink run examples/streaming/WordCount.jar

Log output:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink-1.12.7/lib/chunjun/chunjun-dist/connector/iceberg/chunjun-connector-iceberg.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/flink-1.12.7/lib/chunjun-dist/connector/iceberg/chunjun-connector-iceberg.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/flink-1.12.7/lib/chunjun/lib/log4j-slf4j-impl-2.16.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/flink-1.12.7/lib/log4j-slf4j-impl-2.16.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/flink-1.12.7/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Job has been submitted with JobID c537bf865e69549f725bb167bfa64c18
Program execution finished
Job with JobID c537bf865e69549f725bb167bfa64c18 has finished.
Job Runtime: 2678 ms

Success!
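Note that the WordCount results are printed to stdout on the TaskManager, not on the client console, so they land in the taskexecutor .out file. A sketch of viewing them (the exact filename varies by user and host):

# On the node whose TaskManager ran the task, from the flink-1.12.7 directory
tail log/flink-*-taskexecutor-*.out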
