当前位置:   article > 正文

Flink源码解析(从CliFrontend提交开始)-第一期_flink 源码

flink 源码

序言

经过一段时间对于flink学习且画了一些源码流程相关的图,决定开一个flink源码分析的专栏,该专栏以flink on yarn的 Per-job模式为基础,基于flink1.12.0,以官方SocketWindowWordCount例子来分析一个任务从提交到运行的流程源码分析。那么话不多,直接开始吧。

首先我们正常情况下,在该模式下的提交flink任务的脚本入下:

flink run -t yarn-per-job  -c org.apache.flink.streaming.examples.socket.SocketWindowWordCount  examples/streaming/SocketWindowWordCount.jar --port 9231

因为该命令肯定是在bin目录下执行的,所以我们直接去找bin目录下的flink文件。

我们发现入下内容:

exec $JAVA_RUN $JVM_ARGS $FLINK_ENV_JAVA_OPTS "${log_setting[@]}" -classpath "`manglePathList "$CC_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" org.apache.flink.client.cli.CliFrontend "$@"

也就是说提交作业的入口是CliFrontend,其运行方式和我们当时用cmd学习运行的第一个HelloWorld的java程序一样,就是通过java 类名来启动一个jvm进程, 所以在源码中找到该类的main函数开始分析:

1.CliFrontend#main

ps:上面大标题这种写法表示:类名#函数名,后面都是同理哦(想要快捷到这里,在idea中,先用Ctrl+N输入类名到达指定类,然后用Ctrl+F12输入函数方法名即可到达指定方法),然后我分析每个函数主要是分析重点(这里的重点是指点进去后内部的有相当重要的信息,而不是只是看个函数名就知全意了)不会分析每一步。

  1. public static void main(final String[] args) {
  2. EnvironmentInformation.logEnvironmentInfo(LOG, "Command Line Client", args);
  3. // 1. find the configuration directory
  4. final String configurationDirectory = getConfigurationDirectoryFromEnv();
  5. // 2. load the global configuration
  6. final Configuration configuration = GlobalConfiguration.loadConfiguration(configurationDirectory);
  7. // 3. load the custom command lines
  8. final List<CustomCommandLine> customCommandLines = loadCustomCommandLines(
  9. configuration,
  10. configurationDirectory);
  11. try {
  12. final CliFrontend cli = new CliFrontend(
  13. configuration,
  14. customCommandLines);
  15. SecurityUtils.install(new SecurityConfiguration(cli.configuration));
  16. int retCode = SecurityUtils.getInstalledContext()
  17. .runSecured(() -> cli.parseAndRun(args));
  18. System.exit(retCode);
  19. }
  20. catch (Throwable t) {
  21. final Throwable strippedThrowable = ExceptionUtils.stripException(t, UndeclaredThrowableException.class);
  22. LOG.error("Fatal error while running command line interface.", strippedThrowable);
  23. strippedThrowable.printStackTrace();
  24. System.exit(31);
  25. }
  26. }

2.CliFrontend#parseAndRun->CliFrontend#run

  1. protected void run(String[] args) throws Exception {
  2. LOG.info("Running 'run' command.");
  3. final Options commandOptions = CliFrontendParser.getRunCommandOptions();
  4. final CommandLine commandLine = getCommandLine(commandOptions, args, true);
  5. // evaluate help flag
  6. if (commandLine.hasOption(HELP_OPTION.getOpt())) {
  7. CliFrontendParser.printHelpForRun(customCommandLines);
  8. return;
  9. }
  10. final CustomCommandLine activeCommandLine =
  11. validateAndGetActiveCommandLine(checkNotNull(commandLine));
  12. final ProgramOptions programOptions = ProgramOptions.create(commandLine);
  13. final List<URL> jobJars = getJobJarAndDependencies(programOptions);
  14. final Configuration effectiveConfiguration = getEffectiveConfiguration(
  15. activeCommandLine, commandLine, programOptions, jobJars);
  16. LOG.debug("Effective executor configuration: {}", effectiveConfiguration);
  17. final PackagedProgram program = getPackagedProgram(programOptions, effectiveConfiguration);
  18. try {
  19. executeProgram(effectiveConfiguration, program);
  20. } finally {
  21. program.deleteExtractedLibraries();
  22. }
  23. }

3.CliFrontend#validateAndGetActiveCommandLine

其中验证的方式是通过isActive函数来判断,点击isActive函数,发现其实一个接口声明的方法,需要找到其实现类。

分别如下:

因为之前说过已经依次添加了GenericCLI>FlinkYarnSessionCLI>DefaultCLI三种客户端,那么for就会让它们依次调用isActive进行判断,根据上面的代码分析逻辑如下:

分析完validateAndGetActiveCommandLine后,继续看下面的executeProgram

4.CliFrontend#executeProgram->ClientUtils#executeProgram

  1. public static void executeProgram(
  2. PipelineExecutorServiceLoader executorServiceLoader,
  3. Configuration configuration,
  4. PackagedProgram program,
  5. boolean enforceSingleJobExecution,
  6. boolean suppressSysout) throws ProgramInvocationException {
  7. checkNotNull(executorServiceLoader);
  8. final ClassLoader userCodeClassLoader = program.getUserCodeClassLoader();
  9. final ClassLoader contextClassLoader = Thread.currentThread().getContextClassLoader();
  10. try {
  11. Thread.currentThread().setContextClassLoader(userCodeClassLoader);
  12. LOG.info("Starting program (detached: {})", !configuration.getBoolean(DeploymentOptions.ATTACHED));
  13. ContextEnvironment.setAsContext(
  14. executorServiceLoader,
  15. configuration,
  16. userCodeClassLoader,
  17. enforceSingleJobExecution,
  18. suppressSysout);
  19. StreamContextEnvironment.setAsContext(
  20. executorServiceLoader,
  21. configuration,
  22. userCodeClassLoader,
  23. enforceSingleJobExecution,
  24. suppressSysout);
  25. try {
  26. program.invokeInteractiveModeForExecution();
  27. } finally {
  28. ContextEnvironment.unsetAsContext();
  29. StreamContextEnvironment.unsetAsContext();
  30. }
  31. } finally {
  32. Thread.currentThread().setContextClassLoader(contextClassLoader);
  33. }
  34. }

 

 5.PackagedProgram#invokeInteractiveModeForExecution->PackagedProgram#callMainMethod

  1. private static void callMainMethod(Class<?> entryClass, String[] args) throws ProgramInvocationException {
  2. Method mainMethod;
  3. if (!Modifier.isPublic(entryClass.getModifiers())) {
  4. throw new ProgramInvocationException("The class " + entryClass.getName() + " must be public.");
  5. }
  6. try {
  7. mainMethod = entryClass.getMethod("main", String[].class);
  8. } catch (NoSuchMethodException e) {
  9. throw new ProgramInvocationException("The class " + entryClass.getName() + " has no main(String[]) method.");
  10. } catch (Throwable t) {
  11. throw new ProgramInvocationException("Could not look up the main(String[]) method from the class " +
  12. entryClass.getName() + ": " + t.getMessage(), t);
  13. }
  14. if (!Modifier.isStatic(mainMethod.getModifiers())) {
  15. throw new ProgramInvocationException("The class " + entryClass.getName() + " declares a non-static main method.");
  16. }
  17. if (!Modifier.isPublic(mainMethod.getModifiers())) {
  18. throw new ProgramInvocationException("The class " + entryClass.getName() + " declares a non-public main method.");
  19. }
  20. try {
  21. mainMethod.invoke(null, (Object) args);
  22. } catch (IllegalArgumentException e) {
  23. throw new ProgramInvocationException("Could not invoke the main method, arguments are not matching.", e);
  24. } catch (IllegalAccessException e) {
  25. throw new ProgramInvocationException("Access to the main method was denied: " + e.getMessage(), e);
  26. } catch (InvocationTargetException e) {
  27. Throwable exceptionInMethod = e.getTargetException();
  28. if (exceptionInMethod instanceof Error) {
  29. throw (Error) exceptionInMethod;
  30. } else if (exceptionInMethod instanceof ProgramParametrizationException) {
  31. throw (ProgramParametrizationException) exceptionInMethod;
  32. } else if (exceptionInMethod instanceof ProgramInvocationException) {
  33. throw (ProgramInvocationException) exceptionInMethod;
  34. } else {
  35. throw new ProgramInvocationException("The main method caused an error: " + exceptionInMethod.getMessage(), exceptionInMethod);
  36. }
  37. } catch (Throwable t) {
  38. throw new ProgramInvocationException("An error occurred while invoking the program's main method: " + t.getMessage(), t);
  39. }
  40. }

这个方法就是首先判断类是否是public,然后获取其中的main方法,判断main方法是否static、是否public,最后开始执行用户的main方法,因为我们之前说过以官方的SocketWindowWordCount为例子来分析,所以假设用户即我们自己编写了这段SocketWindowWordCount代码,要去flink上执行,这里就是执行这段代码的main方法了

6.SocketWindowWordCount#main

  1. public static void main(String[] args) throws Exception {
  2. // the host and the port to connect to
  3. final String hostname;
  4. final int port;
  5. try {
  6. final ParameterTool params = ParameterTool.fromArgs(args);
  7. hostname = params.has("hostname") ? params.get("hostname") : "localhost";
  8. port = params.getInt("port");
  9. } catch (Exception e) {
  10. System.err.println("No port specified. Please run 'SocketWindowWordCount " +
  11. "--hostname <hostname> --port <port>', where hostname (localhost by default) " +
  12. "and port is the address of the text server");
  13. System.err.println("To start a simple text server, run 'netcat -l <port>' and " +
  14. "type the input text into the command line");
  15. return;
  16. }
  17. // get the execution environment
  18. final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  19. // get input data by connecting to the socket
  20. DataStream<String> text = env.socketTextStream(hostname, port, "\n");
  21. // parse the data, group it, window it, and aggregate the counts
  22. DataStream<WordWithCount> windowCounts = text
  23. .flatMap(new FlatMapFunction<String, WordWithCount>() {
  24. @Override
  25. public void flatMap(String value, Collector<WordWithCount> out) {
  26. for (String word : value.split("\\s")) {
  27. out.collect(new WordWithCount(word, 1L));
  28. }
  29. }
  30. })
  31. .keyBy(value -> value.word)
  32. .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
  33. .reduce(new ReduceFunction<WordWithCount>() {
  34. @Override
  35. public WordWithCount reduce(WordWithCount a, WordWithCount b) {
  36. return new WordWithCount(a.word, a.count + b.count);
  37. }
  38. });
  39. // print the results with a single thread, rather than in parallel
  40. windowCounts.print().setParallelism(1);
  41. env.execute("Socket Window WordCount");
  42. }

至于怎么执行用户代码,将算子串联起来形成流图、作业图以及执行图等等,见下一期,听说关注、点赞、收藏有助于催更哦

这一期部分的整体图预览如下:

总览

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/繁依Fanyi0/article/detail/633996
推荐阅读
相关标签
  

闽ICP备14008679号