  1. Flink
  2. FLINK-32817

Support running jar files whose names contain spaces

Details

    Description

      When submitting a Flink jar to a YARN cluster, if the jar file name contains spaces, Flink fails to parse the file path in `YarnLocalResourceDescriptor`, and the following exception is logged by the JobManager.

      The Flink jar file name is: StreamSQLExample 2.jar

      bin/flink run -d -m yarn-cluster -p 1 -c org.apache.flink.table.examples.java.basics.StreamSQLExample StreamSQLExample\ 2.jar 
      2023-08-09 18:54:31,787 WARN  org.apache.flink.runtime.extension.resourcemanager.NeActiveResourceManager [] - Failed requesting worker with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=220.160mb (230854450 bytes), taskOffHeapSize=0 bytes, networkMemSize=158.720mb (166429984 bytes), managedMemSize=952.320mb (998579934 bytes), numSlots=1}, current pending count: 0
      java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Error to parse YarnLocalResourceDescriptor from YarnLocalResourceDescriptor{key=StreamSQLExample 2.jar, path=hdfs://***/.flink/application_1586413220781_33151/StreamSQLExample 2.jar, size=7937, modificationTime=1691578403748, visibility=APPLICATION, type=FILE}
          at org.apache.flink.util.concurrent.FutureUtils.lambda$supplyAsync$21(FutureUtils.java:1052) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) ~[?:1.8.0_152]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_152]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_152]
          at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_152]
      Caused by: org.apache.flink.util.FlinkException: Error to parse YarnLocalResourceDescriptor from YarnLocalResourceDescriptor{key=StreamSQLExample 2.jar, path=hdfs://sloth-jd-pub/user/sloth/.flink/application_1586413220781_33151/StreamSQLExample 2.jar, size=7937, modificationTime=1691578403748, visibility=APPLICATION, type=FILE}
          at org.apache.flink.yarn.YarnLocalResourceDescriptor.fromString(YarnLocalResourceDescriptor.java:112) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          at org.apache.flink.yarn.Utils.decodeYarnLocalResourceDescriptorListFromString(Utils.java:600) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:491) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          at org.apache.flink.yarn.YarnResourceManagerDriver.createTaskExecutorLaunchContext(YarnResourceManagerDriver.java:452) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          at org.apache.flink.yarn.YarnResourceManagerDriver.lambda$startTaskExecutorInContainerAsync$1(YarnResourceManagerDriver.java:383) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          at org.apache.flink.util.concurrent.FutureUtils.lambda$supplyAsync$21(FutureUtils.java:1050) ~[flink-dist_2.12-1.14.0.jar:1.14.0]
          ... 4 more

      As far as I understand, both HDFS and S3 allow file names that contain spaces.

      I think we could replace `LOCAL_RESOURCE_DESC_FORMAT` with:

      private static final Pattern LOCAL_RESOURCE_DESC_FORMAT =
              Pattern.compile(
                      "YarnLocalResourceDescriptor\\{"
                              + "key=([\\S\\x20]+), path=([\\S\\x20]+), size=([\\d]+), modificationTime=([\\d]+), visibility=(\\S+), type=(\\S+)}"); 

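      Adding `\x20` to the character class lets the key and path match the space character only, not other whitespace such as tabs or newlines.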
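
      A minimal sketch of how the proposed pattern behaves on the descriptor string from the log above. The class name `DescriptorPatternSketch`, the helper `extractKey`, and the constant names `STRICT` and `RELAXED` are hypothetical and exist only for this illustration. `STRICT` is an assumption about the shape of the existing pattern (key and path as `\S+`), inferred from the proposed change rather than copied from the Flink source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DescriptorPatternSketch {
    // ASSUMPTION: shape of the existing pattern, where key and path reject all whitespace.
    static final Pattern STRICT = Pattern.compile(
            "YarnLocalResourceDescriptor\\{"
                    + "key=(\\S+), path=(\\S+), size=([\\d]+), modificationTime=([\\d]+), visibility=(\\S+), type=(\\S+)}");

    // Pattern proposed in this issue: key and path additionally accept the space character (\x20).
    static final Pattern RELAXED = Pattern.compile(
            "YarnLocalResourceDescriptor\\{"
                    + "key=([\\S\\x20]+), path=([\\S\\x20]+), size=([\\d]+), modificationTime=([\\d]+), visibility=(\\S+), type=(\\S+)}");

    // Hypothetical helper: returns the captured key, or null if the whole string does not match.
    static String extractKey(Pattern pattern, String descriptor) {
        Matcher m = pattern.matcher(descriptor);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Descriptor string taken from the exception message in this issue.
        String desc = "YarnLocalResourceDescriptor{key=StreamSQLExample 2.jar, "
                + "path=hdfs://***/.flink/application_1586413220781_33151/StreamSQLExample 2.jar, "
                + "size=7937, modificationTime=1691578403748, visibility=APPLICATION, type=FILE}";

        System.out.println("strict:  " + extractKey(STRICT, desc));
        System.out.println("relaxed: " + extractKey(RELAXED, desc));
    }
}
```

      One trade-off to keep in mind: because the relaxed groups also accept commas and spaces, a file name that itself contains a field separator such as `, path=` could be split at the wrong position. That case is probably rare, but it may be worth covering in a unit test.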

            People

              yesorno Xianxun Ye