Flink / FLINK-35833

ArtifactFetchManager always creates artifact dir

Details

    Description

      FLINK-28915 added support for fetching remote job jars (HTTPS, S3, etc.) but broke the default behavior for local jars when an application runs on a non-writable filesystem. ArtifactFetchManager always attempts to create an artifact directory, even when the jar uses the "local" protocol.

      Running an application on a non-writable filesystem is a common scenario in environments where the jar is published as part of the Docker container image.

      A local jar does not need to be fetched to an intermediate directory, since it is already available on the local filesystem. The LocalArtifactFetcher does not write to the filesystem. However, the ArtifactFetchManager always attempts to create a directory before fetching, regardless of which fetcher will do the work. On a non-writable filesystem, or in an environment where the process lacks the required permissions, the outcome is a runtime exception:

      java.lang.RuntimeException: org.apache.flink.util.FlinkRuntimeException: Failed to create parent(s) for given base dir: /opt/flink/artifacts/<namespace>/<job name>
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:158) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgramRetriever(KubernetesApplicationClusterEntrypoint.java:129) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgram(KubernetesApplicationClusterEntrypoint.java:111) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.lambda$main$0(KubernetesApplicationClusterEntrypoint.java:85) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.main(KubernetesApplicationClusterEntrypoint.java:85) [flink-dist-1.19.1.jar:1.19.1]
      Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to create parent(s) for given base dir: /opt/flink/artifacts/app07772/sample-app-flink-1-19
          at org.apache.flink.client.program.artifact.ArtifactUtils.createMissingParents(ArtifactUtils.java:50) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:123) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156) ~[flink-dist-1.19.1.jar:1.19.1]
          ... 5 more
      Caused by: java.io.IOException: Cannot create directory '/opt/flink/artifacts/<namespace>'.
          at org.apache.commons.io.FileUtils.mkdirs(FileUtils.java:2289) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.commons.io.FileUtils.forceMkdir(FileUtils.java:1376) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.commons.io.FileUtils.forceMkdirParent(FileUtils.java:1394) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.client.program.artifact.ArtifactUtils.createMissingParents(ArtifactUtils.java:46) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:123) ~[flink-dist-1.19.1.jar:1.19.1]
          at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156) ~[flink-dist-1.19.1.jar:1.19.1]
          ... 5 more

      A workaround is to always configure a base directory in which the process is allowed to create directories, e.g., user.artifacts.base-dir: /tmp/foo.

      A proposed solution is to let each fetcher decide whether to create the intermediate directory or fail.
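
The proposal could look roughly like the sketch below. It is only an illustration of the idea, not Flink's actual code: `FetcherSketch`, `Fetcher`, `LocalFetcher`, `RemoteFetcher`, and `fetch` are hypothetical names, and the download step is omitted. The point is that the directory is created inside the fetcher that needs it, so a "local" URI never touches the filesystem.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch; these names do not mirror Flink's actual classes. */
public class FetcherSketch {

    /** Each fetcher receives the target directory and decides whether it needs it. */
    interface Fetcher {
        File fetch(URI uri, Path targetDir);
    }

    /** A local artifact is already on disk, so the target directory is never created. */
    static class LocalFetcher implements Fetcher {
        @Override
        public File fetch(URI uri, Path targetDir) {
            return new File(uri.getPath());
        }
    }

    /** A remote fetcher needs somewhere to write, so it creates the directory itself. */
    static class RemoteFetcher implements Fetcher {
        @Override
        public File fetch(URI uri, Path targetDir) {
            try {
                // Fails here, and only here, if the filesystem is not writable.
                Files.createDirectories(targetDir);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            Path target = targetDir.resolve(Path.of(uri.getPath()).getFileName());
            // The actual download into 'target' is omitted from this sketch.
            return target.toFile();
        }
    }

    /** The manager only picks a fetcher; it no longer creates directories up front. */
    static File fetch(URI uri, Path targetDir) {
        Fetcher fetcher =
                "local".equals(uri.getScheme()) ? new LocalFetcher() : new RemoteFetcher();
        return fetcher.fetch(uri, targetDir);
    }
}
```

With this shape, calling `FetcherSketch.fetch` with a `local:///...` URI succeeds even when the target directory cannot be created, while a remote URI still fails early with a clear error.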

            People

              fcsaky Ferenc Csaky
              dylanmei Dylan Meissner