Flink / FLINK-9441

Hadoop Required Dependency List Not Clear


Details

    Description

      To use Apache Flink with S3, a few libraries from the Hadoop distribution must be added to the lib/ folder.

      This list is partially documented at https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/aws.html (section "Provide S3 FileSystem Dependency"), but it refers only to Hadoop 2.7, whereas Flink also supports Hadoop 2.8.

      How should the dependency list be composed for Hadoop 2.8?

      Is it possible to bundle the dependencies in a separate archive that users can download?

       

      UPDATE:

      I downloaded Apache Flink 1.4.2 for Hadoop 2.7, and it seems it was compiled against Hadoop 2.8. I get the following error:

       

      java.lang.NumberFormatException: For input string: "100M"
          at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
          at java.lang.Long.parseLong(Long.java:589)
          at java.lang.Long.parseLong(Long.java:631)
          at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1319)
          at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:248)
          at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2811)
          at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
          ...

      https://stackoverflow.com/questions/48149929/hive-1-2-metastore-service-doesnt-start-after-configuring-it-to-s3-storage-inst?rq=1
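The trace shows Configuration.getLong handing the string "100M" to Long.parseLong, which only accepts plain digits. A minimal, Hadoop-free sketch of that failure follows. The parseSize helper is hypothetical and only illustrates how a suffix-aware reader would treat the same value. That the value comes from configuration defaults of a newer Hadoop release being read by older S3A code is an assumption, not something the trace proves.

```java
public class SizeSuffixDemo {

    // Hypothetical helper (not a Hadoop API): parses sizes such as "100M" into bytes.
    static long parseSize(String value) {
        String v = value.trim();
        if (v.isEmpty()) {
            throw new NumberFormatException("empty size");
        }
        char last = Character.toUpperCase(v.charAt(v.length() - 1));
        long multiplier;
        switch (last) {
            case 'K': multiplier = 1L << 10; break;
            case 'M': multiplier = 1L << 20; break;
            case 'G': multiplier = 1L << 30; break;
            default: return Long.parseLong(v); // no suffix: plain number of bytes
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }

    // Returns true when Long.parseLong rejects the value, as in the stack trace.
    static boolean plainParseFails(String value) {
        try {
            Long.parseLong(value);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(plainParseFails("100M")); // true
        System.out.println(parseSize("100M"));       // 104857600
    }
}
```

If the assumption holds, the error is a symptom of mixed Hadoop versions on the classpath rather than a malformed user setting.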

       

      So I cannot start the cluster.
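One way to check for mixed Hadoop versions among the jars added to lib/ is to compare the version numbers in their file names. The sketch below is a hypothetical diagnostic, not part of Flink or Hadoop. It assumes jars are named like hadoop-aws-2.7.3.jar, and it only inspects file names, so it cannot see Hadoop classes bundled inside other jars.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HadoopJarVersions {

    // Assumed naming scheme: hadoop-<module>-<major>.<minor>[.<rest>].jar
    private static final Pattern HADOOP_JAR =
            Pattern.compile("^hadoop-[a-z-]+-(\\d+\\.\\d+)[\\w.-]*\\.jar$");

    // Collects the distinct major.minor versions found in hadoop-* jar names.
    static SortedSet<String> hadoopMinorVersions(List<String> jarNames) {
        SortedSet<String> versions = new TreeSet<>();
        for (String name : jarNames) {
            Matcher m = HADOOP_JAR.matcher(name);
            if (m.matches()) {
                versions.add(m.group(1));
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        // Directory to scan; defaults to ./lib when no argument is given.
        File libDir = new File(args.length > 0 ? args[0] : "lib");
        String[] names = libDir.list();
        List<String> jarNames =
                names == null ? Collections.<String>emptyList() : Arrays.asList(names);
        SortedSet<String> versions = hadoopMinorVersions(jarNames);
        System.out.println("Hadoop versions found in jar names: " + versions);
        if (versions.size() > 1) {
            System.out.println("Warning: more than one Hadoop version is present.");
        }
    }
}
```

Running it with the path of the Flink lib/ folder as the only argument prints the versions found and warns when more than one is present.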


            People

              Assignee: Unassigned
              Reporter: Razvan