Flink / FLINK-11086

Add support for Hadoop 3


    Details

    • Release Note:
      Flink now supports Hadoop versions above Hadoop 3.0.0.

      Note that the Flink project does not provide any updated "flink-shaded-hadoop-*" jars. Users need to provide Hadoop dependencies through the HADOOP_CLASSPATH environment variable (recommended) or the lib/ folder. Also, the "include-hadoop" Maven profile has been removed.
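
      For reference, a minimal sketch of the recommended approach, assuming an existing Hadoop installation with the "hadoop" launcher script on the PATH:

      export HADOOP_CLASSPATH=`hadoop classpath`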

      Description

      All builds used Maven 3.2.5 on commit hash ed8ff14ed39d08cd319efe75b40b9742a2ae7558.

      Attempted builds:

      • mvn clean install -Dhadoop.version=3.0.3
      • mvn clean install -Dhadoop.version=3.1.1

      Integration tests with a Hadoop input format data source fail. Example stack trace, taken from the hadoop.version=3.1.1 build:

      testJobCollectionExecution(org.apache.flink.test.hadoopcompatibility.mapred.WordCountMapredITCase)  Time elapsed: 0.275 sec  <<< ERROR!
      java.lang.NoClassDefFoundError: org/apache/flink/hadoop/shaded/com/google/re2j/PatternSyntaxException
              at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
              at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
              at org.apache.hadoop.fs.Globber.doGlob(Globber.java:210)
              at org.apache.hadoop.fs.Globber.glob(Globber.java:149)
              at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2085)
              at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:269)
              at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:239)
              at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
              at org.apache.flink.api.java.hadoop.mapred.HadoopInputFormatBase.createInputSplits(HadoopInputFormatBase.java:150)
              at org.apache.flink.api.java.hadoop.mapred.HadoopInputFormatBase.createInputSplits(HadoopInputFormatBase.java:58)
              at org.apache.flink.api.common.operators.GenericDataSourceBase.executeOnCollections(GenericDataSourceBase.java:225)
              at org.apache.flink.api.common.operators.CollectionExecutor.executeDataSource(CollectionExecutor.java:219)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:155)
              at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
              at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
              at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
              at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:131)
              at org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:182)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:158)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:131)
              at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:115)
              at org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:38)
              at org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:52)
              at org.apache.flink.test.hadoopcompatibility.mapred.WordCountMapredITCase.internalRun(WordCountMapredITCase.java:121)
              at org.apache.flink.test.hadoopcompatibility.mapred.WordCountMapredITCase.testProgram(WordCountMapredITCase.java:71)
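
      For context, a minimal sketch (not the actual ITCase code, and with a hypothetical input path) of the failing code path: reading a file through the flink-hadoop-compatibility mapred API, whose createInputSplits() delegates to org.apache.hadoop.mapred.FileInputFormat.getSplits() and from there to the Globber seen in the trace above:

      import org.apache.flink.api.java.DataSet;
      import org.apache.flink.api.java.ExecutionEnvironment;
      import org.apache.flink.api.java.tuple.Tuple2;
      import org.apache.flink.hadoopcompatibility.HadoopInputs;
      import org.apache.hadoop.io.LongWritable;
      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.mapred.TextInputFormat;

      public class HadoopInputSketch {
          public static void main(String[] args) throws Exception {
              // Collection-based execution, as used by testJobCollectionExecution above.
              ExecutionEnvironment env = ExecutionEnvironment.createCollectionsEnvironment();

              // createInputSplits() on this input format calls
              // FileInputFormat.getSplits(), which globs the input path; on
              // Hadoop 3.x the Globber requires com.google.re2j on the classpath,
              // which the shaded Hadoop jar does not relocate correctly.
              DataSet<Tuple2<LongWritable, Text>> input = env.createInput(
                      HadoopInputs.readHadoopFile(
                              new TextInputFormat(), LongWritable.class, Text.class,
                              "file:///tmp/words.txt")); // hypothetical input path

              input.print();
          }
      }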
      

      Maybe Hadoop 3.x versions could be added to the test matrix as well?

    People

    • Assignee: Robert Metzger (rmetzger)
    • Reporter: Sebastian Klemke (packet)
    • Votes: 5
    • Watchers: 14
