Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-5518

HadoopInputFormat throws NPE when close() is called before open()

    Details

      Description

      When developing a simple Flink applications reading ORC files it crashes with NullPointerException when number of instances/executor threads is higher then the number of files because it is trying to close a HadoopInputFormat which is trying to close RecordReader which was not yet initialized as there is no file for which it should have been opened. The issue is caused when

      public void run(SourceContext<OUT> ctx) throws Exception {
          try {
      ...
              while (isRunning) {
      	    format.open(splitIterator.next());
      ...
          } finally {
      	format.close();
      ...
          }
      

      in file InputFormatSourceFunction.java which calls

      public void close() throws IOException {
          // enforce sequential close() calls
          synchronized (CLOSE_MUTEX) {
              this.recordReader.close();
          }
      }
      

      from HadoopInputFormatBase.java.

      As there is just this one implementation of the close() method it may be enough just to add a null check for the this.recordReader in there.

        Issue Links

          Activity

          Hide
          fhueske Fabian Hueske added a comment -

          Thanks for the bug report Jakub Havlik!
          Let me know if you would like to contribute the fix and I will assign the issue to you.

          Thanks, Fabian

          Show
          fhueske Fabian Hueske added a comment - Thanks for the bug report Jakub Havlik ! Let me know if you would like to contribute the fix and I will assign the issue to you. Thanks, Fabian
          Hide
          havlikj Jakub Havlik added a comment - - edited

          Fabian Hueske Pull request is created: https://github.com/apache/flink/pull/3133/files . Please take a loot at it if it complies to the Flink coding standards.

          Show
          havlikj Jakub Havlik added a comment - - edited Fabian Hueske Pull request is created: https://github.com/apache/flink/pull/3133/files . Please take a loot at it if it complies to the Flink coding standards.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user jakubhavlik opened a pull request:

          https://github.com/apache/flink/pull/3133

          FLINK-5518 [input/output formats] Add null value check.

          Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
          If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
          In addition to going through the list, please provide a meaningful description of your changes.

          • [ ] General
          • The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
          • The pull request addresses only one issue
          • Each commit in the PR has a meaningful commit message (including the JIRA id)
          • [ ] Documentation
          • Documentation has been added for new functionality
          • Old documentation affected by the pull request has been updated
          • JavaDoc for public methods has been added
          • [ ] Tests & Build
          • Functionality added by the pull request is covered by tests
          • `mvn clean verify` has been executed successfully locally or a Travis build has passed

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/jakubhavlik/flink havlikj/flink_5518

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3133.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3133


          commit 984b50fa0d85ab42c47692413d0bad331b363cab
          Author: Jakub Havlik <jakub.havlik@brain-station.com>
          Date: 2017-01-17T07:26:07Z

          FLINK-5518 [input/output formats] Add null value check.


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user jakubhavlik opened a pull request: https://github.com/apache/flink/pull/3133 FLINK-5518 [input/output formats] Add null value check. Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide] ( http://flink.apache.org/how-to-contribute.html ). In addition to going through the list, please provide a meaningful description of your changes. [ ] General The pull request references the related JIRA issue (" [FLINK-XXX] Jira title text") The pull request addresses only one issue Each commit in the PR has a meaningful commit message (including the JIRA id) [ ] Documentation Documentation has been added for new functionality Old documentation affected by the pull request has been updated JavaDoc for public methods has been added [ ] Tests & Build Functionality added by the pull request is covered by tests `mvn clean verify` has been executed successfully locally or a Travis build has passed You can merge this pull request into a Git repository by running: $ git pull https://github.com/jakubhavlik/flink havlikj/flink_5518 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3133.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3133 commit 984b50fa0d85ab42c47692413d0bad331b363cab Author: Jakub Havlik <jakub.havlik@brain-station.com> Date: 2017-01-17T07:26:07Z FLINK-5518 [input/output formats] Add null value check.
          Hide
          fhueske Fabian Hueske added a comment -

          Great, thanks Jakub Havlik!
          I'll have a look soon.

          Show
          fhueske Fabian Hueske added a comment - Great, thanks Jakub Havlik ! I'll have a look soon.
          Hide
          havlikj Jakub Havlik added a comment -

          One out of eight builds had failed with error: The job exceeded the maximum time limit for jobs, and has been terminated. It looks as 50 minutes is not enough to run the whole build. What should I do now?

          Show
          havlikj Jakub Havlik added a comment - One out of eight builds had failed with error: The job exceeded the maximum time limit for jobs, and has been terminated. It looks as 50 minutes is not enough to run the whole build. What should I do now?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fhueske commented on the issue:

          https://github.com/apache/flink/pull/3133

          Thanks for the PR @jakubhavlik.

          The fix looks good. Can you add a `testCloseWithoutOpen()` method to `org.apache.flink.api.java.hadoop.mapred.HadoopInputFormatTest` and `org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormatTest` to guard this fix?

          Occasionally builds fail due to timeouts. That's nothing to worry about. I'll run the tests again before merging.

          Thank you.

          Show
          githubbot ASF GitHub Bot added a comment - Github user fhueske commented on the issue: https://github.com/apache/flink/pull/3133 Thanks for the PR @jakubhavlik. The fix looks good. Can you add a `testCloseWithoutOpen()` method to `org.apache.flink.api.java.hadoop.mapred.HadoopInputFormatTest` and `org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormatTest` to guard this fix? Occasionally builds fail due to timeouts. That's nothing to worry about. I'll run the tests again before merging. Thank you.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user jakubhavlik commented on the issue:

          https://github.com/apache/flink/pull/3133

          Tests added and build passed: https://travis-ci.org/jakubhavlik/flink/builds/192653463 .

          Show
          githubbot ASF GitHub Bot added a comment - Github user jakubhavlik commented on the issue: https://github.com/apache/flink/pull/3133 Tests added and build passed: https://travis-ci.org/jakubhavlik/flink/builds/192653463 .
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fhueske commented on the issue:

          https://github.com/apache/flink/pull/3133

          Thanks for the quick update @jakubhavlik.
          +1 to merge.

          Show
          githubbot ASF GitHub Bot added a comment - Github user fhueske commented on the issue: https://github.com/apache/flink/pull/3133 Thanks for the quick update @jakubhavlik. +1 to merge.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fhueske commented on the issue:

          https://github.com/apache/flink/pull/3133

          Merging

          Show
          githubbot ASF GitHub Bot added a comment - Github user fhueske commented on the issue: https://github.com/apache/flink/pull/3133 Merging
          Hide
          fhueske Fabian Hueske added a comment -

          Fixed for 1.3.0 with ece899a9f33b3fe4aa35d6cc66c43053170e631f
          Fixed for 1.2.0 with aad8d253c9f7dd33a63310353cbe132cd4900c6b
          Fixed for 1.1.5 with 214c188d7fffd66a43325eff392f12e8f3421fab

          Show
          fhueske Fabian Hueske added a comment - Fixed for 1.3.0 with ece899a9f33b3fe4aa35d6cc66c43053170e631f Fixed for 1.2.0 with aad8d253c9f7dd33a63310353cbe132cd4900c6b Fixed for 1.1.5 with 214c188d7fffd66a43325eff392f12e8f3421fab

            People

            • Assignee:
              havlikj Jakub Havlik
              Reporter:
              havlikj Jakub Havlik
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development