HIVE-6748: FileSinkOperator needs to cleanup held references for container reuse


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Tez
    • Labels: None
    • FileSinkOperator cleanliness of references on closeOp/initializeOp

    Description

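      FileSinkOperator fails when the same query pipeline is reused aggressively with container reuse.

      The cause is the prevFSP writer, which is still referenced after closeOp() and is not reset by initializeOp() either. When the operator runs again in the same container, the next startGroup() calls getDynOutPaths(), which tries to close the already-closed writer and fails with a ClosedChannelException: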

      2014-03-25 14:46:31,744 FATAL [main] org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor: org.apache.hadoop.hive.ql.metadata.HiveException: java.nio.channels.ClosedChannelException
              at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:170)
              at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:758)
              at org.apache.hadoop.hive.ql.exec.FileSinkOperator.startGroup(FileSinkOperator.java:833)
              at org.apache.hadoop.hive.ql.exec.Operator.defaultStartGroup(Operator.java:497)
              at org.apache.hadoop.hive.ql.exec.Operator.startGroup(Operator.java:520)
              at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processKeyValues(ReduceRecordProcessor.java:296)
              at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:223)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:159)
              at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:306)
              at org.apache.hadoop.mapred.YarnTezDagChild$4.run(YarnTezDagChild.java:549)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
              at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:538)
      Caused by: java.nio.channels.ClosedChannelException
              at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1526)
              at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
              at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
              at java.io.DataOutputStream.write(DataOutputStream.java:107)
              at org.apache.hadoop.hive.ql.io.orc.WriterImpl$DirectStream.output(WriterImpl.java:316)
              at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:242)
              at org.apache.hadoop.hive.ql.io.orc.WriterImpl.writeMetadata(WriterImpl.java:1923)
              at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2017)
              at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:98)
              at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:167)
              ... 13 more
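
The failure mode described above can be illustrated with a small, self-contained sketch. It is not the attached patch and does not use Hive classes: SketchWriter, SketchSinkOperator, the resetOnLifecycle flag, and survivesReuse are hypothetical names, and prevWriter only plays the role of prevFSP. The assumption is that an operator instance is reused for a second task in the same container, and that closing an already-closed writer throws.

```java
import java.io.IOException;

/** Hypothetical stand-in for a record writer that fails when closed twice. */
final class SketchWriter {
    private boolean closed;

    void close() throws IOException {
        if (closed) {
            // Stands in for the ClosedChannelException in the trace above.
            throw new IOException("writer already closed");
        }
        closed = true;
    }
}

/** Hypothetical operator that is reused for several tasks in one container. */
final class SketchSinkOperator {
    private final boolean resetOnLifecycle;
    private SketchWriter prevWriter; // plays the role of prevFSP

    SketchSinkOperator(boolean resetOnLifecycle) {
        this.resetOnLifecycle = resetOnLifecycle;
    }

    void initializeOp() {
        if (resetOnLifecycle) {
            prevWriter = null; // drop anything left over from a previous task
        }
    }

    void startGroup() throws IOException {
        if (prevWriter != null) {
            prevWriter.close(); // throws if the reference is stale
        }
        prevWriter = new SketchWriter();
    }

    void closeOp() throws IOException {
        if (prevWriter != null) {
            prevWriter.close();
        }
        if (resetOnLifecycle) {
            prevWriter = null; // do not keep a closed writer referenced
        }
    }

    /** Runs two tasks on one operator; returns false if the second one fails. */
    static boolean survivesReuse(boolean resetOnLifecycle) {
        SketchSinkOperator op = new SketchSinkOperator(resetOnLifecycle);
        try {
            for (int task = 0; task < 2; task++) {
                op.initializeOp();
                op.startGroup();
                op.closeOp();
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + survivesReuse(false));
        System.out.println("with reset: " + survivesReuse(true));
    }
}
```

In this sketch the second task fails only when the stale reference survives closeOp() and initializeOp(); clearing it in either place is enough for the reused operator to start cleanly.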
      

      Attachments

        1. HIVE-6748.1.patch
          0.9 kB
          Gopal Vijayaraghavan


          People

            Assignee: Gopal Vijayaraghavan (gopalv)
            Reporter: Gopal Vijayaraghavan (gopalv)
            Votes: 0
            Watchers: 4
