Spark / SPARK-23399

Register a task completion listener first for OrcColumnarBatchReader

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      This is related to SPARK-23390.

      Currently, OrcColumnarBatchReader leaks an opened file, as the following test log shows:

      [info] - Enabling/disabling ignoreMissingFiles using orc (648 milliseconds)
      15:55:58.673 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 61.0 (TID 85, localhost, executor driver): TaskKilled (Stage cancelled)
      15:55:58.674 WARN org.apache.spark.DebugFilesystem: Leaked filesystem connection created at:
      java.lang.Throwable
      	at org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
      	at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
      	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
      	at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.open(RecordReaderUtils.java:173)
      	at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:254)
      	at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:633)
      	at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initialize(OrcColumnarBatchReader.java:138)
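
      The title's fix (register the task completion listener first) can be illustrated with a minimal, Spark-free sketch. Everything in it is hypothetical: FakeTaskContext and FakeBatchReader are stand-ins invented for illustration, not Spark classes, and the failure inside initialize() is an assumed way for a task to end between opening the file and registering the cleanup callback. It is not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerFirstSketch {

    /** Hypothetical stand-in for a task context that runs callbacks when the task ends. */
    static final class FakeTaskContext {
        private final List<Runnable> listeners = new ArrayList<>();

        void addTaskCompletionListener(Runnable listener) {
            listeners.add(listener);
        }

        void markTaskCompleted() {
            for (Runnable listener : listeners) {
                listener.run();
            }
        }
    }

    /** Hypothetical reader that opens a resource in initialize() and may fail right after. */
    static final class FakeBatchReader implements AutoCloseable {
        private boolean open = false;

        void initialize(boolean failAfterOpen) {
            open = true; // stands in for opening the underlying file
            if (failAfterOpen) {
                throw new IllegalStateException("task ended during initialize");
            }
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Listener registered after initialize(): returns true if the reader is left open. */
    static boolean leaksWhenListenerRegisteredLast(boolean failAfterOpen) {
        FakeTaskContext context = new FakeTaskContext();
        FakeBatchReader reader = new FakeBatchReader();
        try {
            reader.initialize(failAfterOpen);
            // Not reached if initialize() throws, so nothing will close the reader.
            context.addTaskCompletionListener(reader::close);
        } catch (IllegalStateException e) {
            // The task fails; only already-registered listeners can clean up.
        } finally {
            context.markTaskCompleted();
        }
        return reader.isOpen();
    }

    /** Listener registered before initialize(): returns true if the reader is left open. */
    static boolean leaksWhenListenerRegisteredFirst(boolean failAfterOpen) {
        FakeTaskContext context = new FakeTaskContext();
        FakeBatchReader reader = new FakeBatchReader();
        // Register cleanup first, so it runs however the task ends.
        context.addTaskCompletionListener(reader::close);
        try {
            reader.initialize(failAfterOpen);
        } catch (IllegalStateException e) {
            // The task fails, but the listener is already in place.
        } finally {
            context.markTaskCompleted();
        }
        return reader.isOpen();
    }

    public static void main(String[] args) {
        System.out.println("listener last, leaked:  " + leaksWhenListenerRegisteredLast(true));
        System.out.println("listener first, leaked: " + leaksWhenListenerRegisteredFirst(true));
    }
}
```

      In this sketch, only the ordering differs between the two methods. When initialize() succeeds, both orderings close the reader at task completion; the difference shows up only when the task ends between opening the file and registering the listener.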
      

              People

              • Assignee: Dongjoon Hyun (dongjoon)
              • Reporter: Dongjoon Hyun (dongjoon)
              • Votes: 0
              • Watchers: 5
