  1. ORC
  2. ORC-435

Ability to read stripes that are greater than 2GB

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.4, 1.4.4, 1.5.3, 1.6.0
    • Fix Version/s: 1.5.4, 1.6.0
    • Component/s: Reader
    • Labels:
      None

      Description

      The ORC reader fails with a NegativeArraySizeException if the stripe size is greater than 2GB. Even though the default stripe size is 64MB, a stripe can grow past 2GB before the memory manager kicks in to check the memory size. For example, when inserting 500KB strings (mostly unique), the stripe is already over 2GB by the time 5000 rows have been written (5000 × 500KB ≈ 2.5GB). For such cases the reader has to chunk the disk range reads instead of reading the stripe as one whole blob.
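The chunking idea can be illustrated with a minimal, self-contained sketch. It is not the actual ORC patch: the class `ChunkedRangeSketch`, the method `splitRange`, and the 1GB chunk size are hypothetical names and values chosen for illustration. It only shows how a byte range whose length needs a `long` can be cut into pieces that each fit in an `int`-sized buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from the ORC reader.
public class ChunkedRangeSketch {

    /**
     * Splits the byte range [offset, offset + length) into consecutive
     * {offset, length} pairs, each at most maxChunk bytes long.
     */
    public static List<long[]> splitRange(long offset, long length, int maxChunk) {
        if (length < 0 || maxChunk <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and maxChunk > 0");
        }
        List<long[]> chunks = new ArrayList<>();
        long pos = offset;
        long remaining = length;
        while (remaining > 0) {
            // Each chunk length is bounded by maxChunk, so it always fits in an int.
            long len = Math.min(remaining, (long) maxChunk);
            chunks.add(new long[] {pos, len});
            pos += len;
            remaining -= len;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 5000 rows of 500KB strings, as in the description (long arithmetic).
        long stripeLength = 5000L * 500 * 1024;
        int maxChunk = 1 << 30; // assumed 1GB chunk size, for illustration
        for (long[] chunk : splitRange(0L, stripeLength, maxChunk)) {
            System.out.println("offset=" + chunk[0] + " length=" + chunk[1]);
        }
    }
}
```

Each resulting chunk can then be read into its own buffer, so no single allocation ever needs a length above `Integer.MAX_VALUE`.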

      Exception thrown when reading such files

      2018-10-12 21:43:58,833 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NegativeArraySizeException
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderUtils.readDiskRanges(RecordReaderUtils.java:272)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readPartialDataStreams(RecordReaderImpl.java:1007)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:835)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1029)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1062)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1085)
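One plausible way to get this exception, sketched below, is a `long` range length being narrowed to `int` before a buffer is allocated. This is an assumption about the failing line and not a quote of `RecordReaderUtils.readDiskRanges`; the class `NegativeArraySizeSketch` and its methods are hypothetical.

```java
// Hypothetical illustration only; assumes the failure comes from a long-to-int narrowing cast.
public class NegativeArraySizeSketch {

    /** Narrows a long length to int the way an unchecked cast would. */
    public static int narrowedLength(long length) {
        return (int) length;
    }

    /** Returns true if allocating a byte[] with the narrowed length throws. */
    public static boolean allocationFails(long length) {
        try {
            byte[] buffer = new byte[(int) length];
            return buffer.length < 0; // never true; reached only when allocation succeeds
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long stripeLength = 5000L * 500 * 1024; // 2,560,000,000 bytes, above Integer.MAX_VALUE
        System.out.println("narrowed length: " + narrowedLength(stripeLength)); // -1734967296
        System.out.println("allocation fails: " + allocationFails(stripeLength));
    }
}
```

Because the narrowed value is negative, the array allocation is rejected before any bytes are read, which matches the exception type in the trace above.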

              People

              • Assignee:
                prasanth_j Prasanth Jayachandran
                Reporter:
                prasanth_j Prasanth Jayachandran
              • Votes:
                0
                Watchers:
                6
