Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-10915

ORC fails to read table with a 38Gb ORC file

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: File Formats
    • Labels:
      None

      Description

      hive>  set mapreduce.input.fileinputformat.split.maxsize=1000000000000;
      hive> set  mapreduce.input.fileinputformat.split.maxsize=1000000000000;
      hive> alter table lineitem concatenate;
      ..
      hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
      Found 12 items
      -rwxr-xr-x   3 gopal supergroup 41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000000_0
      -rwxr-xr-x   3 gopal supergroup 36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000001_0
      -rwxr-xr-x   3 gopal supergroup 27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000002_0
      -rwxr-xr-x   3 gopal supergroup 23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000003_0
      -rwxr-xr-x   3 gopal supergroup 21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000004_0
      -rwxr-xr-x   3 gopal supergroup 13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000005_0
      -rwxr-xr-x   3 gopal supergroup  8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000006_0
      -rwxr-xr-x   3 gopal supergroup  6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000007_0
      -rwxr-xr-x   3 gopal supergroup  4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000008_0
      -rwxr-xr-x   3 gopal supergroup  3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000009_0
      -rwxr-xr-x   3 gopal supergroup  1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000010_0
      -rwxr-xr-x   3 gopal supergroup   485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000011_0
      

      Errors without PPD

      Suspicious offsets in the stream information -

      Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
              at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
              at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
              at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
              at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
              at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
              ... 25 more
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                gopalv Gopal V
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated: