Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-10915

ORC fails to read table with a 38Gb ORC file

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Reopened
    • Major
    • Resolution: Unresolved
    • 1.3.0
    • None
    • File Formats
    • None

    Description

      hive>  set mapreduce.input.fileinputformat.split.maxsize=1000000000000;
      hive> set  mapreduce.input.fileinputformat.split.maxsize=1000000000000;
      hive> alter table lineitem concatenate;
      ..
      hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
      Found 12 items
      -rwxr-xr-x   3 gopal supergroup 41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000000_0
      -rwxr-xr-x   3 gopal supergroup 36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000001_0
      -rwxr-xr-x   3 gopal supergroup 27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000002_0
      -rwxr-xr-x   3 gopal supergroup 23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000003_0
      -rwxr-xr-x   3 gopal supergroup 21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000004_0
      -rwxr-xr-x   3 gopal supergroup 13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000005_0
      -rwxr-xr-x   3 gopal supergroup  8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000006_0
      -rwxr-xr-x   3 gopal supergroup  6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000007_0
      -rwxr-xr-x   3 gopal supergroup  4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000008_0
      -rwxr-xr-x   3 gopal supergroup  3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000009_0
      -rwxr-xr-x   3 gopal supergroup  1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000010_0
      -rwxr-xr-x   3 gopal supergroup   485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/000011_0
      

      Errors without PPD

      Suspicious offsets in the stream information -

      Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
              at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
              at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
              at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
              at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
              at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
              ... 25 more
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              gopalv Gopal Vijayaraghavan
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: