[HIVE-12537] RLEv2 doesn't seem to work

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.14.0, 1.0.1, 1.1.1, 1.2.1, 1.3.0, 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: File Formats, ORC

    Description

      Perhaps I'm doing something wrong, or perhaps this is actually working as expected.

      Writing 1 million constant int32 values produces an ORC file of 1 MB. Surprisingly, writing 1 million consecutive ints produces a much smaller file.
      The code and the FileDump output are attached.

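      // Imports assume the Hive 1.x package layout.
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
      import org.apache.hadoop.hive.ql.io.orc.OrcFile;
      import org.apache.hadoop.hive.ql.io.orc.Writer;
      import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
      import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

      // Single-column schema derived from Integer via reflection.
      ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector(
              Integer.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA);

      // No generic compression, so the file size reflects the integer encoding alone.
      Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"),
              OrcFile.writerOptions(new Configuration())
                      .compress(CompressionKind.NONE)
                      .inspector(inspector)
                      .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION)
                      .version(OrcFile.Version.V_0_12));

      // Write the same value one million times.
      for (int i = 0; i < 1000000; ++i) {
          w.addRow(123);
      }
      w.close();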
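
      For scale, here is a rough back-of-the-envelope estimate of what a run-length encoding could be expected to need for the constant column above. It is only a sketch under stated assumptions, not Hive's writer: it assumes the RLEv2 Delta sub-encoding as described in the ORC specification (a 2-byte header, a zigzag varint base value, a zigzag varint delta base, at most 512 values per run, and no further payload when the delta is fixed), and it ignores stripe, index, and footer overhead. The class and method names are hypothetical.

```java
// Hypothetical helper: estimates the data-stream size of a constant integer
// column under the assumptions listed in the text above. Not part of Hive.
final class Rlev2Estimate {
    // Assumption: maximum number of values in one RLEv2 Delta run.
    static final int MAX_RUN = 512;

    // Number of bytes a base-128 varint needs for an unsigned value.
    static int varintBytes(long v) {
        int n = 1;
        while ((v >>>= 7) != 0) {
            n++;
        }
        return n;
    }

    // Zigzag mapping so that small signed values become small unsigned ones.
    static long zigzag(long v) {
        return (v << 1) ^ (v >> 63);
    }

    // Estimated bytes for `count` copies of `value`, encoded as fixed-delta
    // runs with delta 0: header + base value + delta base for every run.
    static long estimateConstantBytes(long count, long value) {
        long runs = (count + MAX_RUN - 1) / MAX_RUN; // ceiling division
        int perRun = 2 + varintBytes(zigzag(value)) + varintBytes(zigzag(0));
        return runs * perRun;
    }

    public static void main(String[] args) {
        // 1,000,000 copies of 123, as in the reproduction above.
        System.out.println(estimateConstantBytes(1_000_000, 123)); // prints 9770
    }
}
```

      Under these assumptions the one million constant values would need on the order of 10 kB, which is why a 1 MB file looks suspicious.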
      

      Attachments

        1. HIVE-11537-branch-1.patch (63 kB, Prasanth Jayachandran)
        2. HIVE-12537.1.patch (60 kB, Prasanth Jayachandran)
        3. HIVE-12537.2.patch (117 kB, Prasanth Jayachandran)
        4. HIVE-12537.3.patch (118 kB, Prasanth Jayachandran)
        5. HIVE-12537.4.patch (126 kB, Prasanth Jayachandran)
        6. Main.java (1.0 kB, Bogdan Raducanu)
        7. orcdump.txt (0.5 kB, Bogdan Raducanu)

          People

            Assignee: Prasanth Jayachandran (prasanth_j)
            Reporter: Bogdan Raducanu (bograd)
