Hive / HIVE-11033

BloomFilter index is not honored by ORC reader

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.1
    • Component/s: None
    • Labels: None

      Description

      There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class that prevents the bloom filter index stored in an ORC file from being used. The root cause is that the bloomFilterIndices variable defined in the SargApplier class supersedes the one defined in its parent class, so the two refer to different arrays. Consider RecordReaderImpl.pickRowGroups():

        protected boolean[] pickRowGroups() throws IOException {
          // if we don't have a sarg or indexes, we read everything
          if (sargApp == null) {
            return null;
          }
          readRowIndex(currentStripe, included, sargApp.sargColumns);
          return sargApp.pickRowGroups(stripes.get(currentStripe), indexes);
        }
      

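      The bloomFilterIndices array populated by readRowIndex() is never picked up by the sargApp object, which consults its own, separately allocated array. One solution is to make SargApplier.bloomFilterIndices a reference to its parent counterpart. In the diff below, lines marked < come from the patched file and lines marked > come from the original.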

      $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
      174d173
      <     bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
      178c177
      <           sarg, options.getColumnNames(), strideRate, types, included.length, bloomFilterIndices);
      ---
      >           sarg, options.getColumnNames(), strideRate, types, included.length);
      204a204
      >     bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
      673c673
      <         List<OrcProto.Type> types, int includedCount, OrcProto.BloomFilterIndex[] bloomFilterIndices) {
      ---
      >         List<OrcProto.Type> types, int includedCount) {
      677c677
      <       this.bloomFilterIndices = bloomFilterIndices;
      ---
      >       bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
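
      The effect of the change can be illustrated with a minimal, self-contained sketch. This is not the Hive code: Reader and Applier are hypothetical stand-ins for RecordReaderImpl and SargApplier, a String array stands in for OrcProto.BloomFilterIndex[], and the shareArray flag exists only to compare the two behaviors side by side.

```java
import java.util.Arrays;

public class BloomFilterSharingSketch {

    /** Hypothetical stand-in for RecordReaderImpl: owns the array that readRowIndex() fills. */
    static class Reader {
        final String[] bloomFilterIndices;
        final Applier applier;

        Reader(int columnCount, boolean shareArray) {
            bloomFilterIndices = new String[columnCount];
            applier = shareArray
                    ? new Applier(bloomFilterIndices)        // patched: applier holds the same array
                    : new Applier(new String[columnCount]);  // original: applier allocates its own
        }

        /** Simulates loading the bloom filter index for every column. */
        void readRowIndex() {
            Arrays.fill(bloomFilterIndices, "bloom");
        }
    }

    /** Hypothetical stand-in for SargApplier: can only consult the array it was given. */
    static class Applier {
        private final String[] bloomFilterIndices;

        Applier(String[] bloomFilterIndices) {
            this.bloomFilterIndices = bloomFilterIndices;
        }

        boolean seesBloomFilter(int column) {
            return bloomFilterIndices[column] != null;
        }
    }

    /** Returns whether the applier observes the index after the reader has populated it. */
    static boolean applierSeesIndex(boolean shareArray) {
        Reader reader = new Reader(2, shareArray);
        reader.readRowIndex();
        return reader.applier.seesBloomFilter(0);
    }

    public static void main(String[] args) {
        System.out.println("separate arrays: " + applierSeesIndex(false)); // prints false
        System.out.println("shared array:    " + applierSeesIndex(true));  // prints true
    }
}
```

      With separate arrays the applier never observes what the reader loaded, which matches the symptom reported here. Passing the reader's array into the applier's constructor, as the diff does, lets both sides see the same entries.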
      

        Attachments

        1. HIVE-11033.2.patch
          4 kB
          Prasanth Jayachandran
        2. HIVE-11033.patch
          3 kB
          Prasanth Jayachandran

              People

              • Assignee: Prasanth Jayachandran (prasanth_j)
              • Reporter: Allan Yan (allanyan)
              • Votes: 0
              • Watchers: 5
