Hive / HIVE-3823

Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat


Details

    • Type: Improvement
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0, 0.9.1, 0.10.0
    • Fix Version/s: None
    • Component/s: HBase Handler
    • Labels: None

    Description

      In HiveHBaseTableInputFormat.java, retrieving the Result objects has a performance issue: each row is copied into the caller's Result through a serialization round trip.

      HiveHBaseTableInputFormat.java
            @Override
            public boolean next(ImmutableBytesWritable rowKey, Result value) throws IOException {
      
              boolean next = false;
      
              try {
                next = recordReader.nextKeyValue();
      
                if (next) {
                  rowKey.set(recordReader.getCurrentValue().getRow());
                  // Performance issue here: copyWritable serializes the source,
                  // copies the bytes, and then deserializes them into the target.
                  Writables.copyWritable(recordReader.getCurrentValue(), value);
                }
              } catch (InterruptedException e) {
                throw new IOException(e);
              }
      
              return next;
            }
      

      In HBase 0.94.4 and 0.96.0, Result provides a new method, "copyFrom(Result from)", which would solve the issue.

      See HBASE-7381
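
To illustrate the difference between the two copy paths, here is a minimal, self-contained sketch. It does not use HBase: `SimpleResult`, `copyViaSerialization`, and the cell layout are hypothetical stand-ins invented for this example, and the stand-in `copyFrom` assumes a shallow copy that shares the source's cell array. In the Hive code itself, the change would presumably amount to replacing the `Writables.copyWritable(...)` call with `value.copyFrom(recordReader.getCurrentValue())` on an HBase version that provides that method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class CopySketch {

  /** Hypothetical stand-in for HBase's Result: a row is an array of cell byte arrays. */
  static final class SimpleResult {
    private byte[][] cells = new byte[0][];

    SimpleResult() {}

    SimpleResult(byte[][] cells) {
      this.cells = cells;
    }

    byte[][] cells() {
      return cells;
    }

    // Writable-style serialization: cell count, then length-prefixed cells.
    void write(DataOutputStream out) throws IOException {
      out.writeInt(cells.length);
      for (byte[] cell : cells) {
        out.writeInt(cell.length);
        out.write(cell);
      }
    }

    // Writable-style deserialization: rebuilds every cell from the stream.
    void readFields(DataInputStream in) throws IOException {
      byte[][] read = new byte[in.readInt()][];
      for (int i = 0; i < read.length; i++) {
        read[i] = new byte[in.readInt()];
        in.readFully(read[i]);
      }
      cells = read;
    }

    // Lightweight copy (assumed shallow): share the source's cell array
    // instead of encoding and decoding it.
    void copyFrom(SimpleResult other) {
      this.cells = other.cells;
    }
  }

  // Mirrors the serialize / copy bytes / deserialize path described in the issue.
  static void copyViaSerialization(SimpleResult src, SimpleResult dst) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      src.write(out);
    }
    byte[] bytes = buffer.toByteArray(); // extra copy of the whole row
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      dst.readFields(in);
    }
  }

  public static void main(String[] args) throws IOException {
    SimpleResult src = new SimpleResult(new byte[][] {{1, 2, 3}, {4, 5}});

    SimpleResult slow = new SimpleResult();
    copyViaSerialization(src, slow);

    SimpleResult fast = new SimpleResult();
    fast.copyFrom(src);

    // Both targets hold the same content as the source.
    System.out.println(Arrays.deepEquals(slow.cells(), fast.cells()));
    // The lightweight copy shares the source's cells; no bytes were duplicated.
    System.out.println(fast.cells() == src.cells());
  }
}
```

The serialization path allocates a buffer and re-creates every cell for each row, while the lightweight path only reassigns a reference. The trade-off is that the target then aliases the source's data, so this is only safe if the reader does not mutate the current value after handing it over.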


            People

              Assignee: Unassigned
              Reporter: Cheng Hao (chenghao)
              Votes: 0
              Watchers: 3
