HIVE-3823

Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0, 0.9.1, 0.10.0
    • Fix Version/s: None
    • Component/s: HBase Handler
    • Labels:
      None

      Description

      In HiveHBaseTableInputFormat.java, retrieving the Result objects has a performance issue: every row is copied into the caller's Result with Writables.copyWritable.

      HiveHBaseTableInputFormat.java
            @Override
            public boolean next(ImmutableBytesWritable rowKey, Result value) throws IOException {
      
              boolean next = false;
      
              try {
                next = recordReader.nextKeyValue();
      
                if (next) {
                  rowKey.set(recordReader.getCurrentValue().getRow());
                  // Performance issue: copyWritable serializes the source,
                  // copies the bytes, and deserializes them into the target.
                  Writables.copyWritable(recordReader.getCurrentValue(), value);
                }
              } catch (InterruptedException e) {
                throw new IOException(e);
              }
      
              return next;
            }
      

      In HBase 0.94.4 and 0.96.0, Result provides a new method, "copyFrom(Result from)", which copies a Result directly and would solve the issue.

      See HBASE-7381
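
      The difference between the two copy strategies can be sketched without HBase on the classpath. In the example below, FakeResult is a hypothetical stand-in for Result (it is not the HBase class), copyViaSerialization imitates the serialize, copy, deserialize path described above, and FakeResult.copyFrom imitates a lightweight copy that shares the source's backing arrays. Treat it as an illustration under those assumptions, not as the HBase implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ResultCopySketch {

    /** Hypothetical stand-in for HBase's Result: a row key plus one value buffer. */
    static final class FakeResult {
        byte[] row = new byte[0];
        byte[] cell = new byte[0];

        // Writable-style serialization: length-prefixed byte arrays.
        void write(DataOutputStream out) throws IOException {
            out.writeInt(row.length);
            out.write(row);
            out.writeInt(cell.length);
            out.write(cell);
        }

        // Writable-style deserialization: allocates fresh arrays for every field.
        void readFields(DataInputStream in) throws IOException {
            row = new byte[in.readInt()];
            in.readFully(row);
            cell = new byte[in.readInt()];
            in.readFully(cell);
        }

        // Lightweight copy (assumed semantics): share the source's arrays, no byte copying.
        void copyFrom(FakeResult other) {
            this.row = other.row;
            this.cell = other.cell;
        }
    }

    /** Serializes src to a byte array, then deserializes that array into dst. */
    static void copyViaSerialization(FakeResult src, FakeResult dst) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            src.write(out);
            out.flush();
            byte[] bytes = buffer.toByteArray(); // one more full copy of the payload
            dst.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FakeResult src = new FakeResult();
        src.row = "row-1".getBytes(StandardCharsets.UTF_8);
        src.cell = new byte[1024];

        FakeResult slow = new FakeResult();
        copyViaSerialization(src, slow);

        FakeResult fast = new FakeResult();
        fast.copyFrom(src);

        System.out.println("round trip preserved contents: " + Arrays.equals(src.cell, slow.cell));
        System.out.println("round trip allocated a new array: " + (slow.cell != src.cell));
        System.out.println("copyFrom shares the source array: " + (fast.cell == src.cell));
    }
}
```

      Applied to next() above, the change would presumably amount to replacing the Writables.copyWritable(...) call with value.copyFrom(recordReader.getCurrentValue()), which requires building against an HBase release that includes HBASE-7381.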

            People

            • Assignee: Unassigned
            • Reporter: Cheng Hao
            • Votes: 0
            • Watchers: 3
