HBASE-6359: KeyValue may return incorrect values after readFields()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When the same KeyValue object is reused for several deserializations through readFields(), some methods may return incorrect values. The following sequence of operations reproduces the problem:

      1. A KeyValue is created whose key has length 10. The private field keyLength is initialized to 0.
      2. KeyValue.getKeyLength() is called. This reads the key length 10 from the backing array and caches it in keyLength.
      3. KeyValue.readFields() is called to deserialize a new value. The keyLength field is not cleared and keeps its cached value of 10, which no longer matches the backing array unless the new key also happens to have length 10.
      4. A subsequent call to getKeyLength() returns the stale cached value 10 instead of the length of the newly read key.
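
      The sequence above can be sketched with a simplified stand-in class. StaleCacheDemo, its serialized() helper, and the record layout (total length, key length, key bytes) are assumptions made for illustration; this is not the HBase KeyValue code, only the same caching pattern.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for KeyValue; it only mimics the memoized key length.
final class StaleCacheDemo {
    private byte[] bytes = new byte[0];
    private int keyLength = 0; // 0 means "not cached yet"

    // Reads the key length from the first 4 bytes of the backing array and caches it.
    int getKeyLength() {
        if (keyLength == 0) {
            keyLength = ByteBuffer.wrap(bytes).getInt();
        }
        return keyLength;
    }

    // Replaces the backing array but leaves the cached keyLength untouched (the bug).
    void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        bytes = new byte[length];
        in.readFully(bytes);
    }

    // Builds a serialized record: total length, key length, then keyLen zero bytes.
    static DataInput serialized(int keyLen) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + keyLen);
        buf.putInt(4 + keyLen).putInt(keyLen);
        return new DataInputStream(new ByteArrayInputStream(buf.array()));
    }

    // Reuses one object for two records and returns both reported key lengths.
    static int[] demo() {
        try {
            StaleCacheDemo kv = new StaleCacheDemo();
            kv.readFields(serialized(10));
            int first = kv.getKeyLength();  // caches 10
            kv.readFields(serialized(25));  // new key has length 25
            int second = kv.getKeyLength(); // cache is not refreshed
            return new int[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] lengths = demo();
        System.out.println(lengths[0] + " " + lengths[1]); // prints "10 10"
    }
}
```

      The second call reports 10 even though the record just read has a key of length 25.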

      For example, in a reducer that receives an Iterable<KeyValue>, Hadoop reuses the same KeyValue object for each element, so every value after the first is likely to return an incorrect result from getKeyLength().

      The solution is to clear all memoized values in KeyValue.readFields(). I'll write a patch for this soon.
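
      A minimal sketch of that idea, using a hypothetical ClearedCacheDemo class rather than the real KeyValue: readFields() resets the memoized field, so the next getter call recomputes it from the new backing array. The class name, the serialized() helper, and the record layout are assumptions for illustration, and the actual patch may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for KeyValue showing the cache reset in readFields().
final class ClearedCacheDemo {
    private byte[] bytes = new byte[0];
    private int keyLength = 0; // 0 means "not cached yet"

    // Reads the key length from the first 4 bytes of the backing array and caches it.
    int getKeyLength() {
        if (keyLength == 0) {
            keyLength = ByteBuffer.wrap(bytes).getInt();
        }
        return keyLength;
    }

    // Replaces the backing array and clears every memoized value derived from it.
    void readFields(DataInput in) throws IOException {
        keyLength = 0; // invalidate the cache before loading new contents
        int length = in.readInt();
        bytes = new byte[length];
        in.readFully(bytes);
    }

    // Builds a serialized record: total length, key length, then keyLen zero bytes.
    static DataInput serialized(int keyLen) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + keyLen);
        buf.putInt(4 + keyLen).putInt(keyLen);
        return new DataInputStream(new ByteArrayInputStream(buf.array()));
    }

    // Reuses one object for two records and returns both reported key lengths.
    static int[] demo() {
        try {
            ClearedCacheDemo kv = new ClearedCacheDemo();
            kv.readFields(serialized(10));
            int first = kv.getKeyLength();
            kv.readFields(serialized(25));
            int second = kv.getKeyLength(); // recomputed from the new array
            return new int[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] lengths = demo();
        System.out.println(lengths[0] + " " + lengths[1]); // prints "10 25"
    }
}
```

      Any other field cached from the backing array would need the same reset in readFields().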

            People

            • Assignee: Dave Revell (dave_revell)
            • Reporter: Dave Revell (dave_revell)