HIVE-4322: SkewedInfo in Metastore Thrift API cannot be deserialized in Python


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: Metastore, Thrift API
    • Labels:
      None

      Description

      The Thrift-generated Python code that deserializes Thrift objects fails whenever a complex type is used as a map key, because mutable Python objects such as lists are unhashable by default and so cannot serve as dict keys. See https://issues.apache.org/jira/browse/THRIFT-162 for related discussion.
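The underlying failure can be reproduced without Thrift. The following is a minimal sketch that relies only on standard Python behaviour; the function name and the sample values are illustrative and do not come from the generated code.

```python
def try_list_as_key():
    """Build a dict keyed by a list, which is what a map<list<...>, ...> field requires."""
    key = ["2008", "US"]  # illustrative values only
    try:
        return {key: "location"}
    except TypeError as exc:
        # Lists define no hash, so the dict insertion is rejected.
        return exc


result = try_list_as_key()

# An immutable tuple with the same contents is accepted as a key.
tuple_keyed = {("2008", "US"): "location"}
```

Here `result` holds the `TypeError` raised by the dict insertion, while the tuple-keyed dict is built without complaint.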

      The SkewedInfo struct contains a map that uses a list as its key, which breaks the Python Thrift interface. It is not possible to customize the mapping from Thrift types to Python types; otherwise we could map Thrift lists to Python tuples. Instead, the proposed workaround wraps the list inside a new struct. On its own this changes nothing, but it allows Python clients to define a hash function for the struct class, e.g.:

      # Hash the wrapper struct by the contents of the list it holds.
      def f(self):
          return hash(tuple(self.skewedValueList))

      SkewedValueList.__hash__ = f

      In practice a more efficient hash might be defined that does not involve copying the list. The advantage of wrapping the list inside a struct is that the client does not have to define the hash on the list itself, which would change the behaviour of lists everywhere else in the code.
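One way to hash the wrapper without building an intermediate tuple is to fold the element hashes together directly. This is a sketch only: the `SkewedValueList` class below is a hypothetical stand-in for the Thrift-generated struct (assumed to define `__eq__` over its fields), `hash_without_copy` and the sample path are invented for illustration, and whether this is actually faster than `hash(tuple(...))` has not been measured.

```python
class SkewedValueList(object):
    """Hypothetical stand-in for the Thrift-generated wrapper struct."""

    def __init__(self, skewedValueList=None):
        self.skewedValueList = skewedValueList

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.__dict__ == other.__dict__

    def __ne__(self, other):
        return not (self == other)


def hash_without_copy(self):
    """Combine the element hashes in order, without copying the list."""
    h = 17
    for value in self.skewedValueList or ():
        h = (h * 31 + hash(value)) & 0xFFFFFFFFFFFFFFFF
    return h


# Attach the hash to the wrapper class only; plain lists are left untouched.
SkewedValueList.__hash__ = hash_without_copy

locations = {SkewedValueList(["2008", "US"]): "/warehouse/skewed"}
found = locations[SkewedValueList(["2008", "US"])]
```

Because the hash depends on the list contents, a wrapper should not be mutated while it is in use as a key.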

              People

              • Assignee:
                sxyuan Samuel Yuan
                Reporter:
                sxyuan Samuel Yuan
              • Votes:
                0
                Watchers:
                5
