Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-21796

ArrayWritableObjectInspector.equals can take O(2^nesting_depth) time

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 4.0.0-alpha-1
    • None
    • None

    Description

      The issue came up during an Impala test when we tried to run it with Hive 3.1. The a query hanged: it tried to insert 1 row from a Parquet file with a 99 level nested column to a similar Orc file, and spent its time in ArrayWritableObjectInspector.equals() according to jstack.

      The problem seems to be that equals() calls both fields.equals(that.fields) and fieldsByName.equals(that.fieldsByName), and both try to compare all nested fields recursively, which leads to the O(2^nesting_depth) complexity.

      The commit that introduced this behavior:
      https://github.com/apache/hive/commit/98a25f2d831ab27e174bc99792047eaa8ec08b82#diff-8c6363e90d442f239bc252a104f1bfed

      The Impala test:
      https://github.com/apache/impala/blob/9ee4a5e1940afa47227a92e0f6fba6d4c9909f63/tests/query_test/test_nested_types.py#L612

      Attachments

        1. HIVE-21796.1.patch
          5 kB
          Csaba Ringhofer
        2. HIVE-21796.2.patch
          5 kB
          Csaba Ringhofer
        3. HIVE-21796.3.patch
          5 kB
          Zoltan Matyus
        4. HIVE-21796.4.patch
          5 kB
          Zoltan Matyus
        5. HIVE-21796.5.patch
          5 kB
          Zoltan Matyus
        6. HIVE-21796.patch
          1 kB
          Csaba Ringhofer
        7. jstack.txt
          46 kB
          Csaba Ringhofer

        Issue Links

          Activity

            People

              zmatyus Zoltan Matyus
              csringhofer Csaba Ringhofer
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: