Hive / HIVE-10722

external table creation with msck in Hive can create unusable partition

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.14.1, 1.0.0
    • Fix Version/s: 1.3.0, 2.0.0
    • Component/s: Metastore
    • Labels:

      Description

      Directories in HDFS can have names containing unprintable characters. In hadoop fs -ls output these characters are not even visible; they can only be seen, for example, when the output is piped through od.
      When such directories are loaded via msck, the characters are stored in the backing database (e.g. MySQL) as "?" (a literal question mark, findable via LIKE '%?%' in the db) and show up accordingly in Hive.
      However, DataNucleus appears to encode the character as %3F. This makes the partition unusable: it cannot be dropped, and other operations such as drop table get stuck (I didn't investigate in detail why; drop table got unstuck as soon as the partition was removed from the metastore).
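
As a rough illustration of the point above, the following sketch lists the characters in a directory name that a terminal would not display, much like piping the listing through od would. The function name find_unprintable and the sample directory name are made up for this example; it is not part of Hive or Hadoop.

```python
def find_unprintable(name):
    """Return (index, escaped form) for each character that would not display."""
    return [(i, repr(ch)) for i, ch in enumerate(name) if not ch.isprintable()]


# Hypothetical partition directory name with a trailing carriage return.
dirname = "ds=2015-05-14\r"
hidden = find_unprintable(dirname)

for index, escaped in hidden:
    print(f"unprintable character {escaped} at index {index}")
```

A name made up only of printable characters yields an empty list, so the same check could be run over a directory listing before msck is invoked.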
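
The mismatch described above can be illustrated with plain percent-encoding. This is only a sketch of why the two forms stop matching; it uses Python's urllib and does not reproduce what DataNucleus actually does internally. The partition name is a hypothetical example.

```python
from urllib.parse import quote

# Assumed form of the name as stored in the metastore database.
stored_name = "ds=2015-05-14?"

# Percent-encoded form of the same name ('=' is kept literal for readability).
encoded_name = quote(stored_name, safe="=")

print(encoded_name)

# A lookup that compares the two strings directly no longer matches.
names_match = stored_name == encoded_name
print(names_match)
```

If one layer addresses the partition by the encoded form while the database row holds the literal "?", an exact-match lookup would find nothing, which is consistent with the partition being impossible to drop.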

      We should probably have a two-way option for such cases: error out on load (the default), or convert such characters to '?' or drop them (and end up with a partition that actually works, too).
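
A minimal sketch of what such an option could look like is given here. The function check_partition_name, the policy names "error", "replace" and "drop", and the exception class are all hypothetical; they are not taken from the attached patches or from any Hive configuration.

```python
class UnprintablePartitionNameError(ValueError):
    """Raised when a partition name contains unprintable characters."""


def check_partition_name(name, policy="error"):
    """Validate or sanitize a partition name according to the given policy."""
    if all(ch.isprintable() for ch in name):
        return name
    if policy == "error":
        raise UnprintablePartitionNameError(
            f"partition name contains unprintable characters: {name!r}"
        )
    if policy == "replace":
        # Substitute each unprintable character with a literal '?'.
        return "".join(ch if ch.isprintable() else "?" for ch in name)
    if policy == "drop":
        # Remove unprintable characters entirely.
        return "".join(ch for ch in name if ch.isprintable())
    raise ValueError(f"unknown policy: {policy}")


replaced = check_partition_name("ds=2015-05-14\r", policy="replace")
dropped = check_partition_name("ds=2015-05-14\r", policy="drop")
print(replaced, dropped)
```

Whether the "replace" variant is viable depends on partitions containing '?' working correctly in the first place, so "drop" or "error" may be the safer choices until that is verified.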

      We should also check whether partitions with an explicitly inserted '?' work at all with DataNucleus.

        Attachments

        1. HIVE-10722.patch (22 kB, Sergey Shelukhin)
        2. HIVE-10722.01.patch (24 kB, Sergey Shelukhin)

            People

            • Assignee: Sergey Shelukhin (sershe)
            • Reporter: Sergey Shelukhin (sershe)
            • Votes: 0
            • Watchers: 5
