HIVE-16299

MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.0.0
    • Component/s: Metastore
    • Labels: None

    Description

      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java

      static String getPartitionName(Path tablePath, Path partitionPath, Set<String> partCols)

      ------------------------------------------------------------------------------------

      MSCK REPAIR TABLE validates that each sub-directory under the table location has the form col=val and that "col" is the name of a partition column.
      However, it does not validate the position of each partition column within the path. As a result, bogus partitions are added to the metastore, and directories matching those partitions are created on the file system.
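
The missing check can be illustrated with a minimal sketch. This is not Hive's actual code or the attached patch: the class `PartitionOrderCheck` and the method `matchesPartitionKeyOrder` are hypothetical names, and the sketch assumes the partition keys are supplied as a lower-case `List<String>` in declared order (the `Set<String>` in the signature above carries no order) and that the path is given relative to the table location.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Hive.
final class PartitionOrderCheck {

    // Returns true only when every directory component is of the form col=val
    // and the column names, read left to right, equal the declared partition
    // keys exactly (same names, same count, same order).
    // Assumption: partCols holds lower-case names in declared order.
    static boolean matchesPartitionKeyOrder(String relativePath, List<String> partCols) {
        List<String> names = new ArrayList<>();
        for (String component : relativePath.split("/")) {
            if (component.isEmpty()) {
                continue; // tolerate leading or doubled slashes
            }
            int eq = component.indexOf('=');
            if (eq <= 0) {
                return false; // not in col=val form
            }
            names.add(component.substring(0, eq).toLowerCase(Locale.ROOT));
        }
        return names.equals(partCols);
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c");
        System.out.println(matchesPartitionKeyOrder("a=1/a=2/a=3/b=4/c=5", keys)); // false
        System.out.println(matchesPartitionKeyOrder("c=3/b=2/a=1", keys));         // false
        System.out.println(matchesPartitionKeyOrder("a=1/b=2/c=3", keys));         // true
    }
}
```

Under these assumptions, both directory layouts shown in the examples below are rejected, so a repair step that applied such a check would skip or report them instead of registering a=3/b=4/c=5 or a=1/b=2/c=3.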

      e.g. 1

      hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
      hive> create external table t (i int) partitioned by (a int,b int,c int) ;
      OK
      hive> msck repair table t;
      OK
      Partitions not in metastore: t:a=1/a=2/a=3/b=4/c=5
      Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
      Time taken: 0.563 seconds, Fetched: 2 row(s)
      hive> show partitions t;
      OK
      a=3/b=4/c=5
      hive> dfs -ls -R /user/hive/warehouse/t;
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=1
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2/a=3
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2/a=3/b=4
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
      drwxrwxrwx - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=3
      drwxrwxrwx - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=3/b=4
      drwxrwxrwx - cloudera supergroup 0 2017-03-26 13:07 /user/hive/warehouse/t/a=3/b=4/c=5

      e.g. 2
      hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
      hive> create external table t (i int) partitioned by (a int,b int,c int);
      OK
      hive> msck repair table t;
      OK
      Partitions not in metastore: t:c=3/b=2/a=1
      Repair: Added partition to metastore t:c=3/b=2/a=1
      Time taken: 0.512 seconds, Fetched: 2 row(s)
      hive> show partitions t;
      OK
      a=1/b=2/c=3
      hive> dfs -ls -R /user/hive/warehouse/t;
      drwxrwxrwx - cloudera supergroup 0 2017-03-26 13:13 /user/hive/warehouse/t/a=1
      drwxrwxrwx - cloudera supergroup 0 2017-03-26 13:13 /user/hive/warehouse/t/a=1/b=2
      drwxrwxrwx - cloudera supergroup 0 2017-03-26 13:13 /user/hive/warehouse/t/a=1/b=2/c=3
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:12 /user/hive/warehouse/t/c=3
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:12 /user/hive/warehouse/t/c=3/b=2
      drwxr-xr-x - cloudera supergroup 0 2017-03-26 13:12 /user/hive/warehouse/t/c=3/b=2/a=1

      Attachments

        1. HIVE-16299.01.patch
          9 kB
          Vihang Karajgaonkar
        2. HIVE-16299.02.patch
          22 kB
          Vihang Karajgaonkar
        3. HIVE-16299.03.patch
          25 kB
          Vihang Karajgaonkar
        4. HIVE-16299.04.patch
          25 kB
          Vihang Karajgaonkar

            People

              Assignee: Vihang Karajgaonkar (vihangk1)
              Reporter: Dudu Markovitz (dmarkovitz)