Hive / HIVE-22077

INSERT OVERWRITE on a static partition does not clean the partition directory when the partition is not registered in the metastore

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.1, 2.3.4, 4.0.0
    • Fix Version/s: None
    • Component/s: Hive
    • Labels:
      None

      Description

      INSERT OVERWRITE into a static partition may not clean the partition's HDFS location if the partition is not registered in the Hive metastore.
      Steps to reproduce this issue:
      ------------------------------------------------
      1. Create a managed table:
      ------------------------------------------------

       CREATE TABLE `test`(
         `id` string)
       PARTITIONED BY (
         `dayno` string)
       ROW FORMAT SERDE
         'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
       STORED AS INPUTFORMAT
         'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
       OUTPUTFORMAT
         'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
       LOCATION
         'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
       TBLPROPERTIES (
         'transient_lastDdlTime'='1564731656');
      

      ------------------------------------------------
      2. Create the partition's directory directly on HDFS, without registering the partition in the metastore, and put some data in it
      ------------------------------------------------

      hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
      hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
      

      ------------------------------------------------
      3. Run INSERT OVERWRITE on partition dayno=20190802
      ------------------------------------------------

      INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
      SELECT "some value";
      

      ------------------------------------------------
      4. The test.data file under the partition directory is not deleted.
      ------------------------------------------------
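
The check in step 4 can be scripted. The following is a minimal sketch, not part of the attached patch, and it assumes the `hdfs` and `hive` command-line clients are on the PATH. The function names are hypothetical helpers introduced only for illustration. Note that `show_partitions` is only informative before step 3, because the INSERT OVERWRITE itself registers the partition.

```shell
#!/bin/sh
# Sketch: confirm that the stale file survives the INSERT OVERWRITE.
# Assumes the `hdfs` and `hive` CLIs are on the PATH. The function names
# below are hypothetical and are not part of Hive or Hadoop.

PARTITION_DIR='hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802'

# List the files currently under the given partition directory.
list_partition_files() {
  hdfs dfs -ls "$1"
}

# Exit status 0 if a file with the given name appears in the listing.
has_stale_file() {
  list_partition_files "$1" | grep -q "/$2\$"
}

# Print the partitions the metastore knows about for the given table.
# Run this before step 3 to confirm the partition is not registered.
show_partitions() {
  hive -e "SHOW PARTITIONS $1;"
}

# Example usage (commented out so that sourcing this file has no side effects):
# show_partitions test
# has_stale_file "$PARTITION_DIR" test.data && echo "test.data was not removed"
```

If `has_stale_file` succeeds after step 3, the partition directory was not cleaned before the new data was written.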

        Attachments

        1. HIVE-22077.patch.1
          2 kB
          Hui An

            People

            • Assignee:
              Bone An Hui An
            • Reporter:
              Bone An Hui An
            • Votes:
              0
            • Watchers:
              7

              Dates

              • Created:
                Updated: