HIVE-22077

Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.1, 2.3.4, 4.0.0
    • Fix Version/s: None
    • Component/s: Hive
    • Labels: None

    Description

      INSERT OVERWRITE into a static partition may not clean the partition's HDFS location if the partition's info is not stored in metadata.
      Steps to reproduce this issue:
      ------------------------------------------------
      1. Create a managed table:
      ------------------------------------------------

       CREATE TABLE `test`(
         `id` string)
       PARTITIONED BY (
         `dayno` string)
       ROW FORMAT SERDE
         'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
       STORED AS INPUTFORMAT
         'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
       OUTPUTFORMAT
         'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
       LOCATION
         'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
       TBLPROPERTIES (
         'transient_lastDdlTime'='1564731656');
      

      ------------------------------------------------
      2. Create the partition's directory directly on HDFS and put some data in it:
      ------------------------------------------------

      hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
      hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
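
      The issue depends on the partition being absent from metadata at this point. A minimal sketch for checking that before moving on, assuming the `hive` CLI is on the PATH and the table lives in the current database; the helper name `partition_listed` is hypothetical and only used for illustration:

```shell
#!/bin/sh
# Table name from step 1 and partition spec from step 2.
TABLE="test"
PARTITION_SPEC="dayno=20190802"

# Hypothetical helper: succeeds if the partition spec appears in the
# given SHOW PARTITIONS output (passed as the first argument).
partition_listed() {
  printf '%s\n' "$1" | grep -Fq "$PARTITION_SPEC"
}

if command -v hive >/dev/null 2>&1; then
  # -S suppresses informational output, -e runs the quoted query.
  listing=$(hive -S -e "SHOW PARTITIONS ${TABLE};")
  if partition_listed "$listing"; then
    echo "${PARTITION_SPEC} is already registered; this reproduction assumes it is not"
  else
    echo "${PARTITION_SPEC} is not registered in metadata; continue with the next step"
  fi
else
  # Assumption: without the CLI, the query has to be run by hand.
  echo "hive CLI not found; run manually: SHOW PARTITIONS ${TABLE};"
fi
```

      Because the directory was created with `hdfs dfs -mkdir` rather than through Hive, the partition is expected to be missing from the listing.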
      

      ------------------------------------------------
      3. Run INSERT OVERWRITE on partition dayno=20190802:
      ------------------------------------------------

      INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
      SELECT "some value";
      

      ------------------------------------------------
      4. The test.data file under the partition directory is not deleted.
      ------------------------------------------------
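
      The result of step 4 can be confirmed by listing the partition directory. A minimal sketch, assuming the `hdfs` CLI is on the PATH; the helper name `has_stale_file` is hypothetical, and the path is the one used in step 2:

```shell
#!/bin/sh
# Partition directory and file name from step 2.
PARTITION_DIR="hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802"
STALE_FILE="test.data"

# Hypothetical helper: succeeds if any line of the given `hdfs dfs -ls`
# output ends with a path whose last component is test.data.
has_stale_file() {
  printf '%s\n' "$1" | grep -q "/${STALE_FILE}\$"
}

if command -v hdfs >/dev/null 2>&1; then
  listing=$(hdfs dfs -ls "$PARTITION_DIR")
  printf '%s\n' "$listing"
  if has_stale_file "$listing"; then
    echo "${STALE_FILE} is still present: the overwrite did not clean the directory"
  else
    echo "${STALE_FILE} is gone: the directory was cleaned"
  fi
else
  # Assumption: without the CLI, the listing has to be run by hand.
  echo "hdfs CLI not found; run manually: hdfs dfs -ls ${PARTITION_DIR}"
fi
```

      When the issue reproduces, the listing shows test.data alongside the file written by the INSERT OVERWRITE.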

      Attachments

        1. HIVE-22077.patch.1
          2 kB
          Hui An

          People

            Assignee: Bone An Hui An
            Reporter: Bone An Hui An