Hive / HIVE-8719

LoadSemanticAnalyzer ignores previous partition location if inserting into partition that already exists


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

LOAD DATA ... INTO TABLE currently appears to be broken for partitions whose directories do not follow Hive's native naming scheme (key=value), because it ignores any location previously set by an ALTER TABLE ... ADD PARTITION ... LOCATION ... command.

      Here is a simple reproducer:

      echo 1 > /tmp/data1.txt
      hive -e "create external table testpart(id int) partitioned by (date string) location '/tmp/testpart';"
      hive -e "alter table testpart add partition(date='2014-09-16')  location '/tmp/testpart/20140916';"
      hive -e "describe formatted testpart partition(date='2014-09-16') ;" | egrep '/tmp/testpart/(date=.?)?2014-?09-?16' > /tmp/a
      hive -e "load data local inpath '/tmp/data1.txt' into table testpart partition(date='2014-09-16');"
      hive -e "describe formatted testpart partition(date='2014-09-16') ;" | egrep '/tmp/testpart/(date=.?)?2014-?09-?16' > /tmp/b
      diff /tmp/a /tmp/b
      hadoop fs -ls /tmp/testpart/
      

Basically, after the ALTER TABLE ... ADD PARTITION ... LOCATION, the partition location is "/tmp/testpart/20140916". After the LOAD DATA has run, the partition location becomes "/tmp/testpart/date=2014-09-16/". Any data previously present in the original location is then ignored as well.
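The behavior described above can be illustrated with a small sketch. This is not Hive's actual code (the real logic lives in LoadSemanticAnalyzer); it is a hypothetical model of what the fixed resolution should do: look up a pre-registered partition location first, and only fall back to the native key=value directory scheme when none exists. The buggy behavior corresponds to always taking the fallback path.

```python
def resolve_partition_location(table_location, partition_spec, existing_partitions):
    """Return the directory a LOAD DATA should write into (illustrative sketch).

    existing_partitions maps a partition-spec tuple, e.g. (("date", "2014-09-16"),),
    to a location previously registered via ALTER TABLE ... ADD PARTITION ... LOCATION.
    """
    key = tuple(sorted(partition_spec.items()))
    # Fixed behavior: honor the pre-existing partition location if one was set.
    if key in existing_partitions:
        return existing_partitions[key]
    # Fallback: Hive's native naming scheme, one key=value subdirectory per column.
    subdirs = "/".join(f"{k}={v}" for k, v in sorted(partition_spec.items()))
    return f"{table_location}/{subdirs}"

# Mirror the reproducer: the partition was added with a custom location.
existing = {(("date", "2014-09-16"),): "/tmp/testpart/20140916"}

# Fixed: the custom location survives the LOAD DATA.
print(resolve_partition_location("/tmp/testpart", {"date": "2014-09-16"}, existing))
# → /tmp/testpart/20140916

# A partition with no registered location falls back to the native scheme;
# the bug effectively applied this branch unconditionally.
print(resolve_partition_location("/tmp/testpart", {"date": "2014-09-17"}, existing))
# → /tmp/testpart/date=2014-09-17
```

With the bug, the lookup against the existing partition was skipped, so the load always computed the default "date=2014-09-16" path and repointed the partition there.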

      Attachments

        1. HIVE-8719.patch
          2 kB
          Sushanth Sowmyan


People

    Assignee: Sushanth Sowmyan
    Reporter: Sushanth Sowmyan
    Votes: 0
    Watchers: 3
