Hive / HIVE-1975

"insert overwrite directory" Not able to insert data with multi level directory path

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.8.0
    • Component/s: Query Processor
    • Labels:
      None
    • Environment:

      Hadoop 0.20.1, Hive 0.5.0, and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).

    • Hadoop Flags:
      Reviewed

      Description

      The following query fails:

         insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j;
      
      1. HIVE-1975.1.patch
        0.9 kB
        Chinna Rao Lalam
      2. HIVE-1975.2.patch
        2 kB
        Chinna Rao Lalam
      3. HIVE-1975.3.patch
        11 kB
        Chinna Rao Lalam
      4. HIVE-1975.patch
        0.7 kB
        Chinna Rao Lalam

        Activity

        Transition                    Time In Source Status   Execution Times   Last Executer      Last Execution Date
        Patch Available -> Open       22d 5h 22m              2                 Namit Jain         24/Oct/11 18:22
        Open -> Patch Available       239d 16h 53m            3                 Chinna Rao Lalam   28/Oct/11 14:51
        Patch Available -> Resolved   11h                     1                 Namit Jain         29/Oct/11 01:51
        Resolved -> Closed            48d 23h 4m              1                 Carl Steinbach     16/Dec/11 23:56
        Vijay Ratnagiri added a comment -

        Hey Guys,

        I'm using Hive 0.11.0 and I just verified that I'm facing this exact problem.

        I first asked Hive to write to a multilevel path and got: "return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask"

        When I switched to a simple one-level directory, my query succeeded and I could see my data written out.

        Can anyone else corroborate?

        Thanks!

        Carl Steinbach made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Carl Steinbach made changes -
        Fix Version/s 0.9.0 [ 12317742 ]
        Carl Steinbach made changes -
        Fix Version/s 0.8.0 [ 12316178 ]
        Carl Steinbach made changes -
        Fix Version/s 0.9.0 [ 12317742 ]
        Hudson added a comment -

        Integrated in Hive-trunk-h0.21 #1042 (See https://builds.apache.org/job/Hive-trunk-h0.21/1042/)
        HIVE-1975 "insert overwrite directory" Not able to insert data with multi level
        directory path (Chinna Rao Lalam via namit)

        namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1190719
        Files :

        • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
        • /hive/trunk/conf/hive-default.xml
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
        • /hive/trunk/ql/src/test/queries/clientpositive/input45.q
        • /hive/trunk/ql/src/test/results/clientpositive/input45.q.out
        Namit Jain made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Resolution Fixed [ 1 ]
        Namit Jain added a comment -

        Committed. Thanks Chinna

        Namit Jain added a comment -

        +1

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1829/
        -----------------------------------------------------------

        (Updated 2011-10-28 13:53:07.086932)

        Review request for hive and Yongqiang He.

        Changes
        -------

        Addressed Namit & He Yongqiang comments

        Summary
        -------

        Inserting into a multilevel directory such as '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j) fails because "fs.rename(sourcePath, targetPath)" cannot rename to a multilevel directory, so the patch creates the target path first.

        This addresses bug HIVE-1975.
        https://issues.apache.org/jira/browse/HIVE-1975

        Diffs (updated)


        trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1190202
        trunk/conf/hive-default.xml 1190202
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1190202
        trunk/ql/src/test/queries/clientpositive/input45.q PRE-CREATION
        trunk/ql/src/test/results/clientpositive/input45.q.out PRE-CREATION

        Diff: https://reviews.apache.org/r/1829/diff

        Testing
        -------

        Ran all testcases

        Thanks,

        chinna

        Chinna Rao Lalam added a comment -

        Addressed Namit & He Yongqiang comments

        Chinna Rao Lalam made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chinna Rao Lalam made changes -
        Attachment HIVE-1975.3.patch [ 12501295 ]
        He Yongqiang added a comment -

        It would be good to add a conf for this and disable it by default.
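
A minimal sketch of what an opt-in setting that is off by default could look like. It uses plain java.util.Properties rather than Hive's HiveConf, and the property name is hypothetical, chosen only for illustration; the name actually used by the patch is not stated in this thread.

```java
import java.util.Properties;

final class MultiLevelDirsFlagSketch {

    // Hypothetical property name, used only for illustration.
    static final String KEY = "example.insert.into.multilevel.dirs";

    /** Returns true only if the property is explicitly set to "true"; unset means disabled. */
    static boolean isEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "false"));
    }
}
```

With this shape, existing users see no change in behavior unless they set the property themselves.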

        Namit Jain made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Namit Jain added a comment -

        Some minor comments:

        1. Can you add a testcase? Use the build directory as a temporary location for the data.
        2. Add some comments explaining that the multi-level directory move is not atomic.
        In your example, it is possible that /x/y/z/1/2/3 is created but /x/y/z/1/2/4 is not.
        This is the same as any multi-partition insert (dynamic insert), but it would be good to
        call it out explicitly.
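
The non-atomicity described above can be illustrated with a small sketch. It uses java.nio.file on a local filesystem as a stand-in for Hadoop's FileSystem, and the class and method names are hypothetical; it is not the code in MoveTask. Directories are moved one at a time, and a failure partway through leaves the earlier moves in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class NonAtomicMoveSketch {

    /**
     * Moves each source into targetDir in order and stops at the first failure.
     * Returns the targets that were moved successfully before that point.
     */
    static List<Path> moveChildren(List<Path> sources, Path targetDir) {
        List<Path> moved = new ArrayList<>();
        for (Path source : sources) {
            Path target = targetDir.resolve(source.getFileName());
            try {
                Files.createDirectories(targetDir);
                Files.move(source, target);
                moved.add(target);
            } catch (IOException e) {
                // Earlier moves are not rolled back, so targetDir is left partially populated.
                break;
            }
        }
        return moved;
    }
}
```

If the first source moves and the second fails, the target directory ends up containing only the first, which is the partial state the review comment asks to have documented.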

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1829/
        -----------------------------------------------------------

        (Updated 2011-10-21 17:09:08.230077)

        Review request for hive and Yongqiang He.

        Changes
        -------

        Reworked based on the review comments

        Summary
        -------

        Inserting into a multilevel directory such as '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j) fails because "fs.rename(sourcePath, targetPath)" cannot rename to a multilevel directory, so the patch creates the target path first.

        This addresses bug HIVE-1975.
        https://issues.apache.org/jira/browse/HIVE-1975

        Diffs (updated)


        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1186875

        Diff: https://reviews.apache.org/r/1829/diff

        Testing
        -------

        Ran all testcases

        Thanks,

        chinna

        Chinna Rao Lalam made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chinna Rao Lalam made changes -
        Attachment HIVE-1975.2.patch [ 12500209 ]
        Chinna Rao Lalam made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1829/
        -----------------------------------------------------------

        Review request for hive and Yongqiang He.

        Summary
        -------

        Inserting into a multilevel directory such as '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j) fails because "fs.rename(sourcePath, targetPath)" cannot rename to a multilevel directory, so the patch creates the target path first.

        This addresses bug HIVE-1975.
        https://issues.apache.org/jira/browse/HIVE-1975

        Diffs


        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1170044

        Diff: https://reviews.apache.org/r/1829/diff

        Testing
        -------

        Ran all testcases

        Thanks,

        chinna

        Chinna Rao Lalam made changes -
        Attachment HIVE-1975.1.patch [ 12494276 ]
        Chinna Rao Lalam added a comment -

        The use case is inserting into a multilevel directory such as '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j). This fails because "fs.rename(sourcePath, targetPath)" cannot rename to a multilevel directory, so the patch creates the target path first.

        >> But if users misspell the dir name, they may not want the dirs created. Or worse, the data gets loaded to some other place without their noticing.

        This can happen in the normal scenario as well: whatever location the user specifies, the data is moved to that location.

        But as you said, if the rename operation fails, the created directory should be deleted. I will update the patch to handle this case.
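
The approach described in this comment (create the missing parent directories, rename, and remove the created directories again if the rename fails) can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: it uses java.nio.file on a local filesystem as a stand-in for Hadoop's FileSystem, and the class name, method name, and createParents flag are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MultiLevelMoveSketch {

    /**
     * Moves source to target. When createParents is true, missing parent
     * directories of target are created first; if the move then fails, the
     * directories created by this call are deleted again.
     */
    static void move(Path source, Path target, boolean createParents) throws IOException {
        Path parent = target.getParent();
        Path topmostCreated = null;
        if (createParents && parent != null && !Files.exists(parent)) {
            // Find the topmost missing ancestor so cleanup knows where to stop.
            topmostCreated = parent;
            while (topmostCreated.getParent() != null
                    && !Files.exists(topmostCreated.getParent())) {
                topmostCreated = topmostCreated.getParent();
            }
            Files.createDirectories(parent);
        }
        try {
            Files.move(source, target);
        } catch (IOException e) {
            if (topmostCreated != null) {
                // Delete the (still empty) directories bottom-up, stopping at the topmost one created.
                for (Path p = parent; p != null; p = p.getParent()) {
                    Files.deleteIfExists(p);
                    if (p.equals(topmostCreated)) {
                        break;
                    }
                }
            }
            throw e;
        }
    }
}
```

Calling move(staging, target, false) with a target whose parent does not exist fails, which mirrors the reported symptom; with createParents set to true the same call succeeds. The cleanup step addresses the misspelled-path concern only partly: it removes directories after a failed move, not after a successful move to an unintended location.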

        He Yongqiang added a comment -

        What's the use case here? The user can always create the parent dir first. But if users misspell the dir name, they may not want the dirs created. Or worse, the data gets loaded to some other place without their noticing.

        Chinna Rao Lalam made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chinna Rao Lalam made changes -
        Field Original Value New Value
        Attachment HIVE-1975.patch [ 12491759 ]
        Edward Capriolo added a comment -

        Are you asking whether Hive can export a partitioned table to a flat directory? Please include more details in your report, such as the output of describe extended dept_j;

        Ovidiu added a comment -

        Any update on this particular issue?

        I am trying to do the same using S3 storage, and it seems Hive can only create one new directory, not a full directory structure. Is this a limitation?

        Chinna Rao Lalam created issue -

          People

          • Assignee:
            Chinna Rao Lalam
            Reporter:
            Chinna Rao Lalam
          • Votes:
            0
            Watchers:
            3
