Hive / HIVE-1975

"insert overwrite directory" Not able to insert data with multi level directory path

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.8.0
    • Component/s: Query Processor
    • Labels:
      None
    • Environment:

      Hadoop 0.20.1, Hive 0.5.0, and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).

    • Hadoop Flags:
      Reviewed

      Description

      Execution of the query below fails.

      Example:

         insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j;
      
      1. HIVE-1975.patch
        0.7 kB
        Chinna Rao Lalam
      2. HIVE-1975.3.patch
        11 kB
        Chinna Rao Lalam
      3. HIVE-1975.2.patch
        2 kB
        Chinna Rao Lalam
      4. HIVE-1975.1.patch
        0.9 kB
        Chinna Rao Lalam

        Activity

        Ovidiu added a comment -

        Any update on this particular issue?

        I am trying to do the same but using S3 storage, and it seems Hive can only create one new directory, not a full directory structure. Is this a limitation?

        Edward Capriolo added a comment -

        Are you asking whether Hive can export a partitioned table to a flat directory? Please include more details in your report, such as the output of describe extended dept_j;

        He Yongqiang added a comment -

        What's the use case here? The user can always create the parent directory first. But if users misspell the directory name, they may not want the directories created. Or worse, the data gets loaded to some other place without their noticing.

        Chinna Rao Lalam added a comment -

        The use case is inserting into a multilevel directory like '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j). This fails because "fs.rename(sourcePath, targetPath)" cannot rename into multilevel directories, so the patch first creates the target path.

        >> But users misspell the dir name, they may not want the dirs created. Or worse, the data got loaded to some other place they not noticed.

        This can happen in the normal scenario as well: whatever location the user specifies, the data is moved to that location.

        But as you said, if the rename operation fails, the created directory should be deleted. I will update the patch to handle this case.
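
        The approach described in this comment (create the missing target path before the rename, and remove what was created if the rename fails) can be sketched as follows. This is a minimal local-filesystem analogue written against java.nio.file, not Hadoop's FileSystem API and not the code from the attached patches; the class and method names are hypothetical. Note that java.nio.file signals a failed move with an exception, which may differ from how fs.rename reports failure.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not taken from MoveTask.java.
final class MultiLevelMove {

    /**
     * Moves source to target, creating target's missing parent directories
     * first. If the move fails, deletes only the directories created here.
     */
    static void moveCreatingParents(Path source, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();

        // Remember the topmost ancestor that does not exist yet, so cleanup
        // never touches directories that were already there.
        Path firstCreated = null;
        for (Path p = parent; p != null && !Files.exists(p); p = p.getParent()) {
            firstCreated = p;
        }
        if (parent != null) {
            Files.createDirectories(parent);
        }

        try {
            Files.move(source, target);
        } catch (IOException e) {
            // Roll back: remove the (still empty) directories we created,
            // deepest first, stopping above firstCreated.
            if (firstCreated != null) {
                for (Path p = parent; p != null && p.startsWith(firstCreated); p = p.getParent()) {
                    Files.deleteIfExists(p);
                }
            }
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("hive1975-demo");
        Path staging = Files.createDirectory(base.resolve("staging"));
        Path target = base.resolve("HIVEFT25686").resolve("chinna");
        moveCreatingParents(staging, target);
        System.out.println("moved to " + target + ": " + Files.isDirectory(target));
    }
}
```

        Tracking the topmost newly created ancestor addresses the concern raised earlier in the thread: a failed move leaves no stray directories behind, while pre-existing directories are left alone.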

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1829/
        -----------------------------------------------------------

        Review request for hive and Yongqiang He.

        Summary
        -------

        Inserting into a multilevel directory like '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j) fails because "fs.rename(sourcePath, targetPath)" cannot rename into multilevel directories, so the patch first creates the target path.

        This addresses bug HIVE-1975.
        https://issues.apache.org/jira/browse/HIVE-1975

        Diffs


        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1170044

        Diff: https://reviews.apache.org/r/1829/diff

        Testing
        -------

        Ran all testcases

        Thanks,

        chinna

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1829/
        -----------------------------------------------------------

        (Updated 2011-10-21 17:09:08.230077)

        Review request for hive and Yongqiang He.

        Changes
        -------

        Reworked based on the review comments

        Summary
        -------

        Inserting into a multilevel directory like '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j) fails because "fs.rename(sourcePath, targetPath)" cannot rename into multilevel directories, so the patch first creates the target path.

        This addresses bug HIVE-1975.
        https://issues.apache.org/jira/browse/HIVE-1975

        Diffs (updated)


        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1186875

        Diff: https://reviews.apache.org/r/1829/diff

        Testing
        -------

        Ran all testcases

        Thanks,

        chinna

        Namit Jain added a comment -

        Some minor comments:

        1. Can you add a test case, using build as a temporary directory to add data?
        2. Add some comments explaining that the multi-level directory move is not atomic.
        In your example, it is possible that /x/y/z/1/2/3 is created but /x/y/z/1/2/4 is not.
        This is the same as any multi-partition insert (dynamic insert), but it would be good to
        call it out explicitly.
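
        The non-atomicity noted in point 2 can be illustrated with a small sketch. It is a local-filesystem analogue using java.nio.file, not Hadoop's FileSystem API and not Hive's actual move logic; the class and method names are hypothetical. Each child directory is moved by its own filesystem call, so a failure partway through leaves the earlier children in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration of a non-atomic multi-directory move.
final class NonAtomicMove {

    /**
     * Moves every child of stagingDir into targetDir, one move per child.
     * Throws on the first failure and does NOT undo the moves already done.
     */
    static void moveChildren(Path stagingDir, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);

        List<Path> children;
        try (Stream<Path> listing = Files.list(stagingDir)) {
            // Sort so the order of moves is deterministic.
            children = listing.sorted().collect(Collectors.toList());
        }

        for (Path child : children) {
            // Fails if the destination already exists; children moved in
            // earlier iterations stay under targetDir.
            Files.move(child, targetDir.resolve(child.getFileName()));
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("hive1975-nonatomic");
        Path staging = Files.createDirectories(base.resolve("staging"));
        Files.createDirectory(staging.resolve("3"));
        Files.createDirectory(staging.resolve("4"));
        Path target = base.resolve("x").resolve("y").resolve("z");
        moveChildren(staging, target);
        System.out.println("moved 3: " + Files.isDirectory(target.resolve("3")));
        System.out.println("moved 4: " + Files.isDirectory(target.resolve("4")));
    }
}
```

        If the move of "4" fails (for example because the destination already exists), "3" has already been moved and remains visible to readers of the target directory.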

        He Yongqiang added a comment -

        It would be good to add a conf for this and disable it by default.
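
        A configuration gate of the kind suggested here might look like the sketch below. It is a local-filesystem analogue using java.nio.file and java.util.Properties, not HiveConf or MoveTask. The thread does not name the property, so the key used here is an assumption, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical illustration of an opt-in flag, disabled by default.
final class GatedMove {

    // Assumed key name; the thread itself does not state it.
    static final String MULTILEVEL_KEY = "hive.insert.into.multilevel.dirs";

    /**
     * Moves source to target. Missing parent directories are created only
     * when the flag is explicitly set to true; otherwise the call fails
     * before touching the filesystem.
     */
    static void move(Path source, Path target, Properties conf) throws IOException {
        boolean allowMultiLevel =
                Boolean.parseBoolean(conf.getProperty(MULTILEVEL_KEY, "false"));

        Path parent = target.toAbsolutePath().getParent();
        if (parent != null && !Files.exists(parent)) {
            if (!allowMultiLevel) {
                throw new IOException("Parent directory " + parent
                        + " does not exist; set " + MULTILEVEL_KEY + "=true to create it");
            }
            Files.createDirectories(parent);
        }
        Files.move(source, target);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("hive1975-conf");
        Path staging = Files.createDirectory(base.resolve("staging"));
        Properties conf = new Properties();
        conf.setProperty(MULTILEVEL_KEY, "true");
        Path target = base.resolve("HIVEFT25686").resolve("chinna");
        move(staging, target, conf);
        System.out.println("moved: " + Files.isDirectory(target));
    }
}
```

        Defaulting to false keeps the earlier behavior, so a misspelled path still fails instead of silently creating a new directory tree.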

        Chinna Rao Lalam added a comment -

        Addressed Namit's and He Yongqiang's comments.

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1829/
        -----------------------------------------------------------

        (Updated 2011-10-28 13:53:07.086932)

        Review request for hive and Yongqiang He.

        Changes
        -------

        Addressed Namit's and He Yongqiang's comments

        Summary
        -------

        Inserting into a multilevel directory like '/HIVEFT25686/chinna/'
        (insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j) fails because "fs.rename(sourcePath, targetPath)" cannot rename into multilevel directories, so the patch first creates the target path.

        This addresses bug HIVE-1975.
        https://issues.apache.org/jira/browse/HIVE-1975

        Diffs (updated)


        trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1190202
        trunk/conf/hive-default.xml 1190202
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1190202
        trunk/ql/src/test/queries/clientpositive/input45.q PRE-CREATION
        trunk/ql/src/test/results/clientpositive/input45.q.out PRE-CREATION

        Diff: https://reviews.apache.org/r/1829/diff

        Testing
        -------

        Ran all testcases

        Thanks,

        chinna

        Namit Jain added a comment -

        +1

        Namit Jain added a comment -

        Committed. Thanks Chinna

        Hudson added a comment -

        Integrated in Hive-trunk-h0.21 #1042 (See https://builds.apache.org/job/Hive-trunk-h0.21/1042/)
        HIVE-1975 "insert overwrite directory" Not able to insert data with multi level
        directory path (Chinna Rao Lalam via namit)

        namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1190719
        Files :

        • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
        • /hive/trunk/conf/hive-default.xml
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
        • /hive/trunk/ql/src/test/queries/clientpositive/input45.q
        • /hive/trunk/ql/src/test/results/clientpositive/input45.q.out
        Vijay Ratnagiri added a comment -

        Hey Guys,

        I'm using Hive 0.11.0 and I just verified that I'm facing this exact problem.

        I first tried to ask Hive to create a multilevel path and I got: "return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask"

        When I switched to a simple one-level directory, my query succeeded and I could see my data written out.

        Can anyone else corroborate?

        Thanks!


          People

          • Assignee:
            Chinna Rao Lalam
            Reporter:
            Chinna Rao Lalam
          • Votes:
            0
            Watchers:
            3
