HIVE-4620

MR temp directory conflicts in case of parallel execution mode

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: Query Processor
    • Labels: None

      Description

      In parallel query execution mode, all tasks running in parallel end up sharing the same temp/scratch directory. This can lead to file conflicts and to temp files being deleted before the job completes.
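      The general idea behind a fix can be sketched as follows. This is a minimal illustration, not the committed patch: the class `ScratchDirs`, the methods `nextRunnerId` and `scratchDirFor`, the base path, and the `-<id>` suffix format are all hypothetical names chosen for the example. It assumes each parallel task runner is given a distinct numeric ID and derives its own scratch directory from the shared base path.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; these names do not come from the Hive code base.
final class ScratchDirs {
    // Shared counter that hands out one ID per task runner (assumed scheme).
    private static final AtomicLong RUNNER_COUNTER = new AtomicLong(0);

    // Returns the next unused runner ID; safe to call from several threads.
    static long nextRunnerId() {
        return RUNNER_COUNTER.incrementAndGet();
    }

    // Appends the runner ID to the base path so that two runners never
    // write to, or clean up, the same directory.
    static String scratchDirFor(String baseDir, long runnerId) {
        return baseDir + "-" + runnerId;
    }

    public static void main(String[] args) {
        String base = "/tmp/hive-scratch"; // assumed base path, for illustration only
        System.out.println(scratchDirFor(base, nextRunnerId()));
        System.out.println(scratchDirFor(base, nextRunnerId()));
    }
}
```

      With distinct directories per runner, one task finishing and cleaning up its scratch space cannot remove files that another still-running task depends on.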

      1. HIVE-4620-3.patch
        3 kB
        Prasad Mujumdar
      2. HIVE-4620-2.patch
        3 kB
        Prasad Mujumdar
      3. HIVE-4620-1.patch
        3 kB
        Prasad Mujumdar

        Activity

        Prasad Mujumdar added a comment -

        Review request on https://reviews.apache.org/r/11464/

        Navis added a comment -

        Looks good to me. Running tests.

        Navis added a comment -

        One test is failing, but it does not seem related to this patch. I'll check on that first.

        Navis added a comment -

        The return value of TaskRunner#getTaskID() is not a task ID but a task runner ID, which can be a little confusing. Could you address that? Thanks.

        Prasad Mujumdar added a comment -

        Updated patch per review comments.

        Navis added a comment -

        It would be good to make a phabricator or review-board entry.

        In TaskRunner.run():

            taskRunnerID.set(taskCounter.incrementAndGet());

        Is this necessary? If it is, should it be called in runSequential() rather than run()?
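        The difference between the two placements can be illustrated with a small sketch. It is not the Hive `TaskRunner` class: `SketchRunner`, `observedId`, and the two static helper methods are hypothetical, and only the names `taskRunnerID` and `taskCounter` are taken from the snippet quoted above. The sketch assumes that parallel mode starts the runner as a thread (so `run()` executes), while sequential mode calls `runSequential()` directly on the caller's thread.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for a task runner; this is not the Hive TaskRunner class.
class SketchRunner extends Thread {
    // Shared counter, incremented once per runner thread that executes run().
    private static final AtomicLong taskCounter = new AtomicLong(0);
    // Per-thread ID; a thread that never calls set() reads the initial value 0.
    private static final ThreadLocal<Long> taskRunnerID = ThreadLocal.withInitial(() -> 0L);

    // ID visible to the task body, recorded so the caller can inspect it.
    volatile long observedId = -1;

    @Override
    public void run() {
        // Parallel path: executes on this runner's own thread.
        taskRunnerID.set(taskCounter.incrementAndGet());
        runSequential();
    }

    public void runSequential() {
        // Sequential path (assumed): invoked directly on the caller's thread.
        observedId = taskRunnerID.get();
    }

    // Starts a runner on its own thread and returns the ID it observed.
    static long idSeenInParallelMode() {
        SketchRunner r = new SketchRunner();
        r.start();
        try {
            r.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r.observedId;
    }

    // Calls runSequential() on the current thread and returns the ID it observed.
    static long idSeenInSequentialMode() {
        SketchRunner r = new SketchRunner();
        r.runSequential();
        return r.observedId;
    }

    public static void main(String[] args) {
        System.out.println("parallel IDs: " + idSeenInParallelMode() + ", " + idSeenInParallelMode());
        System.out.println("sequential ID: " + idSeenInSequentialMode());
    }
}
```

        Under these assumptions, assigning the ID in `run()` gives each parallel runner its own value and leaves sequential execution on the default. Moving the assignment into `runSequential()` would hand out a fresh ID in both modes. Which behaviour is wanted depends on how the ID is consumed, which this thread does not spell out.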

        Prasad Mujumdar added a comment -

        Navis, thanks for the comments.
        The original review request on https://reviews.apache.org/r/11464/ is updated with the new patch.

        Navis added a comment -

        +1, running test.

        Navis added a comment -

        Committed to trunk, thanks Prasad!

        Prasad Mujumdar added a comment -

        Thanks Navis!

        Hudson added a comment -

        Integrated in Hive-trunk-h0.21 #2127 (See https://builds.apache.org/job/Hive-trunk-h0.21/2127/)
        HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis) (Revision 1489226)

        Result = FAILURE
        navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489226
        Files :

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java
        Hudson added a comment -

        Integrated in Hive-trunk-hadoop2 #225 (See https://builds.apache.org/job/Hive-trunk-hadoop2/225/)
        HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis) (Revision 1489226)

        Result = ABORTED
        navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489226
        Files :

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java
        Ashutosh Chauhan added a comment -

        This issue has been fixed and released as part of the 0.12 release. If you find further issues, please create a new JIRA and link it to this one.


          People

          • Assignee: Prasad Mujumdar
          • Reporter: Prasad Mujumdar
          • Votes: 0
          • Watchers: 3
