Hive / HIVE-1582

merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      hive> SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
      hive> SET hive.exec.compress.output=false;
      hive> INSERT OVERWRITE DIRECTORY 'xxxxx'
      > SELECT yyyy from a;
      Total MapReduce jobs = 2
      Launching Job 1 out of 2
      Number of reduce tasks is set to 0 since there's no reduce operator
      ......
      Ended Job = job_201008191557_54169
      Ended Job = 450290112, job is filtered out (removed at runtime).
      Launching Job 2 out of 2
      .....

      The second job should not be started.

        Activity

        Ning Zhang added a comment -

        Talked to Namit and Yongqiang; this is not a bug. INSERT OVERWRITE to an (HDFS) directory should be merged as before. INSERT OVERWRITE LOCAL DIRECTORY cannot be merged, but that is not the case here.

        Ning Zhang added a comment -

        @namit, merging happened even before HIVE-1307. There does not seem to be a unit test for this feature (no merge when inserting into a directory). BTW, what's the rationale behind this?

        Namit Jain added a comment -

        @Ning, there should be no merge job for INSERT ... DIRECTORY; previously we only merged when inserting into tables and partitions.

        Ning Zhang added a comment -

        I'm confused. Do you mean the second job should not be started or the second job should not be filtered out? I've tested the behavior before and after HIVE-1307; it is the same in both cases, and the merge job is always fired.

        He Yongqiang added a comment -

        Ended Job = 450290112, job is filtered out (removed at runtime).

        The second job seems to be filtered out at runtime.

        Ning Zhang added a comment -

        Is hive.merge.mapfiles=true? If so the second merge job should be fired. Am I missing something?
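
        For reference, a minimal sketch of how the setting in question could be inspected and turned off for a session before rerunning the statement from the description. It assumes the Hive CLI; 'xxxxx', yyyy and a are the placeholders from the report, and whether disabling the merge is desirable depends on the job.

        ```sql
        -- Print the current value of the map-only merge setting.
        SET hive.merge.mapfiles;

        -- Assumption: turn off merging of map-only output for this session only.
        SET hive.merge.mapfiles=false;

        -- Placeholder statement from the description.
        INSERT OVERWRITE DIRECTORY 'xxxxx'
        SELECT yyyy FROM a;
        ```

        With the setting off, a map-only query like this one would be expected to plan no second (merge) job, which makes it easy to compare against the output shown in the description.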


          People

          • Assignee: Unassigned
          • Reporter: He Yongqiang
          • Votes: 0
          • Watchers: 0
