Hive / HIVE-7292 Hive on Spark / HIVE-9455

MapJoin task shouldn't start if HashTableSink task failed [Spark Branch]


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: spark-branch
    • Fix Version/s: None
    • Component/s: Spark
    • Labels: None

      Description

      While playing with auto_join25.q, I noticed that even though the hash table sink task failed, HOS (Hive on Spark) still goes on to launch the map join task. This is not the desired behavior. Instead, as in MR, we should abandon the second task.
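
      A minimal sketch of the desired behavior, assuming a simple sequential driver in which each task returns 0 on success and a non-zero code on failure. `NamedTask` and `runChain` are hypothetical names used only for illustration; they are not Hive classes, and this is not how the Hive driver is actually structured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntSupplier;

public class TaskChainSketch {
    /** Hypothetical stand-in for a driver task; not a Hive class. */
    static final class NamedTask {
        final String name;
        final IntSupplier body; // returns 0 on success, non-zero on failure

        NamedTask(String name, IntSupplier body) {
            this.name = name;
            this.body = body;
        }
    }

    /**
     * Runs the tasks in order and stops at the first failure, so a
     * dependent task is never launched after its parent has failed.
     * Returns the names of the tasks that were actually launched.
     */
    static List<String> runChain(List<NamedTask> chain) {
        List<String> launched = new ArrayList<>();
        for (NamedTask task : chain) {
            launched.add(task.name);
            int returnCode = task.body.getAsInt();
            if (returnCode != 0) {
                break; // abandon every remaining (dependent) task
            }
        }
        return launched;
    }

    public static void main(String[] args) {
        List<NamedTask> chain = Arrays.asList(
            new NamedTask("HashTableSink", () -> 1), // simulated failure
            new NamedTask("MapJoin", () -> 0));
        System.out.println(runChain(chain)); // prints [HashTableSink]
    }
}
```

      With the simulated failure in the first task, the map join task is never launched, which is the behavior this issue asks for.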

      Console output:

      Total jobs = 2
      Launching Job 1 out of 2
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
        set mapreduce.job.reduces=<number>
      
      Query Hive on Spark job[0] stages:
      0
      
      Status: Running (Hive on Spark job[0])
      Job Progress Format
      CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
      2015-01-23 16:18:14,604	Stage-0_0: 0/1
      2015-01-23 04:18:14	Processing rows:	4	Hashtable size:	3	Memory usage:	119199408	percentage:	0.25
      2015-01-23 16:18:15,611	Stage-0_0: 0(+0,-1)/1
      Status: Finished successfully in 1.07 seconds
      Launching Job 2 out of 2
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
        set mapreduce.job.reduces=<number>
      2015-01-23 16:22:27,854	Stage-1_0: 0(+0,-1)/1
      Status: Finished successfully in 1.01 seconds
      Loading data to table default.dest1
      Table default.dest1 stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
      OK
      Time taken: 311.979 seconds
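
      The failure is visible in the progress lines above even though the status line reports success. As a rough illustration, the sketch below extracts the failed-task count from a progress line. The regular expression is an assumption derived only from the sample lines in this output (which separate the running and failed counts with a comma), and `StageProgressSketch` and `failedTasks` are hypothetical names, not part of Hive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StageProgressSketch {
    // Matches e.g. "Stage-0_0: 0/1" or "Stage-0_0: 0(+0,-1)/1".
    // Groups: 1 = stage id and attempt, 2 = succeeded, 3 = running (optional),
    //         4 = failed (optional), 5 = total.
    private static final Pattern PROGRESS = Pattern.compile(
        "(Stage-\\d+_\\d+): (\\d+)(?:\\(\\+(\\d+)(?:,-(\\d+))?\\))?/(\\d+)");

    /**
     * Returns the number of failed tasks reported on a progress line,
     * 0 if the line reports none, or -1 if it is not a progress line.
     */
    static int failedTasks(String line) {
        Matcher m = PROGRESS.matcher(line);
        if (!m.find()) {
            return -1;
        }
        String failed = m.group(4);
        return failed == null ? 0 : Integer.parseInt(failed);
    }

    public static void main(String[] args) {
        String line = "2015-01-23 16:18:15,611\tStage-0_0: 0(+0,-1)/1";
        System.out.println(failedTasks(line)); // prints 1
    }
}
```

      A check of this kind on the first job's last progress line would report one failed task, which is the condition under which the second job should not be launched.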
      

            People

            • Assignee: Unassigned
            • Reporter: Chao Sun (csun)
