Hive / HIVE-7292 Hive on Spark / HIVE-9455

MapJoin task shouldn't start if HashTableSink task failed [Spark Branch]

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: spark-branch
    • Fix Version/s: None
    • Component/s: Spark
    • Labels: None

    Description

      While running auto_join25.q, I noticed that even though the hash table sink task failed, Hive on Spark (HOS) still goes on to launch the map join task. This is not the desired behavior. Instead, as MapReduce (MR) does, we should abandon the second task.
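
The desired behavior can be illustrated with a minimal, self-contained sketch. This is not Hive code: `TaskChainSketch`, `Task`, and `runChain` are hypothetical names made up for illustration, and the only assumption is that each task reports a non-zero exit code on failure. Once a task in the chain fails, the tasks that depend on it are skipped rather than launched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical sketch: none of these names are Hive classes.
class TaskChainSketch {

    // A named unit of work that returns an exit code (0 means success).
    static final class Task {
        final String name;
        final IntSupplier body;

        Task(String name, IntSupplier body) {
            this.name = name;
            this.body = body;
        }
    }

    // Runs tasks in order. After the first non-zero exit code, the
    // remaining tasks are recorded as skipped and never executed.
    static List<String> runChain(List<Task> chain) {
        List<String> log = new ArrayList<>();
        boolean failed = false;
        for (Task t : chain) {
            if (failed) {
                log.add(t.name + ": SKIPPED");
                continue;
            }
            int rc = t.body.getAsInt();
            if (rc != 0) {
                failed = true;
                log.add(t.name + ": FAILED(" + rc + ")");
            } else {
                log.add(t.name + ": OK");
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<Task> chain = Arrays.asList(
            new Task("HashTableSink", () -> 1),  // simulated failure
            new Task("MapJoin", () -> 0));       // must not run
        // prints [HashTableSink: FAILED(1), MapJoin: SKIPPED]
        System.out.println(runChain(chain));
    }
}
```

In the sketch the chain is linear; a real task graph would need the same check applied per parent-child edge.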

      Console output:

      Total jobs = 2
      Launching Job 1 out of 2
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
        set mapreduce.job.reduces=<number>
      
      Query Hive on Spark job[0] stages:
      0
      
      Status: Running (Hive on Spark job[0])
      Job Progress Format
      CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
      2015-01-23 16:18:14,604	Stage-0_0: 0/1
      2015-01-23 04:18:14	Processing rows:	4	Hashtable size:	3	Memory usage:	119199408	percentage:	0.25
      2015-01-23 16:18:15,611	Stage-0_0: 0(+0,-1)/1
      Status: Finished successfully in 1.07 seconds
      Launching Job 2 out of 2
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=<number>
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=<number>
      In order to set a constant number of reducers:
        set mapreduce.job.reduces=<number>
      2015-01-23 16:22:27,854	Stage-1_0: 0(+0,-1)/1
      Status: Finished successfully in 1.01 seconds
      Loading data to table default.dest1
      Table default.dest1 stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
      OK
      Time taken: 311.979 seconds
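
The failure is visible in the progress lines above: `Stage-0_0: 0(+0,-1)/1` reports one failed task, yet the job status still reads "Finished successfully". As a rough illustration of how that count could be read back out of the console output, here is a small sketch. `StageProgressSketch` and `failedTasks` are hypothetical names, not part of Hive, and the pattern is inferred only from the lines shown in this log (which place a comma between the running and failed counts, unlike the format header), so it accepts both forms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the pattern is inferred from the log lines above.
class StageProgressSketch {

    // Groups: 1 = stage id, 2 = attempt id, 3 = succeeded,
    // 4 = running (optional), 5 = failed (optional), 6 = total.
    private static final Pattern PROGRESS = Pattern.compile(
        "Stage-(\\d+)_(\\d+): (\\d+)(?:\\(\\+(\\d+),?-(\\d+)\\))?/(\\d+)");

    // Returns the failed-task count from a progress line,
    // 0 if the line carries no running/failed part,
    // or -1 if the line is not a progress line at all.
    static int failedTasks(String line) {
        Matcher m = PROGRESS.matcher(line);
        if (!m.find()) {
            return -1;
        }
        String failed = m.group(5);
        return failed == null ? 0 : Integer.parseInt(failed);
    }

    public static void main(String[] args) {
        String line = "2015-01-23 16:18:15,611\tStage-0_0: 0(+0,-1)/1";
        System.out.println(failedTasks(line)); // prints 1
    }
}
```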
      

          People

            Assignee: Unassigned
            Reporter: Chao Sun (csun)