Apache Tez / TEZ-3814

Inserts into a bucketed table fail randomly with Hive on Tez

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The map phase of inserts into a bucketed table fails randomly with the error "Vertex <vertex_id> [Map 1] failed as task <task_id> failed after vertex succeeded.]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0".

      The task fails because every attempt fails with "<attempt_id> being failed for too many output errors. failureFraction=0.2, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0".
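The values in that message hint at which limit was exceeded. The sketch below is an assumption inferred only from the field names in the log line, not a copy of Tez's source: it treats a producer attempt as failed once any one of three limits is crossed. The function name and the consumer count of 5 are hypothetical; 5 is used only because 1 report over 5 consumers reproduces the logged failureFraction of 0.2.

```python
def exceeds_output_failure_limits(
    unique_failed_reports: int,
    consumer_tasks: int,
    read_error_timespan_sec: int,
    max_fraction: float = 0.1,  # MAX_ALLOWED_OUTPUT_FAILURES_FRACTION in the log
    max_failures: int = 10,     # MAX_ALLOWED_OUTPUT_FAILURES in the log
    max_time_sec: int = 300,    # MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC in the log
) -> bool:
    """Return True if any limit is exceeded (assumed semantics, see lead-in)."""
    # Share of downstream consumers that reported a read error for this output.
    failure_fraction = unique_failed_reports / consumer_tasks
    return (
        failure_fraction > max_fraction
        or unique_failed_reports >= max_failures
        or read_error_timespan_sec >= max_time_sec
    )


# Logged case: 1 unique report; 5 consumers is a hypothetical count giving 0.2.
logged_case_fails = exceeds_output_failure_limits(1, 5, 0)
```

Under these assumed semantics, a single failed-output report is enough to fail the attempt whenever the vertex has fewer than ten consumers, because the fraction limit of 0.1 is crossed even though the count and time limits are not.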

      This happens more often if the table is ACID-enabled and a delete operation is performed before the inserts.

      I have tried the following:

      Changed tez.am.launch.cmd-opts, tez.task.launch.cmd-opts and hive.tez.java.opts to use parallel GC.
      tez.runtime.shuffle.max.allowed.failed.fetch.fraction = 0.95
      tez.runtime.shuffle.failed.check.since-last.completion = false
      tez.runtime.shuffle.fetch.buffer.percent = 0.1
      tez.runtime.shuffle.memory.limit.percent = 0.25
      tez.runtime.shuffle.ssl.enable = false
      Deleted ".../usercache/<user>/filecache" and ".../usercache/<user>/appcache"
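For anyone trying to reproduce this, the shuffle overrides listed above could be applied per session rather than cluster-wide. The report does not say how they were set, so this is only a sketch under that assumption: a small Python helper (the names `SHUFFLE_OVERRIDES` and `to_hive_set_statements` are hypothetical) that renders the property values from the list as Hive `SET` statements to paste at the top of a session or script.

```python
# Property names and values are copied from the list above;
# none of them is a recommended setting.
SHUFFLE_OVERRIDES = {
    "tez.runtime.shuffle.max.allowed.failed.fetch.fraction": "0.95",
    "tez.runtime.shuffle.failed.check.since-last.completion": "false",
    "tez.runtime.shuffle.fetch.buffer.percent": "0.1",
    "tez.runtime.shuffle.memory.limit.percent": "0.25",
    "tez.runtime.shuffle.ssl.enable": "false",
}


def to_hive_set_statements(overrides):
    """Render each property as a Hive session-level SET statement."""
    return [f"SET {key}={value};" for key, value in overrides.items()]


if __name__ == "__main__":
    # Print one statement per line, ready to prepend to a Hive script.
    print("\n".join(to_hive_set_statements(SHUFFLE_OVERRIDES)))
```

Keeping the overrides in one place makes it easier to toggle them individually while narrowing down which setting, if any, changes the failure rate.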

      I am using the HDP 2.6 distribution.

            People

            • Assignee: Unassigned
            • Reporter: Anant Mittal (infinitymittal)