Hive / HIVE-24734

Sanity check in HiveSplitGenerator available slot calculation

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Tez
    • Labels: None
    • Target Version/s:

      Description

      HiveSplitGenerator calculates the number of available slots from available memory like this:

      if (getContext() != null) {
        totalResource = getContext().getTotalAvailableResource().getMemory();
        taskResource = getContext().getVertexTaskResource().getMemory();
        availableSlots = totalResource / taskResource;
      }
      

      I hit a scenario where the total memory was calculated correctly, but the task memory returned -1. Dividing by -1 makes availableSlots negative, which led to errors like these:

      tez.HiveSplitGenerator: Number of input splits: 1. -3641 available slots, 1.7 waves. Input format is: org.apache.hadoop.hive.ql.io.HiveInputFormat
      
      Estimated number of tasks: -6189 for bucket 1
      
      java.lang.IllegalArgumentException: Illegal Capacity: -6189
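
      The numbers in these messages can be reproduced with a small standalone sketch. It is not Hive code: the total memory of 3641 is inferred from the logged slot count, and the assumption that the task estimate is the slot count multiplied by the wave factor and truncated to an int is a guess that happens to match the log. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SlotArithmeticDemo {

  // Same integer division as in the snippet quoted in the description.
  static int availableSlots(int totalResource, int taskResource) {
    return totalResource / taskResource;
  }

  // Assumption: the task estimate is slots * waves, truncated toward zero.
  static int estimatedTasks(int availableSlots, float waves) {
    return (int) (availableSlots * waves);
  }

  public static void main(String[] args) {
    // Assumed inputs: 3641 total memory, -1 reported as task memory.
    int slots = availableSlots(3641, -1);
    int tasks = estimatedTasks(slots, 1.7f);
    System.out.println(slots + " available slots, " + tasks + " estimated tasks");

    try {
      // A negative initial capacity is rejected by ArrayList.
      List<String> groups = new ArrayList<>(tasks);
      System.out.println("created list of size " + groups.size());
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

      Under these assumptions the negative slot count propagates unchanged into the task estimate, and the exception only surfaces later, when a collection is sized with it.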
      

      Admittedly, this happened during development and will hopefully not occur on a properly configured cluster. (I'm not sure what the issue was on my setup; possibly -Xmx was set higher than physical memory.)

      In any case, it feels like a value of availableSlots < 1 will never lead to the desired behavior, so in such cases we could emit a warning and correct the value to 1.
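
      A minimal sketch of what such a sanity check might look like, written as a standalone helper rather than a patch to HiveSplitGenerator. The class name, method name, and message text are hypothetical, and java.util.logging is used only to keep the example self-contained. The extra guard against a zero task resource goes beyond the proposal above; it is included because the division would otherwise throw an ArithmeticException.

```java
import java.util.logging.Logger;

class SlotSanityCheck {

  private static final Logger LOG = Logger.getLogger(SlotSanityCheck.class.getName());

  // Returns totalResource / taskResource, corrected to 1 when the result
  // would be less than 1 or when the task resource is not positive.
  static int sanitizedSlots(int totalResource, int taskResource) {
    if (taskResource <= 0) {
      // Covers the reported -1 case and avoids division by zero.
      LOG.warning("Invalid task resource " + taskResource + "; using 1 available slot");
      return 1;
    }
    int slots = totalResource / taskResource;
    if (slots < 1) {
      LOG.warning("Computed " + slots + " available slots from total " + totalResource
          + " and task " + taskResource + "; using 1 instead");
      return 1;
    }
    return slots;
  }

  public static void main(String[] args) {
    System.out.println(sanitizedSlots(3641, -1));   // task memory reported as -1
    System.out.println(sanitizedSlots(4096, 1024)); // ordinary case
  }
}
```

      With the slot count corrected to 1, any estimate derived from it stays non-negative, at the cost of possibly under-parallelizing on a misconfigured setup.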

            People

            • Assignee: Unassigned
            • Reporter: Zoltan Matyus (zmatyus)
            • Votes: 0
            • Watchers: 2
