  1. Spark
  2. SPARK-43221

Executor obtained error information

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1, 3.2.0, 3.3.0
    • Fix Version/s: None
    • Component/s: Block Manager

    Description

      Spark on Yarn Cluster

      The problem occurs when multiple executors run on the same node and the same block is held by more than one of them, in memory on some and on disk on others.

      Intermittently, an executor fails to read the block and throws:

      java.lang.ArrayIndexOutOfBoundsException: 0

          at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBlocks$1(TorrentBroadcast.scala:183)

       

      The sequence of events that triggers the problem is as follows:

      step 1:

      The executor asks the driver for the block information (locationsAndStatusOption). The request parameters are the BlockId and the host of the executor's own node. Note that the request does not carry port information.

      line:1092

      step 2:

      On the driver side, the driver looks up all BlockManagers that hold the block, based on the BlockId. For non-remote-shuffle scenarios, the driver takes the block status from the first BlockManager in the locations that has the blockId.

      Suppose two BlockManagers on this node hold the BlockId: BM-1 holds the block in memory, and BM-2 holds the block on disk.

      Suppose the returned status is the in-memory one, so its diskSize is 0.

      line: 852, 856

      step 3:

      This method returns a BlockLocationsAndStatus object. If any BlockManager stores the block on disk, that disk's path information is stored in localDirs.
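
Steps 2 and 3 can be illustrated with a small model. This is a minimal sketch, not Spark's actual code: the case classes `BmId`, `Status`, and `LocationsAndStatus`, the object `DriverLookupSketch`, and the host name, ports, and directory are all hypothetical stand-ins chosen for illustration. It assumes only what the description states: the status comes from the first location, while localDirs comes from a same-host location that uses disk.

```scala
// Illustrative stand-ins only; these are not Spark's real types.
final case class BmId(executorId: String, host: String, port: Int)
final case class Status(useMemory: Boolean, useDisk: Boolean, memSize: Long, diskSize: Long)
final case class LocationsAndStatus(
    locations: Seq[BmId],
    status: Status,
    localDirs: Option[Seq[String]])

object DriverLookupSketch {
  def getLocationsAndStatus(
      locations: Seq[BmId],
      statusOf: Map[BmId, Status],
      dirsOf: Map[String, Seq[String]],
      requesterHost: String): Option[LocationsAndStatus] = {
    // Step 2: the status is taken from the first location only.
    val status = locations.headOption.flatMap(statusOf.get)
    status.map { st =>
      // Step 3: localDirs is taken from any same-host location that uses disk,
      // which need not be the location the status came from.
      val localDirs = locations
        .find(loc => loc.host == requesterHost && statusOf.get(loc).exists(_.useDisk))
        .flatMap(loc => dirsOf.get(loc.executorId))
      LocationsAndStatus(locations, st, localDirs)
    }
  }

  def main(args: Array[String]): Unit = {
    val bm1 = BmId("1", "node-a", 40001) // BM-1: block in memory
    val bm2 = BmId("2", "node-a", 40002) // BM-2: block on disk
    val statuses = Map(
      bm1 -> Status(useMemory = true, useDisk = false, memSize = 4096L, diskSize = 0L),
      bm2 -> Status(useMemory = false, useDisk = true, memSize = 0L, diskSize = 4096L))
    val dirs = Map("2" -> Seq("/tmp/bm-2"))
    // The result pairs BM-1's status (diskSize = 0) with BM-2's local dirs.
    println(getLocationsAndStatus(Seq(bm1, bm2), statuses, dirs, "node-a"))
  }
}
```

In this model the mismatch depends on ordering: if BM-2 were first in the locations, the status and localDirs would agree, which is consistent with the failure being intermittent.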

      step 4:

      When the executor receives locationsAndStatusOption, localDirs is non-empty but status.diskSize is 0, because the two values came from different BlockManagers.

      line: 1102

      step 5:

      readDiskBlockFromSameHostExecutor only checks whether the block file exists, and then reads the byte array using the block size passed in. If that size is 0, it returns an empty byte array.

      line: 1234, 1240

      The caller then takes a value from the empty array, which causes the out-of-bounds exception.
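
Steps 4 and 5 can be reproduced in isolation with the sketch below. It is a simplified model under stated assumptions, not the Spark implementation: `SameHostReadSketch`, `readBlockFile`, and `firstByte` are hypothetical names, and the temp file stands in for a block file written by another executor on the same host.

```scala
import java.io.File
import java.nio.file.Files

object SameHostReadSketch {
  // Hypothetical stand-in for the read in step 5: it checks only that the
  // file exists, then keeps the first `blockSize` bytes requested by the caller.
  def readBlockFile(file: File, blockSize: Long): Option[Array[Byte]] = {
    if (file.exists()) {
      val all = Files.readAllBytes(file.toPath)
      Some(all.take(blockSize.toInt))
    } else {
      None
    }
  }

  // Stand-in for a caller that assumes at least one element is present.
  def firstByte(bytes: Array[Byte]): Byte = bytes(0)

  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("block", ".bin")
    file.deleteOnExit()
    Files.write(file.toPath, Array[Byte](1, 2, 3, 4))

    // diskSize = 0 comes from the in-memory status, although the file has data.
    val data = readBlockFile(file, blockSize = 0L).get
    println(s"bytes read: ${data.length}")

    try {
      firstByte(data)
    } catch {
      case e: ArrayIndexOutOfBoundsException => println(s"caught: $e")
    }
  }
}
```

The file exists and holds data, so the existence check passes. The read still yields nothing because the size was taken from a status that describes a different copy of the block.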

      Attachments

        1. image-2023-04-21-00-19-58-021.png (64 kB, Qiang Yang)
        2. image-2023-04-21-00-24-22-059.png (62 kB, Qiang Yang)
        3. image-2023-04-21-00-30-41-851.png (74 kB, Qiang Yang)
        4. image-2023-04-21-00-50-10-918.png (99 kB, Qiang Yang)
        5. image-2023-04-21-00-53-20-720.png (101 kB, Qiang Yang)
        6. image-2023-04-21-00-54-11-968.png (128 kB, Qiang Yang)
        7. image-2023-04-21-00-57-29-140.png (129 kB, Qiang Yang)

          People

            Assignee: Unassigned
            Reporter: Qiang Yang (yorksity)
            Votes: 0
            Watchers: 2

              Time Tracking

                Estimated: 24h
                Remaining: 24h
                Logged: Not Specified