Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-23438

DStreams could lose blocks with WAL enabled when driver crashes

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 2.0.3, 2.1.3, 2.2.2, 2.3.1, 2.4.0
    • Component/s: DStreams
    • Labels:
      None

      Description

      There is a race condition introduced in SPARK-11141 which could cause data loss.

      This affects all versions since 1.6.0.

      Problematic situation:

      1. Start streaming job with 2 receivers with WAL enabled.
      2. Receiver 1 receives a block and does the following
        • Writes a BlockAdditionEvent into WAL
        • Puts the block into it's received block queue with ID 1
      1. Receiver 2 receives a block and does the following
        • Writes a BlockAdditionEvent into WAL
      1. Spark allocates all blocks from it's received block queue and writes AllocatedBlocks(IDs=(1)) into WAL
      2. Driver crashes
      3. New Driver recovers from WAL
      4. Realise block with ID 2 never processed

       

        Attachments

          Activity

            People

            • Assignee:
              gsomogyi Gabor Somogyi
              Reporter:
              gsomogyi Gabor Somogyi
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: