SPARK-44833

Spark Connect: Reattach is too eager when the initial ExecutePlan didn't reach the server


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 4.0.0, 3.5.1
    • Component/s: Connect
    • Labels: None

    Description

      In

      case ex: StatusRuntimeException
          if Option(StatusProto.fromThrowable(ex))
            .exists(_.getMessage.contains("INVALID_HANDLE.OPERATION_NOT_FOUND")) =>
        if (lastReturnedResponseId.isDefined) {
          throw new IllegalStateException(
            "OPERATION_NOT_FOUND on the server but responses were already received from it.",
            ex)
        }
        // Try a new ExecutePlan, and throw upstream for retry.
      ->  iter = rawBlockingStub.executePlan(initialRequest)
      ->  throw new GrpcRetryHandler.RetryException 

      we call executePlan again and then throw RetryException so that the exception is handled upstream.

      Then it goes to

      retry {
        if (firstTry) {
          // on first try, we use the existing iter.
          firstTry = false
        } else {
          // on retry, the iter is borked, so we need a new one
      ->    iter = rawBlockingStub.reattachExecute(createReattachExecuteRequest())
        } 

      and because firstTry is already false at that point, it immediately calls reattachExecute.

      This causes no failure: the reattach succeeds and attaches to the query, and the original executePlan gets detached. Still, the extra reattach right after a fresh executePlan is unnecessary and could be avoided.

      The same issue is also present in the Python client's reattach.py.
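
      One possible way to avoid the extra reattach is sketched below. This is only an illustration under stated assumptions, not necessarily the change that resolved this issue. The idea is to keep the current iterator in an Option and call reattachExecute only when no usable iterator exists, so an iterator freshly obtained from executePlan is used as-is on the next retry. FakeStub, OperationNotFound, RetryException, the retry helper, and the String responses are hypothetical stand-ins for the gRPC stub, StatusRuntimeException, GrpcRetryHandler, and ExecutePlanResponse. The lastReturnedResponseId check from the snippet above is omitted for brevity.

```scala
import scala.collection.mutable.{ArrayBuffer, ListBuffer}
import scala.util.control.NonFatal

object ReattachSketch {
  // Hypothetical stand-ins; the real client uses StatusRuntimeException and GrpcRetryHandler.
  final class OperationNotFound extends RuntimeException("INVALID_HANDLE.OPERATION_NOT_FOUND")
  final class RetryException extends RuntimeException("retry")

  // Fake server stub: the operation is unknown until an executePlan call reaches it.
  final class FakeStub {
    val calls = ArrayBuffer.empty[String]
    private var operationKnown = false

    def executePlan(): Iterator[String] = {
      calls += "executePlan"
      operationKnown = true
      Iterator("r1", "r2")
    }

    def reattachExecute(): Iterator[String] = {
      calls += "reattachExecute"
      if (!operationKnown) throw new OperationNotFound
      Iterator("r1", "r2")
    }
  }

  final class ReattachableIterator(stub: FakeStub, initial: Iterator[String]) {
    // None means "no usable iterator; reattach before the next call".
    private var iter: Option[Iterator[String]] = Some(initial)

    private def callIter[V](f: Iterator[String] => V): V =
      try {
        if (iter.isEmpty) iter = Some(stub.reattachExecute())
        f(iter.get)
      } catch {
        case _: OperationNotFound =>
          // Start a new ExecutePlan and keep its iterator for the retry.
          iter = Some(stub.executePlan())
          throw new RetryException
        case NonFatal(e) =>
          // The iterator is broken; drop it so the retry reattaches.
          iter = None
          throw e
      }

    // Simplified retry helper: re-runs body on any non-fatal error.
    private def retry[V](attemptsLeft: Int)(body: => V): V = {
      val result =
        try Some(body)
        catch { case NonFatal(_) if attemptsLeft > 1 => None }
      result match {
        case Some(v) => v
        case None => retry(attemptsLeft - 1)(body)
      }
    }

    def hasNext: Boolean = retry(3)(callIter(_.hasNext))
    def next(): String = retry(3)(callIter(_.next()))
  }

  // Simulates an initial ExecutePlan that never reached the server.
  def failingIterator: Iterator[String] = new Iterator[String] {
    def hasNext: Boolean = throw new RuntimeException("UNAVAILABLE")
    def next(): String = throw new NoSuchElementException
  }

  // Returns the responses received and the calls the fake stub observed.
  def demo(): (List[String], List[String]) = {
    val stub = new FakeStub
    val it = new ReattachableIterator(stub, failingIterator)
    val responses = ListBuffer.empty[String]
    while (it.hasNext) responses += it.next()
    (responses.toList, stub.calls.toList)
  }
}
```

      In this sketch, ReattachSketch.demo() records the stub calls reattachExecute followed by executePlan, with no second reattachExecute after the new executePlan.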

          People

            Assignee: Juliusz Sompolski (juliuszsompolski)
            Reporter: Juliusz Sompolski (juliuszsompolski)