  1. Tajo
  2. TAJO-1540

RpcCallback must be able to handle TimeoutException or cancel.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels:
      None

      Description

      I investigated the locking in CallFuture while reviewing TAJO-1469. CallFuture's run() and get() should be synchronized with each other. The current code looks as if this were implemented, but it is not. If the following situation occurs, some resources or tasks will be lost forever.

      1) Worker: TaskRunner sends a GetTask request.
      2) QM: QueryMaster selects a proper task and calls RpcCallback.
      3) Worker: AsyncRpcClient receives the response and calls CallFuture.run(response).
      3-1) Worker: If a TimeoutException occurs after 1), between 2) and 3), TaskRunner can't receive the response and doesn't run the allocated task, but QM doesn't know about that.

      We should fix this problem in the RPC module and add proper cancellation logic.
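
To make the race concrete, here is a minimal sketch of one way run() and get() could be synchronized so that a timeout and a late response can never both "win". This is not the actual Tajo patch: the class name `SafeCallFuture`, the `onLateResponse` hook, and the three-state enum are all hypothetical, and the real CallFuture API may differ. The idea is only that the transition out of the pending state happens under a single lock, so a response arriving after the timeout is handed to a handler instead of being silently dropped.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

// Hypothetical stand-in for CallFuture; not the Tajo class.
final class SafeCallFuture<T> {
    private enum State { PENDING, DONE, CANCELLED }

    // Assumed hook: invoked with a response that arrives after timeout/cancel,
    // e.g. to tell the QueryMaster that the allocated task will not be run.
    private final Consumer<T> onLateResponse;
    private State state = State.PENDING; // guarded by "this"
    private T response;                  // guarded by "this"

    SafeCallFuture(Consumer<T> onLateResponse) {
        this.onLateResponse = onLateResponse;
    }

    // Called by the RPC client thread when the response arrives.
    void run(T value) {
        synchronized (this) {
            if (state == State.PENDING) {
                response = value;
                state = State.DONE;
                notifyAll();
                return;
            }
        }
        // The waiter already timed out or cancelled: report the response
        // outside the lock instead of dropping it.
        onLateResponse.accept(value);
    }

    // Explicit cancel; returns false if the call already completed
    // or was already cancelled.
    synchronized boolean cancel() {
        if (state != State.PENDING) {
            return false;
        }
        state = State.CANCELLED;
        notifyAll();
        return true;
    }

    // Waits for the response; on timeout, marks the call cancelled
    // under the same lock that run() uses.
    synchronized T get(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (state == State.PENDING) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                state = State.CANCELLED;
                notifyAll();
                throw new TimeoutException("no response within timeout");
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        if (state == State.CANCELLED) {
            throw new CancellationException("call was cancelled");
        }
        return response;
    }
}
```

With this shape, step 3-1 above would end in the `onLateResponse` handler rather than in a lost task. What that handler should actually do (for example, notify the QM so the task can be rescheduled) is a design decision for the RPC module and is left open here.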

              People

              • Assignee: Unassigned
              • Reporter: hjkim Hyoungjun Kim
