Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-9932

Improper synchronization in RetryCache

VotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Blocker
    • Resolution: Fixed
    • None
    • 2.1.1-beta
    • None
    • None
    • Reviewed

    Description

      In LightWeightCache#evictExpiredEntries(), the precondition check can fail. patrick white ran a HA failover test and it occurred while the SBN was catching up with edits during a transition to active. This caused NN to terminate.

      Here is my theory: If an RPC handler calls waitForCompletion() and it happens to remove the head of the queue in get(), it will race with evictExpiredEntries() frrom put().

      Attachments

        1. HADOOP-9932.patch
          2 kB
          Kihwal Lee
        2. HADOOP-9932.patch
          2 kB
          Kihwal Lee

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            kihwal Kihwal Lee
            kihwal Kihwal Lee
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment