  1. Hadoop HDFS
  2. HDFS-1623 High Availability Framework for HDFS NN
  3. HDFS-2709

HA: Appropriately handle error conditions in EditLogTailer


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: ha, namenode
    • Labels: None

    Description

      Currently, if the edit log tailer hits an error while replaying edits in the middle of a file, it retries from the beginning of the file on the next tailing iteration. This is incorrect, since many of the edits will already have been replayed and not all edits are idempotent.

      Instead, we need to either (a) support reading from the middle of a finalized file (i.e., skip the edits already applied), or (b) abort the standby if it hits an error while tailing. If (a) isn't simple, let's do (b) for now and come back to (a) later, since this is a rare circumstance and it is better to abort than to be incorrect.
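
      To make the two options concrete, here is a minimal, self-contained sketch. It is not taken from any of the attached patches, and every name in it (TailerSketch, Edit, Standby, tailSkippingApplied, tailOrAbort) is hypothetical. It assumes each edit carries a monotonically increasing transaction id and that the standby remembers the last id it applied.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: these types are assumptions, not Hadoop classes. */
public class TailerSketch {

    /** One logged edit, identified by a monotonically increasing transaction id. */
    public static final class Edit {
        final long txId;
        final String op;

        public Edit(long txId, String op) {
            this.txId = txId;
            this.op = op;
        }
    }

    /** Stand-in for the standby's state: what was applied, and up to which txid. */
    public static final class Standby {
        public long lastAppliedTxId = 0;
        public final List<String> applied = new ArrayList<>();
        public boolean aborted = false;

        void apply(Edit e) {
            // Simulated replay failure for any op whose text starts with "BAD".
            if (e.op.startsWith("BAD")) {
                throw new IllegalStateException("cannot replay txid " + e.txId);
            }
            applied.add(e.op);
            lastAppliedTxId = e.txId;
        }
    }

    /** Option (a): re-read the file, but skip every edit that was already applied. */
    public static void tailSkippingApplied(Standby nn, List<Edit> file) {
        for (Edit e : file) {
            if (e.txId <= nn.lastAppliedTxId) {
                continue; // already replayed on an earlier iteration
            }
            nn.apply(e); // a failure propagates; the next iteration resumes here
        }
    }

    /** Option (b): treat any replay error as fatal instead of retrying the file. */
    public static void tailOrAbort(Standby nn, List<Edit> file) {
        if (nn.aborted) {
            throw new IllegalStateException("standby already aborted");
        }
        try {
            for (Edit e : file) {
                nn.apply(e);
            }
        } catch (RuntimeException ex) {
            // A real daemon would log and terminate the process at this point.
            nn.aborted = true;
        }
    }
}
```

      With option (a), a retry after a mid-file failure applies only the edits past lastAppliedTxId, so non-idempotent edits are never replayed twice. With option (b), the standby simply stops, which gives up availability of the standby in exchange for never applying an edit incorrectly.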

      Attachments

        1. HDFS-2709-HDFS-1623.patch
          21 kB
          Aaron Myers
        2. HDFS-2709-HDFS-1623.patch
          25 kB
          Aaron Myers
        3. HDFS-2709-HDFS-1623.patch
          31 kB
          Aaron Myers
        4. HDFS-2709-HDFS-1623.patch
          58 kB
          Aaron Myers
        5. HDFS-2709-HDFS-1623.patch
          60 kB
          Aaron Myers
        6. HDFS-2709-HDFS-1623.patch
          62 kB
          Aaron Myers
        7. HDFS-2709-HDFS-1623.patch
          64 kB
          Aaron Myers
        8. HDFS-2709-HDFS-1623.patch
          64 kB
          Aaron Myers
        9. HDFS-2709-HDFS-1623.patch
          64 kB
          Aaron Myers

          People

            Assignee: Aaron Myers (atm)
            Reporter: Todd Lipcon (tlipcon)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: