  1. Hadoop HDFS
  2. HDFS-1623 High Availability Framework for HDFS NN
  3. HDFS-2709

HA: Appropriately handle error conditions in EditLogTailer

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: ha, namenode
    • Labels: None

    Description

      Currently, if the edit log tailer hits an error while replaying edits in the middle of a file, it goes back and retries from the beginning of the file on the next tailing iteration. This is incorrect: many of the edits will already have been replayed, and not all edits are idempotent.

      Instead, we need to either (a) support reading from the middle of a finalized file (i.e., skip the edits already applied), or (b) abort the standby if it hits an error while tailing. If (a) isn't simple, let's do (b) for now and come back to (a) later, since this is a rare circumstance and it is better to abort than to be incorrect.
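
      To make the two options concrete, here is a minimal Java sketch. It is not the code from the attached patches: the `Edit`, `Standby`, and `tailOnce` names are hypothetical, an edit is reduced to a transaction ID plus an opaque string, and "abort" is modeled as a flag where a real standby would presumably terminate the process. It assumes each edit carries a monotonically increasing transaction ID that the standby can compare against the last one it applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EditTailerSketch {
    /** Hypothetical stand-in for one edit log entry. */
    static final class Edit {
        final long txId;
        final String op;
        Edit(long txId, String op) { this.txId = txId; this.op = op; }
    }

    /** Hypothetical stand-in for the standby's in-memory state. */
    static final class Standby {
        long lastAppliedTxId = 0;
        boolean aborted = false;
        final List<String> applied = new ArrayList<>();

        /** Applying is not idempotent: every call appends to the state. */
        void apply(Edit e) {
            if (e.op.startsWith("BAD")) {
                throw new IllegalStateException("cannot replay txid " + e.txId);
            }
            applied.add(e.op);
            lastAppliedTxId = e.txId;
        }
    }

    /** One tailing iteration over a (finalized) edit file. */
    static void tailOnce(Standby s, List<Edit> file) {
        for (Edit e : file) {
            // Option (a): skip edits that were already applied in an earlier pass.
            if (e.txId <= s.lastAppliedTxId) {
                continue;
            }
            try {
                s.apply(e);
            } catch (RuntimeException ex) {
                // Option (b): stop instead of risking an incorrect namespace.
                s.aborted = true;
                return;
            }
        }
    }

    public static void main(String[] args) {
        Standby s = new Standby();
        List<Edit> file = Arrays.asList(
            new Edit(1, "mkdir /a"), new Edit(2, "mkdir /b"), new Edit(3, "rename /a /c"));
        tailOnce(s, file);
        tailOnce(s, file); // second pass over the same file applies nothing new
        System.out.println("applied=" + s.applied.size() + " aborted=" + s.aborted);
    }
}
```

      In this sketch the transaction ID check is what makes a retry from the start of the file safe, and the abort path leaves `lastAppliedTxId` at the last edit that was applied successfully.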

      Attachments

        1. HDFS-2709-HDFS-1623.patch
          21 kB
          Aaron Myers
        2. HDFS-2709-HDFS-1623.patch
          25 kB
          Aaron Myers
        3. HDFS-2709-HDFS-1623.patch
          31 kB
          Aaron Myers
        4. HDFS-2709-HDFS-1623.patch
          58 kB
          Aaron Myers
        5. HDFS-2709-HDFS-1623.patch
          60 kB
          Aaron Myers
        6. HDFS-2709-HDFS-1623.patch
          62 kB
          Aaron Myers
        7. HDFS-2709-HDFS-1623.patch
          64 kB
          Aaron Myers
        8. HDFS-2709-HDFS-1623.patch
          64 kB
          Aaron Myers
        9. HDFS-2709-HDFS-1623.patch
          64 kB
          Aaron Myers

            People

              Assignee: Aaron Myers (atm)
              Reporter: Todd Lipcon (tlipcon)
              Votes: 0
              Watchers: 3