Hadoop Common / HADOOP-760

A corrupted HDFS edits log file can lead to a major loss of data.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 0.6.1

Description

On one of our test systems, our HDFS got corrupted after the edits log file was corrupted (I can tell how).

When we restarted HDFS, the namenode refused to start, with an exception in hadoop-namenode-xxx.out.

Unfortunately, because of an rm mistake, I was not able to save that exception anywhere.

But it was an ArrayIndexOutOfBoundsException somewhere in a UTF8 method called from FSEditLog.loadFSEdits.
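For illustration only (this is not the actual org.apache.hadoop.io.UTF8 or FSEditLog code, and the class and variable names below are made up): a decoder that trusts the lead byte of a multi-byte UTF-8 sequence in a truncated or corrupted record can index past the end of its buffer and surface as an ArrayIndexOutOfBoundsException instead of a clean "edits log is corrupt" error.

    // Hypothetical sketch: naive decoding of a two-byte UTF-8 sequence whose
    // continuation byte was lost to corruption or truncation.
    public class TruncatedUtf8Demo {
        public static void main(String[] args) {
            byte[] bytes = { (byte) 0xC3 };        // lead byte of a two-byte sequence; the second byte is missing
            StringBuilder decoded = new StringBuilder();
            int i = 0;
            while (i < bytes.length) {
                int b = bytes[i++] & 0xFF;
                if (b < 0x80) {
                    decoded.append((char) b);      // plain ASCII byte
                } else if ((b & 0xE0) == 0xC0) {
                    // The decoder trusts the lead byte and blindly reads the
                    // continuation byte, indexing past the end of the array:
                    decoded.append((char) (((b & 0x1F) << 6) | (bytes[i++] & 0x3F)));
                    // -> java.lang.ArrayIndexOutOfBoundsException
                }
            }
            System.out.println(decoded);
        }
    }

Whatever the exact code path, the practical effect is the same: the namenode dies on startup instead of skipping or reporting the corrupt record.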

The result: the namenode was unable to start, and the only way to fix it was to remove the edits log file.

As it was a test machine, we did not have any backup, so all files created in HDFS since the last start of the namenode were lost.

Is there a way to periodically commit HDFS changes into the fsimage instead of keeping a huge log file (e.g. every 10 minutes or so)?
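What is being asked for here is, in effect, periodic checkpointing. As a rough sketch of the idea only (this is not how Hadoop implements it, and every name below is hypothetical): snapshot the in-memory namespace to a new image file, swap it in, and truncate the edits log, so a corrupted edits file can cost at most one interval's worth of changes.

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical checkpointing sketch; "namespace" stands in for the
    // namenode's in-memory file system metadata.
    public class CheckpointSketch {
        private final Map<String, String> namespace = new ConcurrentHashMap<>();

        // Intended to be called from a timer, e.g. every 10 minutes.
        void checkpoint(File imageFile, File editsFile) throws IOException {
            File tmp = new File(imageFile.getParentFile(), imageFile.getName() + ".ckpt");
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(tmp))) {
                out.writeObject(new HashMap<>(namespace));   // write a full snapshot of the namespace
                out.flush();
            }
            if (!tmp.renameTo(imageFile)) {                   // install the new image (atomic on most POSIX file systems)
                throw new IOException("could not install new image " + imageFile);
            }
            new FileOutputStream(editsFile).close();          // truncate the edits log; its contents are now in the image
        }
    }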

Even if the namenode files are rsync'ed, what can be done in that particular case (if we periodically rsync the fsimage along with its corrupted edits file)?

This issue affects HDFS version 0.6.1. After looking at the Hadoop trunk code, I cannot say whether this can still happen (I would say yes, because the UTF8 class is used in the same way as in 0.6.1).



People

    Assignee: Unassigned
    Reporter: Philippe Gassmann (phil@anyware-tech.com)
    Votes: 1
    Watchers: 2

Dates

    Created:
    Updated:
    Resolved: