HBASE-2643: Figure how to deal with EOF splitting logs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.89.20100621
    • Fix Version/s: 0.89.20100924, 0.90.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When splitting the WAL and encountering EOF, it's not clear what to do. Initial discussion of this started in http://review.hbase.org/r/74/ - summarizing here for brevity:

      We can get an EOFException while splitting the WAL in the following cases:

      • The writer died after creating the file but before even writing the header (or crashed halfway through writing the header)
      • The writer died in the middle of flushing some data - sync() guarantees that we can see at least the last edit, but we may see half of an edit that was being written out when the RS crashed (especially for large rows)
      • The data was actually corrupted somehow (e.g. a length field got changed to be too long and thus points past EOF)

      Ideally we would know when we see EOF whether it was really the last record, and in that case simply drop that record (it wasn't synced, so we don't need to split it). Some open questions:

      • Currently we ignore empty files. Is it ok to ignore an empty log file if it's not the last one?
      • Similarly, do we ignore an EOF mid-record if it's not the last log file?
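The tail-drop behavior the description asks for can be sketched in Java. This is a hypothetical, simplified model (plain length-prefixed records instead of HBase's actual SequenceFile-based HLog format), so the names `WalTailReader` and `readEdits` are illustrative only, not HBase API:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: read length-prefixed edits, tolerating a truncated tail.
public class WalTailReader {
    // Returns the fully-written edits. A partial record at EOF is dropped,
    // mirroring "it wasn't synced, so we don't need to split it".
    static List<byte[]> readEdits(byte[] log) throws IOException {
        List<byte[]> edits = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        while (true) {
            int len;
            try {
                len = in.readInt();       // 4-byte record length prefix
            } catch (EOFException e) {
                break;                    // clean end of file: no partial record
            }
            byte[] edit = new byte[len];
            try {
                in.readFully(edit);       // record body
            } catch (EOFException e) {
                // Truncated tail record: the writer crashed mid-flush. Drop it.
                break;
            }
            edits.add(edit);
        }
        return edits;
    }
}
```

Note this sketch cannot answer the open questions above by itself: it always tolerates a truncated tail, whereas the discussion argues that should only be safe for the last log file.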
    Attachments

  1. HBASE-2643.patch
        4 kB
        Nicolas Spiegelberg
      2. ch03s02.html
        4 kB
        stack

          Activity

          Nicolas Spiegelberg made changes -
          Link: This issue relates to HBASE-2889
          Jean-Daniel Cryans made changes -
          Fix Version/s: 0.89.20100924
          stack made changes -
          Attachment: ch03s02.html
          stack made changes -
          Status: Patch Available → Resolved
          Hadoop Flags: [Reviewed]
          Resolution: Fixed
          Nicolas Spiegelberg made changes -
          Attachment: HBASE-2643.patch
          Nicolas Spiegelberg made changes -
          Attachment: HBASE-2643.patch
          Nicolas Spiegelberg made changes -
          Status: Open → Patch Available
          Affects Version/s: 0.89.20100621
          Nicolas Spiegelberg made changes -
          Link: This issue is related to HBASE-2935
          Nicolas Spiegelberg made changes -
          Assignee: Nicolas Spiegelberg
          Nicolas Spiegelberg made changes -
          Link: This issue is related to HBASE-2933
          Todd Lipcon made changes -
          Description (original value below): During review of 2437, there was a lot of discussion around how to deal with EOF. This issue is about deciding how to treat EOF when reading WALs.

          Below is copied from http://review.hbase.org/r/74/

          {code}
          yes, I think so. The RS could have crashed right after opening but before writing any data, and if the master failed to recover that, then we'd never recover that region. I say ignore with a WARN
          Cosmin Lehene 6 days, 18 hours ago (May 25th, 2010, 2:27 p.m.)
          more aspects here:
          I think the reported size will be >0 after recover, even if the file has no records. I was asking if we should add logic to check whether it's the last log.
          EOF for a non-zero-length file with non-zero records means the file is corrupted.
          Todd Lipcon 5 days, 19 hours ago (May 26th, 2010, 1:27 p.m.)
          I agree if it has no records (I think - do we syncfs after writing the sequencefile header?). But there's the case where inside SequenceFile we call create, but never actually write any bytes. This is still worth recovering.

          In general I think a corrupt tail means we should drop that record (incomplete written record) but not shut down. This is only true if it's the tail record, though.
          Cosmin Lehene 3 days ago (May 29th, 2010, 8:56 a.m.)
           - How can we determine whether it's the tail record or, say, the 5th out of 10 records that's broken? We just get an EOF when calling next()
           - Currently we ignore empty files. Is it ok to ignore an empty log file if it's not the last one?
           - I'm not sure whether it's possible to get an EOF when acquiring the reader for a file after it has been recoverFileLease()-ed. So the whole try/catch for HLog.getReader might be redundant.

          When reading log entries we currently don't catch exceptions. We read as much as we can and then let any exception bubble up. The splitLog logic decides what to do next: if we get to a broken record it will most probably throw an EOF there and, based on the skip.errors setting, act accordingly. There will be no EOF if there are no records, though, and we continue.

          There are two possible reasons for a file being corrupted/empty:
          1. HRegion died => only the last log entry (edit) in the last log in the directory should be affected => we could continue, but are we sure it's the tail record?
          2. Another component screwed things up (a bug) => logs other than the last one could be affected => we should halt in this situation.
          src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java (Diff revision 1)
          1455
                  throw e;
          see above logic - writer could have crashed after writing only part of the sequencefile header, etc, so we should just warn and continue
          Cosmin Lehene 6 days, 18 hours ago (May 25th, 2010, 2:27 p.m.)
          see above comment
          src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java (Diff revision 1)
          1471
              } finally {
          I think we need to handle EOF specially here too, though OK to leave this for another JIRA. IIRC one of the FB guys opened this already
          Cosmin Lehene 6 days, 18 hours ago (May 25th, 2010, 2:30 p.m.)
          what's the other JIRA? see my above comments.
          Todd Lipcon 5 days, 19 hours ago (May 26th, 2010, 1:27 p.m.)
          Can't find it now... does my above comment make sense?
          {code}
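The policy the reviewers converge on in the quoted thread (tolerate an EOF only at the tail of the last log, and otherwise defer to the skip-errors setting) could be sketched as follows. The class and method names here are illustrative only, not the actual HLog API:

```java
// Hypothetical sketch of the EOF policy discussed above: an EOF in the LAST
// log file is an expected partial tail from a crashed writer and can be
// skipped with a warning; an EOF in any earlier file suggests real
// corruption, so only continue if a skip-errors flag says to.
public class EofPolicy {
    static boolean shouldContinueAfterEof(boolean isLastLogFile, boolean skipErrors) {
        if (isLastLogFile) {
            return true;       // partial tail record: warn, drop it, continue
        }
        return skipErrors;     // mid-sequence EOF: halt unless told to skip errors
    }
}
```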
          stack created issue -

            People

            • Assignee: Nicolas Spiegelberg
            • Reporter: stack
            • Votes: 0
            • Watchers: 1
