HBASE-2643: Figure how to deal with eof splitting logs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.89.20100621
    • Fix Version/s: 0.89.20100924, 0.90.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When splitting the WAL and encountering EOF, it's not clear what to do. Initial discussion of this started in http://review.hbase.org/r/74/ - summarizing here for brevity:

      We can get an EOFException while splitting the WAL in the following cases:

      • The writer died after creating the file but before even writing the header (or crashed halfway through writing the header)
      • The writer died in the middle of flushing some data - sync() guarantees that we can see at least the last edit, but we may see half of an edit that was being written out when the RS crashed (especially for large rows)
      • The data was actually corrupted somehow (e.g., a length field got changed to be too long and thus points past EOF)

      Ideally we would know, when we see EOF, whether it was really the last record; in that case we could simply drop that record (it wasn't synced, so we don't need to split it). Some open questions:

      • Currently we ignore empty files. Is it ok to ignore an empty log file if it's not the last one?
      • Similarly, do we ignore an EOF mid-record if it's not the last log file?
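The "drop the unsynced tail of the last file, but treat mid-record EOF elsewhere as corruption" policy discussed above can be sketched as follows. This is a hypothetical illustration, not the actual HBase split code: `readEdits`, the simple length-prefixed record format, and the `isLastFile` flag are all assumptions made for the example; the real WAL uses a SequenceFile-based format.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WalSplitSketch {
    // Hypothetical reader: each edit is stored as a 4-byte length prefix
    // followed by that many bytes of payload.
    static List<byte[]> readEdits(byte[] log, boolean isLastFile) throws IOException {
        List<byte[]> edits = new ArrayList<>();
        ByteArrayInputStream bytes = new ByteArrayInputStream(log);
        DataInputStream in = new DataInputStream(bytes);
        while (bytes.available() > 0) {
            try {
                int len = in.readInt();   // length prefix of the next edit
                byte[] edit = new byte[len];
                in.readFully(edit);       // EOF here means a partial edit
                edits.add(edit);
            } catch (EOFException eof) {
                if (isLastFile) {
                    // Trailing partial edit in the last log: it was never
                    // sync()ed, so it is safe to drop and stop here.
                    break;
                }
                // EOF mid-record in a non-final log file cannot be an
                // unsynced tail, so treat it as real corruption.
                throw new IOException("EOF mid-record in non-final log file", eof);
            }
        }
        return edits;
    }

    public static void main(String[] args) throws IOException {
        java.io.ByteArrayOutputStream buf = new java.io.ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(3);
        out.write(new byte[] {1, 2, 3});  // complete edit
        out.writeInt(10);
        out.write(new byte[] {4});        // truncated edit (length points past EOF)
        byte[] log = buf.toByteArray();
        // Last file: the unsynced tail is dropped, one edit survives.
        System.out.println(readEdits(log, true).size()); // prints 1
    }
}
```

The same input with `isLastFile = false` would throw, which matches the open question above: a partial record is only forgivable in the final log file.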
      Attachments

      1. HBASE-2643.patch (4 kB) — Nicolas Spiegelberg
      2. ch03s02.html (4 kB) — stack


            People

            • Assignee: Nicolas Spiegelberg
            • Reporter: stack
            • Votes: 0
            • Watchers: 1
