
HBASE-4107: OOME while writing WAL checksum causes corrupt WAL

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.90.1
    • Fix Version/s: None
    • Component/s: regionserver, wal
    • Labels: None
    • Environment:
      CentOS 5.5x64

    Description

      An issue was observed where, upon shutdown of a regionserver, the regionserver's WAL was corrupt. From the following stack trace, it appears that a Java heap OutOfMemoryError occurred while writing the checksum to the WAL. Corrupting the WAL can potentially cause data loss.

      2011-07-14 14:54:53,741 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
      java.io.IOException: Reflection
      at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
      at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:987)
      at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:964)
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.GeneratedMethodAccessor1336.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:145)
      ... 2 more
      Caused by: java.lang.OutOfMemoryError: Java heap space
      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2375)
      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3271)
      at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
      at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3354)
      at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
      at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
      ... 6 more
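
      For context on the shape of this trace: the "java.io.IOException: Reflection" at the top is a wrapper, not the root problem. Below is a minimal, hypothetical sketch (not the actual HBase source) of the pattern that produces it. SequenceFileLogWriter.sync invokes SequenceFile.Writer's syncFs method reflectively, so anything thrown inside it, even an Error like OutOfMemoryError, comes back wrapped in an InvocationTargetException and is rethrown as a plain IOException.

      import java.io.IOException;
      import java.lang.reflect.Method;

      // Hypothetical sketch of the reflective sync pattern; names are illustrative.
      class ReflectiveSyncSketch {
        private final Object writer;   // stands in for SequenceFile.Writer
        private final Method syncFs;   // looked up once; null on Hadoop builds without syncFs

        ReflectiveSyncSketch(Object writer, Method syncFs) {
          this.writer = writer;
          this.syncFs = syncFs;
        }

        public void sync() throws IOException {
          if (this.syncFs != null) {
            try {
              // An OutOfMemoryError thrown inside syncFs (e.g. while DFSClient
              // allocates a new Packet) is wrapped by Method.invoke in an
              // InvocationTargetException, caught here, and rethrown as
              // IOException("Reflection"), matching the trace above.
              this.syncFs.invoke(this.writer);
            } catch (Exception e) {
              throw new IOException("Reflection", e);
            }
          }
        }
      }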

      Attachments

      1. regionserver.oom.log (4 kB, Dave Latham)
      2. master.splitting.log (12 kB, Dave Latham)

        Activity

        Dave Latham added a comment -

        We recently hit the same issue. I am able to download the corrupt WAL log file through the HDFS web interface, but cannot post it publicly.

        stack added a comment -

        @Dave you need a walplayer (HBASE-3619). One of us needs to hack it up. I don't think it'd be too hard to do.

        @Andy Do you think it was the checksumming that caused the OOME? That might just be where we tripped over the OOME first. I'd more suspect the HFile$Reader.decompress that was going on concurrently. It looks like the master, while splitting the log, passed over the data missing its checksum, so yeah, we lost that bit of data on the tail of the WAL around the time of the OOME. (My analysis here is cursory, just looking at logs; I did not confirm it by digging into the code.)

        Dave Latham added a comment -

        It looks like HLog's main has support to invoke --split. Does it look like, if I call that on the log, it will split it and put the data into the right place? (See the example invocation below.)

        We had a handful of regionservers go OOM yesterday while a MR job was doing heavy writes to a column family that doesn't usually get them. In this case, the first OOM occurred here, while writing the checksum.
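
        For reference, the invocation being discussed would look roughly like the following; the class name matches the stack traces above, but treat the exact flag and the log-directory path as illustrative for 0.90-era HBase:

        $ hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split hdfs://<namenode>/hbase/.logs/<regionserver-dir>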

        stack added a comment -

        @Dave That might work. You could see which regions had files added to their recovered.edits, then reassign those regions; on open in their new locations, they should replay the recovered.edits. If you try it, let us know how it goes. Good stuff.
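
        A quick, illustrative way to see which regions picked up recovered edits, assuming the default /hbase root directory, would be something like:

        $ hadoop fs -lsr /hbase | grep recovered.edits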

        Andy Sautins added a comment -

        The behavior that Dave is seeing is what we were seeing as well. It looks like objects are created from within the call to HLog.sync; specifically, in our case, DFSClient was creating a new Packet object and tried to allocate a byte array that it couldn't.

        For the time being we've gotten around this issue by increasing the heap from 2G to 4G on our regionservers. That seems to resolve it for us for now. I'll look into it again, but it seems like an unfortunate situation where the data is written but the checksum can't be written due to an OOM. It seems like changing DFSClient to use a pool of pre-allocated Packet objects for writing might address this, but I'm not sure I fully grasp the full problem yet.
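
        A minimal sketch of the pooling idea above, not the actual DFSClient code: if packet buffers came from a bounded pool allocated up front, a writer under memory pressure would block (or fail fast) waiting for a free buffer instead of triggering a fresh allocation that OOMEs mid-write. All names here are hypothetical.

        import java.util.concurrent.ArrayBlockingQueue;
        import java.util.concurrent.BlockingQueue;

        // Hypothetical pool; DFSClient does not actually have this class.
        class PacketBufferPool {
          private final BlockingQueue<byte[]> pool;

          PacketBufferPool(int numBuffers, int packetSize) {
            this.pool = new ArrayBlockingQueue<byte[]>(numBuffers);
            for (int i = 0; i < numBuffers; i++) {
              this.pool.add(new byte[packetSize]);  // all allocation happens up front
            }
          }

          // Blocks until a pre-allocated buffer is free; no allocation on the write path.
          byte[] take() throws InterruptedException {
            return this.pool.take();
          }

          // Returns a buffer to the pool for reuse once its packet is acked.
          void release(byte[] buf) {
            this.pool.offer(buf);
          }
        }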

        Lars Hofhansl added a comment -

        Is this still an issue, or can we close this?

        Andy Sautins added a comment -

        I believe we can close it. The issue we were experiencing was resolved for us after applying HDFS-2379.


          People

          • Assignee: Unassigned
          • Reporter: Andy Sautins
          • Votes: 1
          • Watchers: 4
