HBase / HBASE-11620

Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.98.4
    • Fix Version/s: 0.99.0, 0.98.5, 2.0.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Reported by Kiran in this thread: "HBase file encryption, inconsistencies observed and data loss"

      After step 4 (i.e., disabling WAL encryption, removing SecureProtobufLogReader/Writer, and restarting), reading the encrypted WAL fails, mainly due to an EOF exception in BaseDecoder. This is not treated as an error, and these WALs are moved to /oldWALs.

      The following is observed in the log files:

      2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172
      2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false
      2014-07-30 19:44:29,313 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
      2014-07-30 19:44:29,315 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms
      2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting
      2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting
      2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting
      2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream
      2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Finishing writing output logs and closing down.
      2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Waiting for split writer threads to finish
      2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Split writers finished
      2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 is corrupted = false progress failed = false
      

      To fix this, we need to propagate the EOF exception to HLogSplitter. Any suggestions on the fix?

      -------- (end of quote from Kiran)

      In BaseDecoder#rethrowEofException():

          if (!isEof) throw ioEx;
          LOG.error("Partial cell read caused by EOF: " + ioEx);
          EOFException eofEx = new EOFException("Partial cell read");
          eofEx.initCause(ioEx);
          throw eofEx;
      

      Throwing EOFException does not propagate the "Partial cell read" condition to HLogSplitter, which doesn't treat EOFException as an error.

      I think an IOException should be thrown instead; HLogSplitter#getNextLogLine() would then translate the IOException into a CorruptedLogFileException.
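The reasoning above can be illustrated with a small, self-contained sketch. It is a simplified stand-in, not HLogSplitter's actual control flow: `EntrySource`, `readAll` and `CorruptedLogException` are hypothetical names invented for the example. It only shows why a read loop that catches `EOFException` first treats a truncated read as a clean end of file, while any other `IOException` can be reported as corruption.

```java
import java.io.EOFException;
import java.io.IOException;

/** Simplified stand-in for a WAL-splitting read loop; not the actual HLogSplitter code. */
final class EofVersusCorruption {

  /** Hypothetical counterpart of HLogSplitter's CorruptedLogFileException. */
  static final class CorruptedLogException extends IOException {
    CorruptedLogException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Hypothetical source of WAL entries; returns null once the file is exhausted. */
  interface EntrySource {
    String next() throws IOException;
  }

  /** Returns the number of entries read; an EOFException silently ends the file. */
  static int readAll(EntrySource source) throws CorruptedLogException {
    int count = 0;
    while (true) {
      try {
        if (source.next() == null) {
          return count;
        }
        count++;
      } catch (EOFException eof) {
        // A truncated tail is expected after a crash, so this is treated as a clean end.
        return count;
      } catch (IOException ioe) {
        // Any other I/O failure is reported as corruption.
        throw new CorruptedLogException("unreadable entry", ioe);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    EntrySource eof = () -> { throw new EOFException("Partial cell read"); };
    System.out.println("entries read on EOFException: " + readAll(eof));

    EntrySource broken = () -> { throw new IOException("Premature EOF from inputStream"); };
    try {
      readAll(broken);
    } catch (CorruptedLogException e) {
      System.out.println("plain IOException reported as corruption: " + e.getCause().getMessage());
    }
  }
}
```

Because `EOFException` is a subclass of `IOException`, the order of the catch clauses decides the outcome; whether the real splitter follows this pattern has to be checked against its source.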

      1. 11620-0.98-v6.txt
        29 kB
        Ted Yu
      2. 11620-0.98-v7.txt
        29 kB
        Ted Yu
      3. 11620-v1.txt
        0.8 kB
        Ted Yu
      4. 11620-v2.txt
        20 kB
        Ted Yu
      5. 11620-v3.txt
        25 kB
        Ted Yu
      6. 11620-v4.txt
        26 kB
        Ted Yu
      7. 11620-v5.txt
        27 kB
        Ted Yu
      8. 11620-v6.txt
        30 kB
        Ted Yu
      9. 11620-v6.txt
        30 kB
        Ted Yu
      10. 11620-v7.txt
        30 kB
        Ted Yu

        Activity

        Ted Yu created issue -
        Ted Yu made changes -
        Field Original Value New Value
        Description (only the last sentence changed; the rest is identical to the Description above)
          Original: I think a new exception type (DecoderException e.g.) should be used above.
          New: I think IOException should be thrown above - HLogSplitter#getNextLogLine() would translate the IOEx to CorruptedLogFileException.
        Ted Yu added a comment -

        Tentative patch.

        Ted Yu made changes -
        Attachment 11620-v1.txt [ 12658738 ]
        Ted Yu made changes -
        Affects Version/s 0.98.4 [ 12326810 ]
        ramkrishna.s.vasudevan added a comment -

        "I think a new exception type (DecoderException e.g.) should be used above."

        No, it cannot be. I think the EOF case was added for specific cases where there is a real file with no entries.

        Andrew Purtell added a comment - edited

        Let's not forget the original sin of changing the WAL reader+writer implementation classes after a crash and before a restart. That cannot and should not be acceptable practice.

        Ted Yu, this could work if you can come up with a way for a codec to reliably tell the difference between EOF and corruption for some other reason. Just propagating EOF to the splitter seems contrary to current expected behavior.

        Edit: Or, deal with the actual user action here and consider a new optional field in the pbuf WAL header that carries the name of the class that wrote it. A reader can check if it can handle the output of that writer when the header is being read. The error at that point would be unambiguous.
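A minimal sketch of the header check suggested above, under stated assumptions: `WalHeader`, `PlainReader`, `checkHeader` and the `writerClsName` field are illustrative names, not the HBase API, and the header is modelled as a plain Java object rather than a protobuf message. The field is optional so that files written before it existed can still be opened.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Sketch of a writer-class check at header time; all names here are illustrative. */
final class WriterClassCheck {

  /** Stand-in for the protobuf WAL header; the field is optional so older files still parse. */
  static final class WalHeader {
    final Optional<String> writerClsName;

    WalHeader(String writerClsName) {
      this.writerClsName = Optional.ofNullable(writerClsName);
    }
  }

  /** A reader declares which writers' output it understands. */
  static final class PlainReader {
    private final List<String> acceptedWriters = Arrays.asList("ProtobufLogWriter");

    void checkHeader(WalHeader header) throws IOException {
      if (!header.writerClsName.isPresent()) {
        return; // Header written before the field existed: nothing to verify.
      }
      String writer = header.writerClsName.get();
      if (!acceptedWriters.contains(writer)) {
        throw new IOException("Expected " + acceptedWriters + ", got " + writer);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    PlainReader reader = new PlainReader();
    reader.checkHeader(new WalHeader("ProtobufLogWriter")); // accepted
    reader.checkHeader(new WalHeader(null));                // legacy header, accepted
    try {
      reader.checkHeader(new WalHeader("SecureProtobufLogWriter"));
    } catch (IOException e) {
      System.out.println("rejected at header time: " + e.getMessage());
    }
  }
}
```

The point of the design is that the mismatch is detected before any cell is decoded, so it cannot be confused with a truncated tail.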

        Anoop Sam John added a comment -

        +1 for Andy's suggestion

        ramkrishna.s.vasudevan added a comment -

        +1 for the suggestion. That would be the ideal one. So in these cases the user could set WAL encryption to false but should keep the same reader and writer?

        Kiran Kumar M R added a comment -

        I tested the patch submitted by Ted Yu; it's not working. Even though an IOException is thrown instead of an EOFException, the file is still not considered corrupt.

        Here are the logs. See the line containing "Throwing ioEx instead of eofEx".

        2014-07-31 21:19:11,923 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406821527620-splitting/HOST-16%2C15264%2C1406821527620.1406821561362, length=174
        2014-07-31 21:19:11,923 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] wal.HLogSplitter: DistributedLogReplay = false
        2014-07-31 21:19:11,994 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406821527620-splitting/HOST-16%2C15264%2C1406821527620.1406821561362
        2014-07-31 21:19:11,996 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406821527620-splitting/HOST-16%2C15264%2C1406821527620.1406821561362 after 2ms
        2014-07-31 21:19:12,009 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-0-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-0-Writer-0,5,main]: starting
        2014-07-31 21:19:12,009 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-0-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-0-Writer-2,5,main]: starting
        2014-07-31 21:19:12,009 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-0-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-0-Writer-1,5,main]: starting
        2014-07-31 21:19:12,170 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-0] codec.BaseDecoder: Partial cell read caused by EOF - Throwing ioEx instead of eofEx : java.io.IOException: Premature EOF from inputStream
        2014-07-31 21:19:12,170 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] wal.HLogSplitter: Finishing writing output logs and closing down.
        2014-07-31 21:19:12,170 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] wal.HLogSplitter: Waiting for split writer threads to finish
        2014-07-31 21:19:12,170 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] wal.HLogSplitter: Split writers finished
        2014-07-31 21:19:12,171 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406821527620-splitting/HOST-16%2C15264%2C1406821527620.1406821561362 is corrupted = false progress failed = false
        2014-07-31 21:19:12,202 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] handler.HLogSplitterHandler: successfully transitioned task /hbase/splitWAL/WALs%2FHOST-10-18-40-16%2C15264%2C1406821527620-splitting%2FHOST-10-18-40-16%252C15264%252C1406821527620.1406821561362 to final state DONE HOST-16,15264,1406821739918
        2014-07-31 21:19:12,202 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-0] handler.HLogSplitterHandler: worker HOST-16,15264,1406821739918 done with task /hbase/splitWAL/WALs%2FHOST-10-18-40-16%2C15264%2C1406821527620-splitting%2FHOST-10-18-40-16%252C15264%252C1406821527620.1406821561362 in 316ms
        
        Ted Yu added a comment -

        What's the value of the config "hbase.hlog.split.skip.errors"?

        Thanks

        Ted Yu added a comment - edited

        I am composing a new patch which adds a new optional field to the pbuf WAL header.
        I plan to change the signature of the following method so that the return value is an enum (ProtobufLogReader is marked InterfaceAudience.Private):

          protected boolean readHeader(Builder builder, FSDataInputStream stream) throws IOException {
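A hedged sketch of what an enum return value could look like. The enum name, its constants, and the simplified parameters are assumptions made for illustration; the patch itself is not reproduced here. The sketch also assumes that the existing boolean can only say "header read" or "no header", which leaves no room for a third outcome such as "header read, but written by a class this reader cannot handle".

```java
/** Sketch of a three-way header result replacing a boolean; names are illustrative. */
final class HeaderResultSketch {

  enum HeaderResult {
    EOF,            // stream ended before any header bytes were available
    SUCCESS,        // header parsed and the writer is one this reader accepts
    UNKNOWN_WRITER  // header parsed, but it names a writer this reader cannot handle
  }

  /**
   * @param headerPresent  false when the stream is already at EOF
   * @param writerClsName  writer recorded in the header, or null for older files
   * @param acceptedWriter the single writer class this reader understands
   */
  static HeaderResult readHeader(boolean headerPresent, String writerClsName,
      String acceptedWriter) {
    if (!headerPresent) {
      return HeaderResult.EOF;
    }
    if (writerClsName != null && !writerClsName.equals(acceptedWriter)) {
      return HeaderResult.UNKNOWN_WRITER;
    }
    return HeaderResult.SUCCESS;
  }

  public static void main(String[] args) {
    System.out.println(readHeader(false, null, "ProtobufLogWriter"));
    System.out.println(readHeader(true, "ProtobufLogWriter", "ProtobufLogWriter"));
    System.out.println(readHeader(true, "SecureProtobufLogWriter", "ProtobufLogWriter"));
  }
}
```

A caller can then map the EOF case to "empty file" and the unknown-writer case to an explicit error, instead of folding both into `false`.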
        
        Ted Yu made changes -
        Assignee Ted Yu [ yuzhihong@gmail.com ]
        Ted Yu added a comment -

        Patch v2 adds optional field to WAL header.

        Running WAL related tests now.

        Ted Yu made changes -
        Attachment 11620-v2.txt [ 12658969 ]
        Ted Yu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Andrew Purtell added a comment -

        Fix version 0.98.5+

        Andrew Purtell made changes -
        Fix Version/s 0.99.0 [ 12325675 ]
        Fix Version/s 0.98.5 [ 12326953 ]
        Fix Version/s 2.0.0 [ 12327188 ]
        Andrew Purtell added a comment -

        Please add a unit test that verifies this fix.

        Ted Yu added a comment -

        Patch v3 adds a test.

        Ted Yu made changes -
        Attachment 11620-v3.txt [ 12659004 ]
        Andrew Purtell added a comment - edited

        That's an ok test. What do you think about one that writes an encrypted WAL then submits it to a HLogSplitter instance configured with ProtobufLogReader?
        Edit: Pardon, I should be a bit more specific. Confirm that the splitter fails to read the file in a way that it is moved to the corrupt logs dir instead of oldWALs.

        Ted Yu made changes -
        Summary
          Original: Propagate decoder exception to HLogSplitter so that loss of data is avoided
          New: Record the class name of Writer in WAL header so that only proper Reader can open the WAL file
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12658969/11620-v2.txt
        against trunk revision .
        ATTACHMENT ID: 12658969

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        +1 site. The mvn site goal succeeds with this patch.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.TestIOFencing
        org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
        org.apache.hadoop.hbase.regionserver.TestRegionReplicas
        org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas
        org.apache.hadoop.hbase.client.TestReplicasClient
        org.apache.hadoop.hbase.master.TestRestartCluster
        org.apache.hadoop.hbase.TestRegionRebalancing

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//console

        This message is automatically generated.

        Ted Yu added a comment -

        In patch v4, the WAL is submitted to HLogSplitter.

        For the corrupted-hlog code path, I haven't found a proper way to verify the result.
        I observed the following in the test output:

        2014-07-31 14:51:25,765 WARN  [main] wal.HLogSplitter(290): Could not get reader, corrupted log file file:/Users/tyu/trunk/hbase-server/target/test-data/aae07de8-e64e-45fa-b6d6-6ede1bbccc9b/log/hlog.1406843484910
        org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$CorruptedLogFileException: skipErrors=true Could not open hlog file:/Users/tyu/trunk/hbase-server/target/test-data/aae07de8-e64e-45fa-b6d6-6ede1bbccc9b/log/hlog.1406843484910 ignoring
                at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:602)
                at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:288)
                at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:231)
                at org.apache.hadoop.hbase.regionserver.wal.TestHLogReaderOnSecureHLog.testHLogReaderOnSecureHLog(TestHLogReaderOnSecureHLog.java:123)
        ...
        Caused by: java.io.IOException: Expected ProtobufLogWriter, got SecureProtobufLogWriter
                at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:140)
                at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:101)
                at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:69)
                at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:127)
                at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:90)
                at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:668)
                at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:577)
                ... 27 more
        2014-07-31 14:51:25,781 WARN  [main] wal.HLogSplitter(295): Nothing to split in log file file:/Users/tyu/trunk/hbase-server/target/test-data/aae07de8-e64e-45fa-b6d6-6ede1bbccc9b/log/hlog.1406843484910
        2014-07-31 14:51:25,781 DEBUG [main] wal.HLogSplitter(367): Finishing writing output logs and closing down.
        2014-07-31 14:51:25,781 INFO  [main] wal.HLogSplitter(380): Processed 0 edits across 0 regions; log file=file:/Users/tyu/trunk/hbase-server/target/test-data/aae07de8-e64e-45fa-b6d6-6ede1bbccc9b/log/hlog.1406843484910 is corrupted = true progress failed = false
        

        However, there was no exception coming out of HLogSplitter.splitLogFile.

        Ted Yu made changes -
        Attachment 11620-v4.txt [ 12659020 ]
        Andrew Purtell added a comment -

        You could check that the HFile ends up in the corrupt log dir. That's the desired outcome, correct, Kiran Kumar M R?

        Ted Yu added a comment -

        Patch v5 verifies that an unrecognized WAL file is sidelined.

        Ted Yu made changes -
        Attachment 11620-v5.txt [ 12659038 ]
        Andrew Purtell added a comment -

        Thanks Ted. The test looks good and it does what you'd expect. The remaining issue here is that the SecureProtobufWALReader should be able to read files written by the ProtobufWALWriter. The first thing we do in SecureProtobufWALReader#readHeader is call super.readHeader(), which will fail because we're only checking for a single class name, not a list of valid options. After that change this looks good to go in.

        Please consider extending the unit test a bit to check that the SecureProtobufWALReader can read files written by the ProtobufWALWriter.
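
        The "list of valid options" idea can be sketched as follows. This is a minimal, self-contained illustration and not the code from any attached patch: the class WriterNameCheckSketch, its methods, and the rule that a missing writer name is accepted are all assumptions made for the example. Only the writer names ProtobufLogWriter and SecureProtobufLogWriter come from the stack trace above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the reader-side header check; not the HBase classes.
class WriterNameCheckSketch {
    // A plain reader accepts only files from the plain writer (assumed list).
    static final List<String> PLAIN_READER_ACCEPTS =
        Arrays.asList("ProtobufLogWriter");
    // A secure reader accepts files from either writer (assumed list).
    static final List<String> SECURE_READER_ACCEPTS =
        Arrays.asList("ProtobufLogWriter", "SecureProtobufLogWriter");

    // Returns true when the recorded writer name is in the reader's list.
    // Assumption: a header without a recorded name (null) is accepted,
    // so that files written before the field existed stay readable.
    static boolean accepts(List<String> accepted, String writerClsName) {
        return writerClsName == null || accepted.contains(writerClsName);
    }

    // Fails fast instead of letting the reader hit an EOF while decoding.
    static void checkWriter(List<String> accepted, String writerClsName)
            throws IOException {
        if (!accepts(accepted, writerClsName)) {
            throw new IOException("Got unknown writer class: " + writerClsName);
        }
    }

    public static void main(String[] args) throws IOException {
        // Secure reader opening a plain WAL: passes the check.
        checkWriter(SECURE_READER_ACCEPTS, "ProtobufLogWriter");
        try {
            // Plain reader opening a secure WAL: rejected.
            checkWriter(PLAIN_READER_ACCEPTS, "SecureProtobufLogWriter");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

        With a list per reader, a subclass can call the parent's header check and still widen the set of writers it accepts by supplying its own list.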

        Ted Yu added a comment -

        Patch v6 incorporates Andrew's comments.

        There are two tests: testHLogReaderOnSecureHLog and testSecureHLogReaderOnHLog.

        Ted Yu made changes -
        Attachment 11620-v6.txt [ 12659092 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12659092/11620-v6.txt
        against trunk revision .
        ATTACHMENT ID: 12659092

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        +1 site. The mvn site goal succeeds with this patch.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10255//console

        This message is automatically generated.

        Ted Yu made changes -
        Attachment 11620-v6.txt [ 12659152 ]
        Ted Yu added a comment -

        Patch v6 applies to branch-1.
        I have run the test suite for branch-1 on Linux; the result looks good.

        Ted Yu made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Ted Yu added a comment -

        Patch for 0.98

        Running test suite now.

        Ted Yu made changes -
        Attachment 11620-0.98-v6.txt [ 12659165 ]
        Ted Yu added a comment -

        Test suite for 0.98 ran through.

        Ping Enis Soztutar for branch-1

        Andrew Purtell added a comment -

        +1, patch v6

        Please update or remove the comments in testSecureHLogReaderOnHLog on commit, and fix the assert messages. The meaning of all the checks is reversed, but the text hasn't been updated to reflect that.

        Ted Yu added a comment -

        Patch v7 corrects wording in comments and assertion.

        Ted Yu made changes -
        Attachment 11620-v7.txt [ 12659195 ]
        Ted Yu added a comment -

        Counterpart for 0.98

        Ted Yu made changes -
        Attachment 11620-0.98-v7.txt [ 12659198 ]
        Ted Yu added a comment -

        Integrated to 3 branches.

        Will resolve once Jenkins builds come back.

        Thanks for the review, Andrew.

        Ted Yu made changes -
        Hadoop Flags Reviewed [ 10343 ]
        Andrew Purtell added a comment -

        Hope you don't mind that I resolved this now, Ted. Waiting for Jenkins is not a bad idea, but sometimes we forget to go back and close the issue. We can reopen this if there is a problem with a build somewhere.

        Andrew Purtell made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #406 (See https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/406/)
        HBASE-11620 Record the class name of Writer in WAL header so that only proper Reader can open the WAL file (Ted Yu) (tedyu: rev acc5c13f37c7b16058797e81a6ec4769d8335540)

        • hbase-protocol/src/main/protobuf/WAL.proto
        • hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
        • hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
        Hudson added a comment -

        SUCCESS: Integrated in HBase-0.98 #429 (See https://builds.apache.org/job/HBase-0.98/429/)
        HBASE-11620 Record the class name of Writer in WAL header so that only proper Reader can open the WAL file (Ted Yu) (tedyu: rev acc5c13f37c7b16058797e81a6ec4769d8335540)

        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
        • hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
        • hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
        • hbase-protocol/src/main/protobuf/WAL.proto
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK #5361 (See https://builds.apache.org/job/HBase-TRUNK/5361/)
        HBASE-11620 Record the class name of Writer in WAL header so that only proper Reader can open the WAL file (Ted Yu) (tedyu: rev b384c06d35c89642510c097a1afc0228bff774fb)

        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
        • hbase-protocol/src/main/protobuf/WAL.proto
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
        • hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
        • hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-1.0 #80 (See https://builds.apache.org/job/HBase-1.0/80/)
        HBASE-11620 Record the class name of Writer in WAL header so that only proper Reader can open the WAL file (Ted Yu) (tedyu: rev e142961099cda5b3f733cd2239cb22ce150f5c08)

        • hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
        • hbase-protocol/src/main/protobuf/WAL.proto
        • hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
        • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
        Enis Soztutar added a comment -

        Related to this, should we not also write the CellCodec that we use in the WAL header? Right now, the codec comes from the configuration, which means that you cannot read back the WAL files if you change the codec. What do you guys think, shall I open an issue?
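
        A minimal sketch of that proposal, under stated assumptions: the header is modeled as a plain map rather than the protobuf WAL header, and the class CodecHeaderSketch, the key "cell_codec_cls_name", and the codec names "CodecA" and "CodecB" are hypothetical names chosen for the example. It shows a reader preferring the codec recorded at write time and falling back to the configured codec only when the header carries none.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; the map stands in for the real WAL header.
class CodecHeaderSketch {
    // Hypothetical header key for the codec class name.
    static final String CODEC_KEY = "cell_codec_cls_name";

    // Writer side: record the codec in use next to the other header fields.
    static Map<String, String> writeHeader(String codecClsName) {
        Map<String, String> header = new HashMap<>();
        if (codecClsName != null) {
            header.put(CODEC_KEY, codecClsName);
        }
        return header;
    }

    // Reader side: the recorded codec wins over the current configuration,
    // so a later configuration change does not make old files unreadable.
    static String codecForReading(Map<String, String> header, String configuredCodec) {
        String recorded = header.get(CODEC_KEY);
        return recorded != null ? recorded : configuredCodec;
    }

    public static void main(String[] args) {
        Map<String, String> header = writeHeader("CodecA");
        // The configuration now names a different codec than the file was written with.
        System.out.println(codecForReading(header, "CodecB"));
    }
}
```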

        Andrew Purtell added a comment -

        Related to this, should we not also write the CellCodec that we use in the WAL header? Right now, the codec comes from the configuration, which means that you cannot read back the WAL files if you change the codec. What do you guys think, shall I open an issue?

        Sounds good to me

        ramkrishna.s.vasudevan added a comment -

        Related to this, should we not also write the CellCodec that we use in the WAL header? Right now, the codec comes from the configuration, which means that you cannot read back the WAL files if you change the codec.

        +1. Would be really helpful.

        Ted Yu added a comment -

        I logged HBASE-11762 for writing Codec class name in WAL header.

        Initial patch attached.

        Enis Soztutar added a comment -

        Closing this issue after 0.99.0 release.

        Enis Soztutar made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                 Time In Source Status   Execution Times   Last Executer    Last Execution Date
        Open → Patch Available     21h 58m                 1                 Ted Yu           31/Jul/14 19:10
        Patch Available → Open     21h 20m                 1                 Ted Yu           01/Aug/14 16:30
        Open → Resolved            1h 52m                  1                 Andrew Purtell   01/Aug/14 18:23
        Resolved → Closed          204d 6h 7m              1                 Enis Soztutar    21/Feb/15 23:30

          People

          • Assignee:
            Ted Yu
            Reporter:
            Ted Yu
          • Votes:
            0
            Watchers:
            9


              Development