HBase / HBASE-7531

[replication] NPE in SequenceFileLogReader because ReplicationSource doesn't nullify the reader

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.5, 0.95.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Here's an NPE I get about half the time I run TestReplication:

      2012-12-20 08:59:17,259 ERROR [RegionServer:1;192.168.10.135,49168,1356011734418-EventThread.replicationSource,2] regionserver.ReplicationSource$1(727): Unexpected exception in ReplicationSource, currentPath=hdfs://localhost:65533/user/jdcryans/hbase/.logs/192.168.10.135,49168,1356011734418/192.168.10.135%2C49168%2C1356011734418.1356011956626
      java.lang.NullPointerException
              at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.seek(SequenceFileLogReader.java:261)
              at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.seek(ReplicationHLogReaderManager.java:103)
              at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:414)
              at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:332)
      

      The issue happens after an IOException is caught while opening the reader: the reader isn't set to null afterwards, so the rest of the code assumes it is still usable.
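
A minimal sketch of the failure mode described above may help. It assumes two hypothetical classes, `LogReader` and `Source`, which are illustrative stand-ins and not the real SequenceFileLogReader or ReplicationSource code. The `nullifyOnFailure` flag is also an assumption, used only to show the behavior with and without the fix side by side.

```java
import java.io.IOException;

public class StaleReaderSketch {

    /** Hypothetical stand-in for a WAL reader; not the real SequenceFileLogReader. */
    static class LogReader {
        private long[] position = new long[1]; // becomes null once the reader is closed

        LogReader(boolean openable) throws IOException {
            if (!openable) {
                throw new IOException("could not open log file");
            }
        }

        void close() {
            position = null;
        }

        void seek(long pos) {
            position[0] = pos; // throws NullPointerException on a closed reader
        }
    }

    /** Hypothetical stand-in for the replication source that owns the reader. */
    static class Source {
        private final boolean nullifyOnFailure;
        private LogReader reader;

        Source(boolean nullifyOnFailure) {
            this.nullifyOnFailure = nullifyOnFailure;
        }

        /** Returns true if a fresh reader was opened. */
        boolean openReader(boolean openable) {
            try {
                if (reader != null) {
                    reader.close();
                }
                reader = new LogReader(openable);
                return true;
            } catch (IOException e) {
                if (nullifyOnFailure) {
                    reader = null; // the fix: never keep a reader we could not open
                }
                return false;
            }
        }

        /** Later code only checks for null, so a stale reader looks usable. */
        String readNext() {
            if (reader == null) {
                return "skipped";
            }
            reader.seek(0);
            return "read";
        }
    }

    public static void main(String[] args) {
        Source buggy = new Source(false);
        buggy.openReader(true);   // first log opens fine
        buggy.openReader(false);  // next open fails, stale reader is kept
        try {
            buggy.readNext();
        } catch (NullPointerException e) {
            System.out.println("NPE from stale reader");
        }

        Source fixed = new Source(true);
        fixed.openReader(true);
        fixed.openReader(false);  // next open fails, reader is nullified
        System.out.println(fixed.readNext());
    }
}
```

In this sketch the only difference between the two runs is whether the field is cleared in the catch block, which mirrors the one-line nature of the fix described in this issue.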

      1. HBASE-7531.patch
        0.7 kB
        Jean-Daniel Cryans

        Activity

        Jean-Daniel Cryans added a comment -

        Just a simple fix, setting the reader to null if we couldn't get it.

        Sergey Shelukhin added a comment -

        +1. The cause is the dubious semantics of openReader imho (but I may just be unfamiliar with code); sleepMultiplier decision can be in the outside loop and openReader return value meaning can then be simpler.

        Jean-Daniel Cryans added a comment -

        I was able to find one test failure caused by this:

        https://builds.apache.org/job/HBase-0.94/656/testReport/org.apache.hadoop.hbase.replication/TestReplicationWithCompression/testVerifyRepJob/

        The replication thread dies so truncating can't complete.

        > The cause is the dubious semantics of openReader imho

        Yeah, I should probably fold that reader into ReplicationHLogReaderManager somehow.

        stack added a comment -

        +1

        Jean-Daniel Cryans added a comment -

        Committed to trunk and 0.94

        Hudson added a comment -

        Integrated in HBase-TRUNK #3726 (See https://builds.apache.org/job/HBase-TRUNK/3726/)
        HBASE-7530 [replication] Work around HDFS-4380 else we get NPEs
        HBASE-7531 [replication] NPE in SequenceFileLogReader because
        ReplicationSource doesn't nullify the reader
        HBASE-7534 [replication] TestReplication.queueFailover can fail
        because HBaseTestingUtility.createMultiRegions is dangerous (Revision 1431768)

        Result = FAILURE
        jdcryans :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #342 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/342/)
        HBASE-7530 [replication] Work around HDFS-4380 else we get NPEs
        HBASE-7531 [replication] NPE in SequenceFileLogReader because
        ReplicationSource doesn't nullify the reader
        HBASE-7534 [replication] TestReplication.queueFailover can fail
        because HBaseTestingUtility.createMultiRegions is dangerous (Revision 1431768)

        Result = FAILURE
        jdcryans :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        Hudson added a comment -

        Integrated in HBase-0.94 #722 (See https://builds.apache.org/job/HBase-0.94/722/)
        HBASE-7530 [replication] Work around HDFS-4380 else we get NPEs
        HBASE-7531 [replication] NPE in SequenceFileLogReader because
        ReplicationSource doesn't nullify the reader
        HBASE-7534 [replication] TestReplication.queueFailover can fail
        because HBaseTestingUtility.createMultiRegions is dangerous (Revision 1431769)

        Result = SUCCESS
        jdcryans :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        Hudson added a comment -

        Integrated in HBase-0.94-security #95 (See https://builds.apache.org/job/HBase-0.94-security/95/)
        HBASE-7530 [replication] Work around HDFS-4380 else we get NPEs
        HBASE-7531 [replication] NPE in SequenceFileLogReader because
        ReplicationSource doesn't nullify the reader
        HBASE-7534 [replication] TestReplication.queueFailover can fail
        because HBaseTestingUtility.createMultiRegions is dangerous (Revision 1431769)

        Result = SUCCESS
        jdcryans :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        Hudson added a comment -

        Integrated in HBase-0.94-security-on-Hadoop-23 #11 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/11/)
        HBASE-7530 [replication] Work around HDFS-4380 else we get NPEs
        HBASE-7531 [replication] NPE in SequenceFileLogReader because
        ReplicationSource doesn't nullify the reader
        HBASE-7534 [replication] TestReplication.queueFailover can fail
        because HBaseTestingUtility.createMultiRegions is dangerous (Revision 1431769)

        Result = FAILURE
        jdcryans :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java

          People

          • Assignee: Jean-Daniel Cryans
          • Reporter: Jean-Daniel Cryans
          • Votes: 0
          • Watchers: 5
