HBASE-5795: HServerLoad$RegionLoad breaks 0.92<->0.94 compatibility

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.0, 0.95.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      This commit broke our 0.92/0.94 compatibility:

      ------------------------------------------------------------------------
      r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
      
      HBASE-3927 display total uncompressed byte size of a region in web UI
      

      I just tried the new RC for 0.94. I brought up a 0.94 master on a 0.92 cluster and rather than just digest version 1 of the HServerLoad, I get this:

      2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38
      java.io.IOException: Error in readFields
              at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
              at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
              at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
              at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
              at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
              at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
              at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:662)
      Caused by: A record version mismatch occured. Expecting v2, found v1
              at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
              at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
              at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
              at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
              ... 9 more
      
      1. 5795.unittest.txt (23 kB, stack)
      2. 5795-v2.txt (2 kB, Ted Yu)
      3. 5795-v3.txt (26 kB, Ted Yu)

        Activity

        stack added a comment -

        Hmm... It's not HBASE-3927 that broke compatibility; it seems rather to be this one, which changes the RegionLoad VERSION:

        ------------------------------------------------------------------------                
        r1238873 | tedyu | 2012-01-31 16:12:36 -0800 (Tue, 31 Jan 2012) | 2 lines             
                                                                                                     
        HBASE-5256 Use WritableUtils.readVInt() in RegionLoad.readFields() (Mubarak) 
        

        Looking at the patch, it breaks compatibility in a pretty radical way: it changes ints to vints on all RegionLoad members.
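
        A minimal sketch of why switching field encodings breaks older peers. It is not the HBase code: the class and method names are hypothetical, and the variable-length writer is simplified to one byte per value in the range 0..127 (an assumption; larger values need a multi-byte form that is omitted here). It shows that the same two field values occupy a different number of bytes in each layout, so a reader built for one layout misparses the other.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of HBase or Hadoop.
class FixedVsVariableIntDemo {

    // Old-style layout: every field is a fixed 4-byte big-endian int.
    static byte[] writeFixed(int... fields) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int f : fields) {
                out.writeInt(f);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Simplified variable-length layout: one byte per field.
    // Assumption: only values in 0..127 are supported in this sketch.
    static byte[] writeSmallVInts(int... fields) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int f : fields) {
                if (f < 0 || f > 127) {
                    throw new IllegalArgumentException("sketch only handles 0..127");
                }
                out.writeByte(f);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads `count` one-byte fields, as a reader of the simplified new layout would.
    static int[] readSmallVInts(byte[] data, int count) {
        int[] result = new int[count];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                result[i] = in.readByte();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] oldWire = writeFixed(3, 7);
        byte[] newWire = writeSmallVInts(3, 7);
        System.out.println(oldWire.length + " bytes vs " + newWire.length + " bytes");
        // A new-layout reader fed old-layout bytes picks up the leading zero padding.
        int[] misread = readSmallVInts(oldWire, 2);
        System.out.println(misread[0] + ", " + misread[1]);
    }
}
```

        In this sketch the reader does not fail on old-layout bytes; it silently returns wrong values, which is why a version check in front of the fields matters.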

        stack added a comment -

        I'd suggest backing out HBASE-5256. It's a little weird in that it ups the VERSION on the inner class but not on the outer class. It's not a critical fix either, so we could probably do without it in 0.94. Let me try removing it.

        stack added a comment -

        Hmmm... not that easy. This one messes us up too...

        ------------------------------------------------------------------------
        r1239157 | tedyu | 2012-02-01 06:56:20 -0800 (Wed, 01 Feb 2012) | 2 lines
        
        HBASE-5283 Request counters may become negative for heavily loaded regions (Mubarak)
        

        The above commit depends on HBASE-5256. If HBASE-5256 were not in place, this would not break compatibility, but since we have to back out HBASE-5256, it does. Looking...

        stack added a comment -

        Unit test that demonstrates the problem. It brings HServerLoad (HSL) from 0.92 into src/test/java/o.a.h.h and then tries to have a 0.94/trunk HSL deserialize the 0.92 HSL. Currently it fails.

        I'm thinking that we make this work by pressing on: by including in HSL an HSL092 to use when deserializing 0.92 versions, or rather 0.92 versions of HSL#RegionLoad. I think we need to press on because the second patch is a legit fix: it converts requests from int to long to avoid our ever going negative on read/write counts. I don't think we should revert this in 0.94.
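
        A small illustration of the problem the int-to-long change addresses, assuming a plain counter that is incremented per request. The class and method names are hypothetical and not taken from HBase.

```java
// Hypothetical demo class; the names are illustrative, not HBase's.
class RequestCounterOverflowDemo {

    // Adds `delta` requests to a 32-bit counter; wraps silently on overflow.
    static int addAsInt(int counter, int delta) {
        return counter + delta;
    }

    // Same addition on a 64-bit counter.
    static long addAsLong(long counter, int delta) {
        return counter + delta;
    }

    public static void main(String[] args) {
        int busyRegion = Integer.MAX_VALUE;             // 2147483647 requests so far
        System.out.println(addAsInt(busyRegion, 1));    // prints -2147483648
        System.out.println(addAsLong(busyRegion, 1));   // prints 2147483648
    }
}
```

        Widening the field fixes the sign problem but also changes how many bytes the field occupies when written with a fixed-width encoding, which is one reason the fix and the wire format are tied together.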

        Ted Yu added a comment -

        +1 on introducing HSL092 for backward compatibility.

        Ted Yu added a comment -

        Since only deserialization needs special handling, the attached patch adds a private method to read 0.92 RegionLoad.

        Please comment.

        Lars Hofhansl added a comment - edited

        +1 on patch.

        Lars Hofhansl added a comment -

        @Stack: I think you forgot to include the actual test in the patch.

        Lars Hofhansl added a comment -

        Oh, you attached your 5794 patch to this issue. I removed it to avoid confusion. When you get a chance, could you attach your patch here? Then we can make a combined patch with that and Ted's fix.

        stack added a comment -

        unit test

        stack added a comment -

        I looked at Ted's patch. That should do it. I'd say see if it makes the unit test pass. I can test on a cluster tomorrow morning (will also finish my rolling restart and kill of meta on a cluster w/ 1k regions too...)

        Ted Yu added a comment -

        Patch v1 didn't make testHServerLoadVersioning pass.

        Patch v2 does.
        I found that the version of RegionLoad was actually serialized twice in 0.92: first by VersionedWritable.write(), followed by RegionLoad.write().
        In patch v2, I removed the redundant write. readFields92() consumes the second copy of the version.
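
        A sketch of the approach described above, under stated assumptions: the record layout, field names, and class name below are hypothetical and much simpler than the real RegionLoad, and fixed-width fields are used instead of vints. `readOldFields` plays the role that readFields92() has in the patch: it skips the duplicate version byte and then reads the old field widths.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a versioned record; layout and fields are illustrative only.
class RegionLoadCompatSketch {
    static final byte VERSION = 2;

    int stores;         // hypothetical field
    long readRequests;  // hypothetical field: int in the old layout, long in the new

    // Old writer: the version byte goes out twice, then two fixed-width ints.
    static byte[] writeOld(int stores, int readRequests) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(1);  // written by the versioned base class
            out.writeByte(1);  // redundant second copy written by the record itself
            out.writeInt(stores);
            out.writeInt(readRequests);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // New writer: a single version byte and a wider counter.
    static byte[] writeNew(int stores, long readRequests) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(VERSION);
            out.writeInt(stores);
            out.writeLong(readRequests);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reader that accepts both layouts by branching on the first byte.
    static RegionLoadCompatSketch read(byte[] wire) {
        RegionLoadCompatSketch load = new RegionLoadCompatSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            byte version = in.readByte();
            if (version == 1) {
                load.readOldFields(in);
            } else if (version == VERSION) {
                load.stores = in.readInt();
                load.readRequests = in.readLong();
            } else {
                throw new IOException("Unsupported version " + version);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return load;
    }

    // Old-layout path: skip the duplicate version byte, then read the narrow fields.
    private void readOldFields(DataInputStream in) throws IOException {
        in.readByte();
        stores = in.readInt();
        readRequests = in.readInt();  // int widened to long
    }

    public static void main(String[] args) {
        RegionLoadCompatSketch old = read(writeOld(4, 100));
        RegionLoadCompatSketch cur = read(writeNew(4, 5_000_000_000L));
        System.out.println(old.stores + " " + old.readRequests);
        System.out.println(cur.stores + " " + cur.readRequests);
    }
}
```

        Only the read side needs the extra branch; the writer always emits the current layout, which matches the earlier observation that only deserialization needs special handling.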

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522732/5795-v2.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1534//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1534//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1534//console

        This message is automatically generated.

        Lars Hofhansl added a comment -

        @Ted: I don't understand why that redundant write in 0.94 causes any problem. Can you elaborate? Was there another problem in v1?

        Ted Yu added a comment -

        VersionedWritable.readFields() would detect the version mismatch and throw an exception.
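
        A minimal imitation of a strict version check, to illustrate the failure mode. This is not the Hadoop class: the names are hypothetical, and the only behavior assumed is the one visible in the stack trace, namely that the reader compares a leading version byte with the version it expects and throws on a mismatch. Because the check throws before any field is read, a fallback for the old layout has to run instead of this check, not after it.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch of a strict version check; not the Hadoop implementation.
class StrictVersionCheckSketch {
    static final byte EXPECTED_VERSION = 2;

    // Reads the leading version byte and rejects anything but EXPECTED_VERSION.
    static void checkVersion(DataInput in) throws IOException {
        byte found = in.readByte();
        if (found != EXPECTED_VERSION) {
            throw new IOException("Expecting v" + EXPECTED_VERSION + ", found v" + found);
        }
    }

    // Returns the error message, or null when the version is accepted.
    static String tryRead(byte[] wire) {
        try {
            checkVersion(new DataInputStream(new ByteArrayInputStream(wire)));
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRead(new byte[] {1}));  // prints Expecting v2, found v1
        System.out.println(tryRead(new byte[] {2}));  // prints null
    }
}
```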

        stack added a comment -

        v2 works out on a cluster for me

        Ted Yu added a comment -

        Will integrate patch v2 in 4 hours if there is no objection.

        Lars Hofhansl added a comment -

        +1 on v2. Are you integrating v2 with Stack's test?

        Ted Yu added a comment -

        I am open in this regard.
        Since the 0.92 deserialization code would be stable (RegionLoad format in 0.92 shouldn't change), I wonder if manual verification is enough.

        stack added a comment -

        No. Please include the unit test on commit.

        Ted Yu added a comment -

        Patch combining v2 and Stack's test.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522841/5795-v3.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.replication.TestReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1542//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1542//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1542//console

        This message is automatically generated.

        Ted Yu added a comment -

        TestReplication failure isn't related to the patch.

        Integrated patch v3 to 0.94 and trunk.

        Thanks for finding the bug and providing the test, Stack.

        Thanks for the review Stack and Lars.

        Hudson added a comment -

        Integrated in HBase-TRUNK #2769 (See https://builds.apache.org/job/HBase-TRUNK/2769/)
        HBASE-5795 HServerLoad$RegionLoad breaks 0.92<->0.94 compatibility (Revision 1326794)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HServerLoad092.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
        Lars Hofhansl added a comment -

        One down.

        Hudson added a comment -

        Integrated in HBase-0.94 #120 (See https://builds.apache.org/job/HBase-0.94/120/)
        HBASE-5795 HServerLoad$RegionLoad breaks 0.92<->0.94 compatibility (Revision 1326791)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HServerLoad092.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #173 (See https://builds.apache.org/job/HBase-TRUNK-security/173/)
        HBASE-5795 HServerLoad$RegionLoad breaks 0.92<->0.94 compatibility (Revision 1326794)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HServerLoad092.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
        Hudson added a comment -

        Integrated in HBase-0.94-security #13 (See https://builds.apache.org/job/HBase-0.94-security/13/)
        HBASE-5795 HServerLoad$RegionLoad breaks 0.92<->0.94 compatibility (Revision 1326791)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HServerLoad092.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestSerialization.java

          People

          • Assignee: Ted Yu
          • Reporter: stack