HBase
  1. HBase
  2. HBASE-8849

CellCodec should write and read the memstoreTS/mvccVersion

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0
    • Component/s: None
    • Labels:
    • Hadoop Flags:
      Reviewed

      Description

      This JIRA is wrt discussion over in HBASE-8496.
      Cell interface and the corresponding CellCodec provides a new way of serializing the Keyvalues.
      Cell interface supports getMvccVersion() and the javadoc says it could be > 0 if it exists or 0 otherwise.
      But we don't tend to write/read the memstoreTS/mvccVersion in the Cell codec.

      1. HBASE-8849_1.patch
        22 kB
        ramkrishna.s.vasudevan
      2. HBASE-8849_2.patch
        20 kB
        ramkrishna.s.vasudevan
      3. HBASE-8849_3.patch
        3 kB
        ramkrishna.s.vasudevan

        Activity

        Hide
        ramkrishna.s.vasudevan added a comment -

        Should memstoreTS be part of the Cell implementation? Or it should continue the current way where we would write the memstoreTS after the each cell?
        In that case when we try to read the Cell we should have a provision to set the memStoreTS to the cell? Till the underlying format of cell remains KeyValue this may not be needed as KeyValue will have the setter for the memstoreTS.
        Pls provide your suggestion on this.

        Show
        ramkrishna.s.vasudevan added a comment - Should memstoreTS be part of the Cell implementation? Or it should continue the current way where we would write the memstoreTS after the each cell? In that case when we try to read the Cell we should have a provision to set the memStoreTS to the cell? Till the underlying format of cell remains KeyValue this may not be needed as KeyValue will have the setter for the memstoreTS. Pls provide your suggestion on this.
        Hide
        stack added a comment -

        How each codec serializes is up to the code i'd say Ram. If present, codec could serialize the mvcc or not (probably should). Ditto on deserialize. If you want to change the cell codec to include mvcc, go ahead; cellcodec was just a hackup for testing purposes.

        Show
        stack added a comment - How each codec serializes is up to the code i'd say Ram. If present, codec could serialize the mvcc or not (probably should). Ditto on deserialize. If you want to change the cell codec to include mvcc, go ahead; cellcodec was just a hackup for testing purposes.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Patch that allows persisting the memstoreTS as Vlong. Added the encodememstoreTS and decodememstoreTS flags.
        If the codec used is CellCodec for HFileWriterV3 then we may or may not write/read the memstoreTS based on the HFile fileinfo metadata.

        Show
        ramkrishna.s.vasudevan added a comment - Patch that allows persisting the memstoreTS as Vlong. Added the encodememstoreTS and decodememstoreTS flags. If the codec used is CellCodec for HFileWriterV3 then we may or may not write/read the memstoreTS based on the HFile fileinfo metadata.
        Hide
        stack added a comment -

        Just say not to Writables... even WritableUtils.

        The vint done in WritableUtils is bad. See Benoit on this: https://github.com/OpenTSDB/asynchbase/blob/master/src/HBaseRpc.java#L849

        Just do a long for now I'd say ramkrishna.s.vasudevan We can come back later and make a fancier encoding.... should probably do some prefix encoding in here anyways – why not... Can d later.

        Show
        stack added a comment - Just say not to Writables... even WritableUtils. The vint done in WritableUtils is bad. See Benoit on this: https://github.com/OpenTSDB/asynchbase/blob/master/src/HBaseRpc.java#L849 Just do a long for now I'd say ramkrishna.s.vasudevan We can come back later and make a fancier encoding.... should probably do some prefix encoding in here anyways – why not... Can d later.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Just say not to Writables... even WritableUtils.

        Ok will change and update the patch. Will keep it simple for now.

        Show
        ramkrishna.s.vasudevan added a comment - Just say not to Writables... even WritableUtils. Ok will change and update the patch. Will keep it simple for now.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Updated patch.

        Show
        ramkrishna.s.vasudevan added a comment - Updated patch.
        Hide
        stack added a comment -

        You are going to get yourself in to trouble here Ram if you do this:

        + CellEncoder(final OutputStream out, boolean encodeMemstoreTS) {

        ...giving folks option of encoding mvcc (should be called mvcc not memstorets).

        What happens if the flag was set encoding but not on decoding? It'll blow up?

        The style should be more MvccCellCodec for the codec that includes mvcc and another NoMvccCellCodec for the codec that does not. For now I'd say just under the wraps have CellCodec encode the mvcc.

        Show
        stack added a comment - You are going to get yourself in to trouble here Ram if you do this: + CellEncoder(final OutputStream out, boolean encodeMemstoreTS) { ...giving folks option of encoding mvcc (should be called mvcc not memstorets). What happens if the flag was set encoding but not on decoding? It'll blow up? The style should be more MvccCellCodec for the codec that includes mvcc and another NoMvccCellCodec for the codec that does not. For now I'd say just under the wraps have CellCodec encode the mvcc.
        Hide
        ramkrishna.s.vasudevan added a comment -

        The style should be more MvccCellCodec for the codec that includes mvcc and another NoMvccCellCodec for the codec that does not. For now I'd say just under the wraps have CellCodec encode the mvcc.

        I agree with you on this. Infact individual codecs should take care of handling mvcc. But how should the read path be in my HFileReader? Should i have 2 paths where if a codec that does not handle mvcc with in itself should be handled specifically and the codec that handles mvcc within itself should be handled specifically? This will make things complex.
        With this codec based approach we should stick to a strategy wrt mvcc otherwise the Read path (infact even write path) will have to handle them. If we are ok on this then i can make a base class saying MvccCellCodec and NoMvccCellCodec - so any new CellCodec would be inheriting either one of them and in the read/write path we need to switch based on what codec is used?
        Does this make sense?

        Show
        ramkrishna.s.vasudevan added a comment - The style should be more MvccCellCodec for the codec that includes mvcc and another NoMvccCellCodec for the codec that does not. For now I'd say just under the wraps have CellCodec encode the mvcc. I agree with you on this. Infact individual codecs should take care of handling mvcc. But how should the read path be in my HFileReader? Should i have 2 paths where if a codec that does not handle mvcc with in itself should be handled specifically and the codec that handles mvcc within itself should be handled specifically? This will make things complex. With this codec based approach we should stick to a strategy wrt mvcc otherwise the Read path (infact even write path) will have to handle them. If we are ok on this then i can make a base class saying MvccCellCodec and NoMvccCellCodec - so any new CellCodec would be inheriting either one of them and in the read/write path we need to switch based on what codec is used? Does this make sense?
        Hide
        ramkrishna.s.vasudevan added a comment -

        Ideally when we start using the codec in the HFiles, the code for reading the KeyValue should be something like below

        private final void readKeyValueLen() {
              try {
                // TODO : Can we specify the max value here
                decoder.mark();
                decoder.advance();
                currKV = decoder.current();
                decoder.reset();
              } catch (IOException e) {
                throw new RuntimeException("Error while reading the keyvalue len", e);
              }
            }
        

        The above code reads the current cell and resets the position in the byte buffer. Now if i need to handle mvcc, after this we have to read the mvcc explicitly. Pls provide your opinion on this.

        Show
        ramkrishna.s.vasudevan added a comment - Ideally when we start using the codec in the HFiles, the code for reading the KeyValue should be something like below private final void readKeyValueLen() { try { // TODO : Can we specify the max value here decoder.mark(); decoder.advance(); currKV = decoder.current(); decoder.reset(); } catch (IOException e) { throw new RuntimeException( "Error while reading the keyvalue len" , e); } } The above code reads the current cell and resets the position in the byte buffer. Now if i need to handle mvcc, after this we have to read the mvcc explicitly. Pls provide your opinion on this.
        Hide
        stack added a comment -

        Here is a suggestion. I'd think that an hfile would have blocks all of the same codec? The codec used writing would be metadata on the hfile. You'd read what codec to use before you started pulling in blocks. The file would be wrriten w/ a codec that included mvcc or it could be written with a codec that did not include mvcc?

        Show
        stack added a comment - Here is a suggestion. I'd think that an hfile would have blocks all of the same codec? The codec used writing would be metadata on the hfile. You'd read what codec to use before you started pulling in blocks. The file would be wrriten w/ a codec that included mvcc or it could be written with a codec that did not include mvcc?
        Hide
        ramkrishna.s.vasudevan added a comment -

        The codec used writing would be metadata on the hfile. You'd read what codec to use before you started pulling in blocks

        Yes true. We will be able to do this, should not be a problem.

        The file would be wrriten w/ a codec that included mvcc or it could be written with a codec that did not include mvcc?

        Hmmm.. Yes.. We can have this. But the read and write flow should be able to know about this and the framework that will be using the codec should take the decision of how to handle the mvcc. Am i making it clear?
        So the framework should be intelligent enough to know this and work based on that. Which i thought would make the read/write path to have additional code incase mvcc is not handled inside the codec.

        Show
        ramkrishna.s.vasudevan added a comment - The codec used writing would be metadata on the hfile. You'd read what codec to use before you started pulling in blocks Yes true. We will be able to do this, should not be a problem. The file would be wrriten w/ a codec that included mvcc or it could be written with a codec that did not include mvcc? Hmmm.. Yes.. We can have this. But the read and write flow should be able to know about this and the framework that will be using the codec should take the decision of how to handle the mvcc. Am i making it clear? So the framework should be intelligent enough to know this and work based on that. Which i thought would make the read/write path to have additional code incase mvcc is not handled inside the codec.
        Hide
        Anoop Sam John added a comment -

        Handling the mvcc, I would say we need handle that fully in the Codec Encoder/Decoder. Let the encoder decide what to do with the mvcc. Might not be good to pass it as a boolean arg or so.

        Show
        Anoop Sam John added a comment - Handling the mvcc, I would say we need handle that fully in the Codec Encoder/Decoder. Let the encoder decide what to do with the mvcc. Might not be good to pass it as a boolean arg or so.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Simple patch. Also writes mvcc. Its better we write mvcc.
        @Stack
        Pls provide your comments.

        Show
        ramkrishna.s.vasudevan added a comment - Simple patch. Also writes mvcc. Its better we write mvcc. @Stack Pls provide your comments.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12593016/HBASE-8849_3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        +1 site. The mvn site goal succeeds with this patch.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593016/HBASE-8849_3.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified tests. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6401//console This message is automatically generated.
        Hide
        stack added a comment -

        Should CellUtil let you add tags too? That can be in new patch. +1 on this one.

        Show
        stack added a comment - Should CellUtil let you add tags too? That can be in new patch. +1 on this one.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Committed to trunk i.e. 0.98.

        Show
        ramkrishna.s.vasudevan added a comment - Committed to trunk i.e. 0.98.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Thanks for the review Stack.

        Show
        ramkrishna.s.vasudevan added a comment - Thanks for the review Stack.
        Hide
        Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK #4348 (See https://builds.apache.org/job/HBase-TRUNK/4348/)
        HBASE-8849 - CellCodec should write and read the memstoreTS/mvccVersion (Ram) (ramkrishna: rev 1511043)

        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/codec/CellCodec.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/codec/TestCellCodec.java
        Show
        Hudson added a comment - SUCCESS: Integrated in HBase-TRUNK #4348 (See https://builds.apache.org/job/HBase-TRUNK/4348/ ) HBASE-8849 - CellCodec should write and read the memstoreTS/mvccVersion (Ram) (ramkrishna: rev 1511043) /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/codec/CellCodec.java /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/codec/TestCellCodec.java
        Hide
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #655 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/655/)
        HBASE-8849 - CellCodec should write and read the memstoreTS/mvccVersion (Ram) (ramkrishna: rev 1511043)

        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/codec/CellCodec.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/codec/TestCellCodec.java
        Show
        Hudson added a comment - FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #655 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/655/ ) HBASE-8849 - CellCodec should write and read the memstoreTS/mvccVersion (Ram) (ramkrishna: rev 1511043) /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/codec/CellCodec.java /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/codec/TestCellCodec.java

          People

          • Assignee:
            ramkrishna.s.vasudevan
            Reporter:
            ramkrishna.s.vasudevan
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development