Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.94.0
    • Fix Version/s: 0.94.0
    • Component/s: mapreduce
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Parent allows keeping deleted rows around. Would be nice if those could be exported and imported as well.
      All the building blocks are there.

      1. 4682-v2.txt
        13 kB
        Lars Hofhansl
      2. 4682-v1.txt
        6 kB
        Lars Hofhansl

        Activity

        Lars Hofhansl added a comment -

        Have a patch for this in place, will test a bit. Change will be simple (all the heavy lifting is in parent).

        Lars Hofhansl added a comment - - edited

        Straightforward patch.

        I found there's actually a bug in the Scan(Scan) constructor, which did not copy the passed Scan's attributes.

        Will add some tests soon, but want to get a test run in tonight.
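        The Scan(Scan) bug mentioned above can be illustrated with a minimal sketch (simplified stand-in classes, not HBase's actual Scan API): if the copy constructor forgets to copy the attribute map, settings stored as attributes are silently dropped from the copy.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the copy-constructor fix: the copy must also
// carry over the source scan's attribute map, otherwise attributes
// (e.g. a raw-scan marker) are lost on copy.
class ScanSketch {
    private final Map<String, byte[]> attributes = new HashMap<>();

    ScanSketch() {}

    // Fixed copy constructor: copies the passed scan's attributes too.
    ScanSketch(ScanSketch other) {
        this.attributes.putAll(other.attributes);
    }

    void setAttribute(String name, byte[] value) {
        attributes.put(name, value);
    }

    byte[] getAttribute(String name) {
        return attributes.get(name);
    }
}
```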

        Ted Yu added a comment -

        In the new Delete ctor, row key should be included in the IOE.
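        A hedged sketch of Ted's suggestion (simplified stand-in, not the actual HBase Delete class): when the constructor rejects a KeyValue that is not a delete marker, the thrown IOException should carry the row key so the offending row can be identified.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Sketch of a Delete-style constructor that includes the row key in
// the IOException message when handed a non-delete KeyValue.
class DeleteCtorSketch {
    final byte[] row;

    DeleteCtorSketch(byte[] row, boolean isDeleteMarker) throws IOException {
        if (!isDeleteMarker) {
            // Naming the row makes the failure actionable during an import job.
            throw new IOException("KeyValue is not a delete marker, row="
                    + new String(row, StandardCharsets.UTF_8));
        }
        this.row = row;
    }
}
```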

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12506578/4682-v1.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 javadoc. The javadoc tool appears to have generated -160 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 74 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/474//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/474//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/474//console

        This message is automatically generated.

        Lars Hofhansl added a comment -

        @Ted: Will add that. Also, instead of ignoring TableInputFormat.SCAN_COLUMN_FAMILY when keep-deleted-cells is specified, I should probably just let it pass and have it fail with an IOException soon after.

        Lars Hofhansl added a comment -

        I did some more testing with this and everything works fine.
        Apparently there are no tests for import/export anywhere, so I am adding a test for existing functionality as well.

        Lars Hofhansl added a comment -

        Addressed Ted's comment.
        Added an Import/Export test (there were NO tests for Import/Export before)

        stack added a comment -

        Nice test. For sure it's medium sized? Takes < 50 seconds?

        This bit seems a little messy:

        -        put.add(kv);
        +        if (kv.isDelete()) {
        +          context.write(key, new Delete(kv));
        +        } else {
        +          if (put == null) { 
        +            put = new Put(key.get());
        +          }
        +          put.add(kv);
        +        }
               }
        -      return put;
        +      if (put != null) {
        +        context.write(key, put);
        +      }
        

        We call context.write if it's a delete, but for a put we make a Put instance and then do the context.write later... can they not both do context.write in the same place, or share the context.write invocation (if context.write takes a Mutation)?

        Lars Hofhansl added a comment -

        Heh... It takes 46 secs on my machine. Can make it large, too.

        As for the messiness... The problem is that Delete has strange rules about what you can do in a single Delete (when you add a deleteFamily, all prior deleteColumn(s) are removed, etc.). So deletes have to be written one-by-one to be correct (that is also why I added the Delete(kv) constructor rather than a delete(kv) method).

        A put, on the other hand, allows adding all put-related KVs of the same row into a single Put object. That's why delete-kvs are issued immediately, and put-kvs are collected into a single Put (that's also what the existing code does).
        It is possible that a particular row has only delete markers, so the Put is created when needed, and written to HBase only when there were any put-kvs.
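        The write pattern described in this comment can be sketched with simplified stand-in types (not the actual HBase mapreduce classes): each delete marker is emitted as its own Delete to respect Delete's one-marker-at-a-time semantics, while put KVs of the row are batched into one lazily-created Put.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Import mapper's write pattern: deletes go out
// one-by-one, puts are collected and written once per row, and the
// Put is created lazily in case the row holds only delete markers.
class WritePatternSketch {
    record KV(String qualifier, boolean isDelete) {}

    static List<String> writeRow(List<KV> kvs) {
        List<String> written = new ArrayList<>(); // stands in for context.write
        List<String> put = null;                  // lazily-created Put
        for (KV kv : kvs) {
            if (kv.isDelete()) {
                // One Delete per marker keeps Delete's semantics correct.
                written.add("Delete(" + kv.qualifier() + ")");
            } else {
                if (put == null) put = new ArrayList<>();
                put.add(kv.qualifier());
            }
        }
        // Write the Put only when the row actually had put KVs.
        if (put != null) written.add("Put" + put);
        return written;
    }
}
```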

        stack added a comment -

        Thanks for the explanation. +1 on commit.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12506658/4682-v2.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -160 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 74 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/476//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/476//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/476//console

        This message is automatically generated.

        Lars Hofhansl added a comment -

        Committed to trunk

        Lars Hofhansl added a comment -

        Thanks for the review, stack (I added comments to the pieces in Import).

        Hudson added a comment -

        Integrated in HBase-TRUNK #2532 (See https://builds.apache.org/job/HBase-TRUNK/2532/)
        HBASE-4682 Support deleted rows using Import/Export (Lars H)

        larsh :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Export.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #26 (See https://builds.apache.org/job/HBase-TRUNK-security/26/)
        HBASE-4682 Support deleted rows using Import/Export (Lars H)

        larsh :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Export.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java

          People

          • Assignee:
            Lars Hofhansl
            Reporter:
            Lars Hofhansl
          • Votes:
            0
            Watchers:
            2

            Dates

            • Created:
              Updated:
              Resolved:

              Development