HBASE-3996

Support multiple tables and scanners as input to the mapper in map/reduce jobs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.5, 0.95.0
    • Component/s: mapreduce
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Adds MultiTableInputFormat.

      Usage example:

      {code}
      Scan scan1 = new Scan();
      scan1.setStartRow(start1);
      scan1.setStopRow(end1);
      Scan scan2 = new Scan();
      scan2.setStartRow(start2);
      scan2.setStopRow(end2);
      MultiTableInputCollection mtic = new MultiTableInputCollection();
      mtic.add(tableName1, scan1);
      mtic.add(tableName2, scan2);
      TableMapReduceUtil.initTableMapperJob(mtic, TestTableMapper.class, Text.class, IntWritable.class, job1);
      {code}

      Description

      It seems that in many cases, feeding data from multiple tables, or from multiple scanners over a single table, can save a lot of time when running map/reduce jobs.
      I propose a new MultiTableInputFormat class that would allow this.
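As a rough illustration of the mechanics such an input format needs, here is a minimal, self-contained sketch (not the HBase API; all class and method names are hypothetical) of how multiple (table, scan) pairs could be turned into per-region splits:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the core idea behind a multi-table input format:
// each (table, scan) pair contributes one split per region whose key range
// overlaps the scan's [start, stop) range. Names here are illustrative only.
public class MultiScanSplitSketch {

  /** One split: the table to read and the row-key range within a single region. */
  public static final class Split {
    public final String table, start, stop;
    Split(String table, String start, String stop) {
      this.table = table; this.start = start; this.stop = stop;
    }
  }

  /**
   * Intersects one scan range with a table's region boundaries. An empty string
   * stands for an unbounded stop row, mirroring HBase's empty byte[] convention.
   */
  public static List<Split> splitsFor(String table, String scanStart, String scanStop,
                                      String[] regionStarts, String[] regionStops) {
    List<Split> splits = new ArrayList<>();
    for (int i = 0; i < regionStarts.length; i++) {
      String lo = max(scanStart, regionStarts[i]);
      String hi = min(scanStop, regionStops[i]);
      if (hi.isEmpty() || lo.compareTo(hi) < 0) { // keep non-empty intersections only
        splits.add(new Split(table, lo, hi));
      }
    }
    return splits;
  }

  private static String max(String a, String b) { return a.compareTo(b) >= 0 ? a : b; }

  private static String min(String a, String b) {
    if (a.isEmpty()) return b; // empty string = unbounded stop row
    if (b.isEmpty()) return a;
    return a.compareTo(b) <= 0 ? a : b;
  }
}
```

For example, a scan over ["c", "t") on a table with regions ["", "m") and ["m", "") would yield two splits, ["c", "m") and ["m", "t"), one per overlapping region.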

      1. HBase-3996.patch
        66 kB
        Eran Kutner
      2. 3996-v2.txt
        43 kB
        Ted Yu
      3. 3996-v3.txt
        42 kB
        Ted Yu
      4. 3996-v4.txt
        40 kB
        Ted Yu
      5. 3996-v5.txt
        40 kB
        Eran Kutner
      6. 3996-v6.txt
        40 kB
        Ted Yu
      7. 3996-v7.txt
        42 kB
        Ted Yu
      8. 3996-v8.txt
        32 kB
        Bryan Baugher
      9. 3996-v9.txt
        32 kB
        Bryan Baugher
      10. 3996-v10.txt
        33 kB
        Bryan Baugher
      11. 3996-v11.txt
        33 kB
        Bryan Baugher
      12. 3996-v12.txt
        33 kB
        Bryan Baugher
      13. 3996-v13.txt
        33 kB
        Bryan Baugher
      14. 3996-v14.txt
        33 kB
        Bryan Baugher
      15. 3996-0.94.txt
        33 kB
        Lars Hofhansl
      16. 3996-v15.txt
        34 kB
        Lars Hofhansl

          Activity

          Hudson added a comment -

          Integrated in HBase-0.94-security-on-Hadoop-23 #11 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/11/)
          HBASE-3996 Support multiple tables and scanners as input to the mapper in map/reduce jobs (Eran Kutner, Bryan Baugher) (Revision 1441709)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSplit.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #390 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/390/)
          HBASE-3996 Support multiple tables and scanners as input to the mapper in map/reduce jobs (Eran Kutner, Bryan Baugher) (Revision 1441708)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSplit.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
          Hudson added a comment -

          Integrated in HBase-0.94 #822 (See https://builds.apache.org/job/HBase-0.94/822/)
          HBASE-3996 Support multiple tables and scanners as input to the mapper in map/reduce jobs (Eran Kutner, Bryan Baugher) (Revision 1441709)

          Result = SUCCESS
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSplit.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
          Hudson added a comment -

          Integrated in HBase-TRUNK #3846 (See https://builds.apache.org/job/HBase-TRUNK/3846/)
          HBASE-3996 Support multiple tables and scanners as input to the mapper in map/reduce jobs (Eran Kutner, Bryan Baugher) (Revision 1441708)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSplit.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12567721/3996-v15.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4304//console

          This message is automatically generated.

          Hudson added a comment -

          Integrated in HBase-0.94-security #106 (See https://builds.apache.org/job/HBase-0.94-security/106/)
          HBASE-3996 Support multiple tables and scanners as input to the mapper in map/reduce jobs (Eran Kutner, Bryan Baugher) (Revision 1441709)

          Result = SUCCESS
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSplit.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
          Lars Hofhansl added a comment -

          After 18 months, this is finally committed to 0.94 and 0.96.
          Thanks for the patch Eran and Bryan, and thanks for the persistence.

          Lars Hofhansl added a comment -

          What I really will commit (includes the new files)

          Lars Hofhansl added a comment -

          What I am going to commit.
          v15 fixes an issue with the logger in TableSplit.java; also attaching a 0.94 version.

          Lars Hofhansl added a comment -

          Thanks Stack. Will commit in a few.

          stack added a comment -

          Lars Hofhansl Fine by me.

          Lars Hofhansl added a comment -

          Last bigger issue holding up the 0.94.5 RC.

          Lars Hofhansl added a comment (edited) -

          Stack, you're on board with that?

          Ted Yu added a comment -

          I'd say we commit this.

          +1

          Lars Hofhansl added a comment -

          OK... I won't be able to.

          I'd say we commit this.
          It is mostly new code, and it is a widely requested feature. If we find bugs, we will weed them out as we find them. Thoughts?
          (We'll be using/testing this at Salesforce as we go further down the backup/restore path for multiple tables)

          Lars Hofhansl added a comment -

          I might be able to deploy this on a test cluster to try tomorrow.

          Bryan Baugher added a comment -

          No, I have not. We primarily run CDH4 here (currently at 0.92.1).

          stack added a comment -

          Sorry.

          TableSplit is still a Writable, even in trunk. We'll need to fix that (outside the scope of this patch, though).

          I took a quick look over the patch. It looks good to me. Wholesome stuff. Bryan Baugher, have you used it outside of the unit test? Does it work for you?

          If so, I'm +1.

          Lars Hofhansl added a comment -

          Stack Ping

          Lars Hofhansl added a comment -

          Seems this is good to go. Stack, you had some concerns initially; have they all been addressed?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12566013/3996-v14.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4133//console

          This message is automatically generated.

          Bryan Baugher added a comment -

          Fixed Findbugs error in MultiTableInputFormatBase.

          Ted Yu added a comment -

          If you search https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.xml for MultiTableInputFormatBase, you would see the following:

          <BugInstance type="DMI_INVOKING_TOSTRING_ON_ARRAY" priority="2" abbrev="USELESS_STRING" category="CORRECTNESS">
          <Class classname="org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase">
          <SourceLine classname="org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase" start="47" end="214" sourcefile="MultiTableInputFormatBase.java" sourcepath="org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java"/>
          </Class>
          </BugInstance>

          Please address the above warning.
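For illustration, a minimal self-contained example of what this Findbugs pattern flags and one way to fix it (hypothetical class name; not the patch's actual code):

```java
import java.util.Arrays;

// Illustrative reproduction of DMI_INVOKING_TOSTRING_ON_ARRAY: calling toString()
// on an array (e.g. an HBase byte[] row key) yields a type-and-hash string like
// "[B@1b6d3586" rather than the contents. The fix is to format the elements
// explicitly, for example with Arrays.toString().
public class ArrayToStringSketch {
  public static String wrong(byte[] row) {
    return "split at " + row; // implicit row.toString(): useless in a log message
  }

  public static String right(byte[] row) {
    return "split at " + Arrays.toString(row); // prints the actual byte values
  }
}
```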

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12565984/3996-v13.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4126//console

          This message is automatically generated.

          Bryan Baugher added a comment -

          Attached latest patch addressing review comments.

          Ted Yu added a comment -

          @Bryan:
          Can you attach the latest patch to this issue?

          Thanks

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12565806/3996-v12.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4108//console

          This message is automatically generated.

          Bryan Baugher added a comment -

          Done, https://reviews.apache.org/r/9042/diff/
          Ted Yu added a comment -

          @Bryan:
          Can you upload the patch to review board?

          Thanks
          Bryan Baugher added a comment -

          I believe there are two possible questions left unanswered, as well as some +1's still needed:

          • The changes to TableSplit would not allow a new version of it to be deserialized by an old server. Is that OK for an M/R job?
          • It has been suggested to scope this to scans (of a single table) rather than multiple tables.
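The TableSplit compatibility concern above stems from the usual Writable versioning idiom: the new format writes a negative version marker where the old format wrote a non-negative value, so a new reader can detect either format, but an old reader misreads the marker. A minimal self-contained sketch of that pattern (hypothetical class and field names, plain java.io — not the actual TableSplit code):

```java
import java.io.*;

// Hypothetical versioned split: a negative marker distinguishes the new
// serialization format from legacy data, which always starts non-negative.
class VersionedSplit {
    static final int VERSION = -1;   // negative so it cannot be legacy data

    String tableName;                // field added by the new format
    String startRow;

    void write(DataOutput out) throws IOException {
        out.writeInt(VERSION);       // an old reader would misread this int
        out.writeUTF(tableName);
        out.writeUTF(startRow);
    }

    void readFields(DataInput in) throws IOException {
        int marker = in.readInt();
        if (marker >= 0) {
            // Legacy format: the int was real data, not a version marker.
            // A full implementation would fall back to the old parse here.
            throw new IOException("legacy format not handled in this sketch");
        }
        tableName = in.readUTF();
        startRow = in.readUTF();
    }
}

public class Demo {
    public static void main(String[] args) throws IOException {
        VersionedSplit s = new VersionedSplit();
        s.tableName = "t1";
        s.startRow = "row-a";
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        s.write(new DataOutputStream(bos));

        VersionedSplit r = new VersionedSplit();
        r.readFields(new DataInputStream(
            new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(r.tableName + " " + r.startRow);  // t1 row-a
    }
}
```

This makes the trade-off concrete: a new reader handles both formats, but bytes written by the new format are not intelligible to an old reader — which is exactly the question raised for a mixed-version M/R job.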
          Bryan Baugher added a comment -

          Updated to the latest trunk, which had a conflict.
          stack added a comment -

          Marking critical so it gets review. This is a popular request. Let's try to get it in.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563581/3996-v11.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.master.TestMasterFailover

          -1 core zombie tests. There are 6 zombie test(s):

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3909//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563581/3996-v11.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:

          -1 core zombie tests. There are 7 zombie test(s): at org.apache.hadoop.hbase.catalog.TestCatalogTracker.testServerNotRunningIOException(TestCatalogTracker.java:250)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3906//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563581/3996-v11.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3902//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563581/3996-v11.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplicationWithCompression
          org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
          org.apache.hadoop.hbase.replication.TestReplication

          -1 core zombie tests. There are 6 zombie test(s):

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3901//console

          This message is automatically generated.

          Bryan Baugher added a comment -

          Fixed the test error and line length, and added a stability annotation.

          I was finally able to get the test to run, so I may try to clean up / add more tests in the meantime.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563366/3996-v10.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3867//console

          This message is automatically generated.

          Ted Yu added a comment -
          +@InterfaceAudience.Public
          +public class MultiTableInputFormat extends MultiTableInputFormatBase implements
          

          When the audience is public, please add a stability annotation.

          Regarding the Version enum: previously only HLogKey used it, and I think the enum would not stay in sync between HLogKey and TableSplit. See the following from HLogKey:

              COMPRESSED(-2);
          

          So we can keep separate enums for now.

          Running TestMultiTableInputFormat, I saw several test failures.

            <testcase time="112.065" classname="org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat" name="testScanEmptyToEmpty">
              <failure type="java.lang.AssertionError">java.lang.AssertionError
            at org.junit.Assert.fail(Assert.java:92)
            at org.junit.Assert.assertTrue(Assert.java:43)
            at org.junit.Assert.assertTrue(Assert.java:54)
            at org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat.testScan(TestMultiTableInputFormat.java:252)
            at org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat.testScanEmptyToEmpty(TestMultiTableInputFormat.java:177)
          

          TestTableInputFormat passed locally.

          Here is OS info:

          Darwin TYus-MacBook-Pro.local 12.2.1 Darwin Kernel Version 12.2.1: Thu Oct 18 12:13:47 PDT 2012; root:xnu-2050.20.9~1/RELEASE_X86_64 x86_64

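For context on the enum discussion above: HLogKey's Version enum maps each serialization format to an int code (e.g. COMPRESSED(-2)) that is written to the wire, and keeping TableSplit's versions in a separate enum avoids coupling the two code spaces. A self-contained sketch of that pattern (hypothetical names, not the actual HBase enum):

```java
// Hypothetical version enum in the style of HLogKey.Version: each member
// carries the int code that appears on the wire.
enum SplitVersion {
    UNVERSIONED(0),
    INITIAL(-1);

    final int code;

    SplitVersion(int code) { this.code = code; }

    // Map a wire code back to an enum member. Unknown (non-negative) codes
    // fall back to UNVERSIONED: the int is treated as legacy data, not a
    // version marker.
    static SplitVersion fromCode(int code) {
        for (SplitVersion v : values()) {
            if (v.code == code) return v;
        }
        return UNVERSIONED;
    }
}

public class Demo {
    public static void main(String[] args) {
        System.out.println(SplitVersion.fromCode(-1));  // INITIAL
        System.out.println(SplitVersion.fromCode(42));  // UNVERSIONED
    }
}
```

Because each class interprets its own code space, HLogKey can add members like COMPRESSED(-2) without TableSplit's versions having to track them — which is the rationale for keeping the enums separate.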
          Hide
          Bryan Baugher added a comment -

          Well that would be because I forgot to include my changes to Scan. Done.

          Ted Yu added a comment -

          Patch v9 gave me:

          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase-server: Compilation failure
          [ERROR] /Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java:[114,47] cannot find symbol
          [ERROR] symbol  : variable SCAN_ATTRIBUTES_TABLE_NAME
          [ERROR] location: class org.apache.hadoop.hbase.client.Scan
          
          Bryan Baugher added a comment -

          Well, that's not a great way to start off. Let's try this again.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563355/3996-v8.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3863//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Thanks Bryan. Sorry I did not get to it as promised.

          Bryan Baugher added a comment -

          I would like to offer to finish this issue. If you would rather close this issue or start a new one, that is fine; just let me know.

          Here is what I have done since the previous version:

          • Removed random formatting changes
          • Removed table.close() in TableRecordReaderImpl
          • Replaced MultiTableInputCollection with List<Scan>
          • Brought up to date with trunk

          Remaining questions:

          • Since most of the enum Version code is copied, we may want to factor the base enum out into its own class. Would org.apache.hadoop.hbase.util be a good namespace for the enum class?
          • The changes to TableSplit would not allow a new version of it to be deserialized by an old server. Is that OK for a M/R job?
          • It has been mentioned to scope this to scans (of a single table) rather than multiple tables.

          I can't seem to get the tests to run for me (I'm getting OOM errors), but I would imagine most everything still works.

          Lars Hofhansl added a comment -

          Moving on

          Shawn Quinn added a comment -

          So, this is something we'd really like to start using here as well, since we're trying to stay within the released HBase APIs (we're currently using a custom TableInputFormatBase extension, which hasn't been ideal). Based on the comments here and the references to this ticket on the mailing list, it appears there's a good amount of interest in this enhancement. I've monkeyed with a few things within the HBase code locally, but haven't yet tried to submit a patch.

          Lars/Stack, if you let me know you wouldn't mind another person's contribution being added to the mix here, I'd be glad to give this one a go and submit an updated patch. I don't want to cause you guys any headaches if adding another person into the mix is just going to complicate or slow this one down, though.

          Lars Hofhansl added a comment -

          Oh well... Probably not getting to it.

          Lars Hofhansl added a comment -

          I'm going to see if I can finish this in the next few days.

          Lars Hofhansl added a comment -

          Let's change the naming and get it done. Eran and Bohdan seem to be MIA.
          I'd volunteer to finish this.

          Lars Hofhansl added a comment -

          Moving on.

          stack added a comment -

          Yes, packaging/naming issues.

          Lars Hofhansl added a comment -

          @Eran and Bohdan: Are you still interested in finishing this?
          @Stack: So these are naming and package issues? Or would you restructure the code?

          stack added a comment -

          @Stack: Could you make sure that your comments are addressed?

          They haven't been. I can't get past the first class. It implements Iterable, but it looks like it's also a Collection without implementing the Collection interfaces. And then it talks about being an Input – both in the wrapper class and the internal TableInputConf (which is a holder for an HTable and a table name only... no conf, no input) – but it has nought to do w/ MR Input.

          How do we want to proceed? Want me to review fully the last patch put up?

          Lars Hofhansl added a comment -

          Looking at it again and reviewing the comments and the latest version on RB, this looks good. Not sure why it got stuck.

          A remaining question is 0.94 or not. The changes to TableSplit would not allow a new version of it to be deserialized by an old server. Is that OK for an M/R job?
          Also, the comment I had about that extra table.close in TableRecordReaderImpl.java: if that is a bug, I would prefer to handle it in a separate jira (unless other changes here necessitate this close, but I do not think so).

          @Stack: Could you make sure that your comments are addressed?

          Lars Hofhansl added a comment -

          Somehow I missed this (probably because of HBaseCon and vacation in June). Apologies for that.
          Let's finish this and get it in.

          I'll look at the patch again tomorrow.

          Patrick Yu added a comment -

          @Ming Ma
          I'm not so sure about multi-table inputs, but multi-scan is very useful in cases where the row keys are prefixed with a salt value in order to avoid the hot region problem. For example, if the row keys are like byte(0-63) + actual timestamp, then using one scan with the specific prefix per region (map task) would be less expensive than using very complicated filters for the same purpose.

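          The salted-key pattern described above can be sketched without any HBase dependency: one narrow scan range per salt bucket replaces one full-table scan with complicated filters. The bucket count (64) matches the byte(0-63) example; the class name and 1-byte-salt + 8-byte-timestamp key layout are illustrative assumptions, not HBase code:

```java
import java.util.ArrayList;
import java.util.List;

public class SaltedScanRanges {
    // Builds one [start, stop) row-key range per salt bucket.
    // Assumed row-key layout: 1 salt byte followed by an 8-byte big-endian timestamp.
    static List<byte[][]> ranges(int buckets, long fromTs, long toTs) {
        List<byte[][]> result = new ArrayList<>();
        for (int salt = 0; salt < buckets; salt++) {
            result.add(new byte[][] { key(salt, fromTs), key(salt, toTs) });
        }
        return result;
    }

    static byte[] key(int salt, long ts) {
        byte[] k = new byte[9];
        k[0] = (byte) salt;
        for (int i = 0; i < 8; i++) {
            k[1 + i] = (byte) (ts >>> (8 * (7 - i))); // big-endian timestamp bytes
        }
        return k;
    }

    public static void main(String[] args) {
        // 64 buckets over one time window: 64 narrow scans instead of one filtered full scan.
        List<byte[][]> r = ranges(64, 1000L, 2000L);
        System.out.println(r.size() + " ranges, bucket 5 starts with salt " + r.get(5)[0][0]);
    }
}
```

          Each start/stop pair would then seed one Scan, giving exactly the one-scan-per-prefix-per-region layout the comment describes.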
          Bohdan Mushkevych added a comment -

          @Lars Hofhansl
          @Zhihong Yu
          @Stack

          Gentlemen
          It would be a great pity to miss this functionality...
          Could you either come up with a blocking requirement or give the "green light" for the patch?

          Ted Yu added a comment -

          There are a few suggestions from Stack still pending.

          @Stack:
          Can you take a look at Eran's comments from Apr 5th?

          Bohdan Mushkevych added a comment -

          @stack, @Ted Yu, @Lars Hofhansl
          Gentlemen, it seems that the last changes [1] were submitted 4 weeks ago.
          My personal fear is that the ticket will get "outdated" due to trunk changes and will miss the 0.94-0.96 target.

          [1] Diff r7
          https://reviews.apache.org/r/4411/diff/7/

          Eran Kutner added a comment -

          Just to give better reasoning why I feel it is unnatural: with my method, someone using this functionality for the first time would be able to figure it out just by looking at the class names and interface definitions (using IDE auto-completion, for example), while the only way to know that setting that attribute is required is to dig into the documentation.

          Eran Kutner added a comment -

          @stack: I believe the only open issue on the review board is your suggestion to replace my MultiTableInputCollection with a List<Scan>. Although I agree it would make the patch simpler and let it have one less class, I think it will make using it less natural. Developers will have to create a Scan, which is a common object, and then set a table attribute on it. This feels less natural to me than setting the table by adding to a collection the way I've done it, but I guess it's a matter of perspective.

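          For readers following this design debate: the two options differ only in where the table name lives. A compile error quoted later in the thread shows the real constant is Scan.SCAN_ATTRIBUTES_TABLE_NAME; the FakeScan class and the attribute-key string below are stand-ins to illustrate the List<Scan> style without HBase on the classpath, not the real HBase types:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ScanAttributeDemo {
    // Stand-in for org.apache.hadoop.hbase.client.Scan's attribute map (illustrative only).
    static class FakeScan {
        private final Map<String, byte[]> attrs = new HashMap<>();
        void setAttribute(String name, byte[] value) { attrs.put(name, value); }
        byte[] getAttribute(String name) { return attrs.get(name); }
    }

    // Hypothetical key string; the real constant lives on the Scan class.
    static final String SCAN_ATTRIBUTES_TABLE_NAME = "scan.attributes.table.name";

    // With the List<Scan> style each scan carries its own table name as an
    // attribute, so no dedicated MultiTableInputCollection class is needed.
    static List<String> tablesOf(List<FakeScan> scans) {
        List<String> tables = new ArrayList<>();
        for (FakeScan scan : scans) {
            tables.add(new String(scan.getAttribute(SCAN_ATTRIBUTES_TABLE_NAME)));
        }
        return tables;
    }

    public static void main(String[] args) {
        List<FakeScan> scans = new ArrayList<>();
        for (String table : new String[] { "tableA", "tableB" }) {
            FakeScan scan = new FakeScan();
            scan.setAttribute(SCAN_ATTRIBUTES_TABLE_NAME, table.getBytes());
            scans.add(scan);
        }
        System.out.println(tablesOf(scans)); // [tableA, tableB]
    }
}
```

          The input format side then recovers the table per scan, which is the discoverability trade-off being argued: the attribute must be documented, whereas a dedicated collection makes the table parameter explicit in the API.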
          Bohdan Mushkevych added a comment -

          @Ming Ma:
          The described functionality is essential for making JOINs, or for processing multiple regions from the same table.
          While trying to merge 2+ datasets together, you had better be aware of what structures you are processing.

          Ming Ma added a comment -

          I'd appreciate it if anyone could clarify the types of applications that could benefit from this.

          1. Does this work try to help with HBase map reduce job performance? If so, Eran, do you have any data for that? A couple of months ago I tried scanning multiple regions in one mapper task; that only helps if the mapper task takes less than a couple of minutes, so that map reduce task scheduling becomes the overhead.

          2. In the multi-table scenario, if we assume different tables have different schemas, does that mean the application's mapper implementation needs to take care of input from different tables?

          Ted Yu added a comment -

          I agree if HBase dependencies are shipped as part of the MR job jar, there is no need to worry about versioning of TableSplit.

          Todd Lipcon added a comment -

          hbase jar on job tracker is updated to include the versioning mechanism but the job client has pre-versioning hbase jar.

          The jar on the JT doesn't matter. Split computation and interpretation happens only in the user code – i.e., on the client machine and inside the tasks themselves. So you don't need HBase installed on the JT at all. As for the TTs, it's possible to configure the TTs to put an hbase jar on the classpath, but I usually recommend against it for the exact reason you're mentioning: if the jars differ in version and they're not 100% API compatible, you can get nasty errors. The recommended deployment is to not put hbase on the TT classpath, and instead ship the HBase dependencies as part of the MR job, using the provided utility function in TableMapReduceUtil.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520294/3996-v7.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1333//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1333//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1333//console

          This message is automatically generated.

          Ted Yu added a comment -

          w.r.t. Todd's question above, the versioning is trying to solve the following scenario:
          the hbase jar on the job tracker is updated to include the versioning mechanism, but the job client has a pre-versioning hbase jar.

          Ted Yu added a comment -

          Test failure for patch v6 was due to MAPREDUCE-3583:

          attempt_20120328173919786_0001_m_000100_1: 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
          java.lang.NumberFormatException: For input string: "18446743988103913343"
          	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
          	at java.lang.Long.parseLong(Long.java:422)
          	at java.lang.Long.parseLong(Long.java:468)
          	at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
          	at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
          	at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
          	at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
          	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
          	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
          
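          As a side note on the quoted trace: 18446743988103913343 exceeds Long.MAX_VALUE (9223372036854775807), so Long.parseLong has to throw. The value is a valid unsigned 64-bit number, which Long.parseUnsignedLong (available since Java 8) accepts. This sketch illustrates that failure mode only; it is not the actual MAPREDUCE-3583 fix:

```java
public class UnsignedParse {
    public static void main(String[] args) {
        String s = "18446743988103913343";   // the value from the quoted stack trace
        boolean signedParseFailed;
        try {
            Long.parseLong(s);               // throws: value > Long.MAX_VALUE
            signedParseFailed = false;
        } catch (NumberFormatException e) {
            signedParseFailed = true;        // the NumberFormatException in the trace
        }
        // Java 8+: parses the unsigned value into the negative signed range.
        long u = Long.parseUnsignedLong(s);
        System.out.println(signedParseFailed + " " + Long.toUnsignedString(u));
    }
}
```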
          Todd Lipcon added a comment -

          The other question is whether we need version compatibility at all for this enum. The split object is created when you submit the job, and then only used by that one job, right? I.e., it's never persisted or transferred over the wire to some other process, is it?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520290/3996-v6.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
          org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1332//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1332//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1332//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Rather than do manual versioning, why not switch this to a protobuf? Then you avoid the manual serialization and you don't have to worry about versioning.

          Ted Yu added a comment -

          Patch v7 introduces versioning for TableSplit, using the same tactic used for HLogKey.

          Since most of the enum Version code is copied, we may want to factor the base enum out into its own class. Would org.apache.hadoop.hbase.util be a good namespace for the enum class?

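          The HLogKey tactic referenced here works because the legacy wire format begins with a value that is never negative, so a negative marker can safely be claimed as a version code. A self-contained sketch of that pattern (the class, fields, and legacy layout below are illustrative, not the actual TableSplit code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class VersionedSplit {
    // Assumed legacy format began with a non-negative table-name length,
    // so a negative first value can mark "versioned format follows".
    static final int VERSION = -1;

    String tableName;
    String scanSpec; // new field, only present in the versioned format

    void write(DataOutput out) throws IOException {
        out.writeInt(VERSION);          // version marker
        out.writeUTF(tableName);
        out.writeUTF(scanSpec);
    }

    void readFields(DataInput in) throws IOException {
        int first = in.readInt();
        if (first < 0) {                // versioned format
            tableName = in.readUTF();
            scanSpec = in.readUTF();
        } else {                        // legacy format: 'first' was the name length
            byte[] name = new byte[first];
            in.readFully(name);
            tableName = new String(name, "UTF-8");
            scanSpec = "";
        }
    }

    public static void main(String[] args) throws IOException {
        VersionedSplit s = new VersionedSplit();
        s.tableName = "t1";
        s.scanSpec = "scan-base64";
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        s.write(new DataOutputStream(bos));
        VersionedSplit r = new VersionedSplit();
        r.readFields(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(r.tableName + " " + r.scanSpec); // t1 scan-base64
    }
}
```

          A reader built this way accepts both old and new byte streams, which is the backward-compatibility question raised for 0.94 M/R jobs.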
          Ted Yu added a comment -

          Currently the TableSplit class is marked Stable. With the addition of the Scan member, I plan to change the label to Evolving.

          Ted Yu added a comment -

          Patch v6 is the same as Eran's patch v5, formatted to be accepted by the review board.

          Eran Kutner added a comment -

          There is one pending change I know about, and that is making TableInputConf a static inner class. As for versioning, I'll look at it but can't say when.
          Other than that, I'm waiting to hear back from @Lars regarding my response to his suggestions on reusing TableInputFormatBase.

          Sorry for being slow to respond; I'm very busy with other things these days, so feel free to make any changes you feel are right.

          Ted Yu added a comment -

          @Eran:
          If there is nothing in principle (from the review comments) that you disagree with, Stack and I can help with refining the latest patch.

          For introducing a version, see HLogKey.readFields(), where there is ample documentation on how the strategy was developed.

          Ted Yu added a comment -

          Uploaded Eran's patch to https://reviews.apache.org/r/4411/

          @Stack:
          Can you take a second look to see if it is up to your expectations?

          Eran Kutner added a comment -

          Made some changes following @stack's review. Don't know how to submit for review again.

          Lars Hofhansl added a comment -

          Don't think this is going to be ready by the time I want to cut 0.94.
          Can revisit for 0.94.1.

          Ted Yu added a comment -

          @Stack:
          Thanks for the detailed review.

          @Eran:
          If you don't have time to respond, please let me know.

          Ted Yu added a comment -

          I ran TestMultiTableInputFormat on patch v4 on MacBook and it passed.

          Running org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
          Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 447.396 sec
          
          Results :
          
          Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
          
          [INFO] 
          [INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ hbase ---
          [INFO] Tests are skipped.
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD SUCCESS
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 7:38.220s
          

          Please provide comments about this feature here or on https://reviews.apache.org/r/4411/

          Lars Hofhansl added a comment -

          @Bohdan: Please see my comment on review board regarding closing the table when we close the split.

          Ted Yu added a comment -

          w.r.t. test failure:
          https://builds.apache.org/job/PreCommit-HBASE-Build/1230//testReport/org.apache.hadoop.hbase.mapreduce/TestMultiTableInputFormat/testScanEmptyToEmpty/

          You can find the following:

          2012-03-20 05:32:21,778 DEBUG [pool-1-thread-1] mapreduce.MultiTableInputFormatBase(143): getSplits: split -> 24 -> asf011.sp2.ygridcore.net:yyy,
          java.lang.NumberFormatException: For input string: "18446743988250694508"
          	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
          	at java.lang.Long.parseLong(Long.java:422)
          	at java.lang.Long.parseLong(Long.java:468)
          	at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
          	at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
          	at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
          	at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
          	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
          	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:396)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
          

          We really need next release of hadoop 1.0 where MAPREDUCE-3583 is fixed.
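For context on why that particular string fails: 18446743988250694508 is larger than Long.MAX_VALUE (9223372036854775807) because the /proc counter is an unsigned 64-bit value, so the signed Long.parseLong call in ProcfsBasedProcessTree overflows. A small stand-alone sketch of how such values can be read on Java 8+ (an illustration only, not the MAPREDUCE-3583 fix itself):

```java
// Unsigned 64-bit counters from /proc can exceed the signed long range,
// which is exactly what makes Long.parseLong throw NumberFormatException.
public class UnsignedCounter {
  public static long parse(String s) {
    return Long.parseUnsignedLong(s);  // Java 8+: accepts 0 .. 2^64-1
  }

  public static String render(long v) {
    return Long.toUnsignedString(v);   // round-trips back to the decimal form
  }
}
```

Values above Long.MAX_VALUE come back as negative signed longs, but render() recovers the original decimal text.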

          w.r.t. Bohdan's question, there are two outstanding review comments on https://reviews.apache.org/r/4411/ where a response from Eran would help clarify.

          Bohdan Mushkevych added a comment -

          Gentlemen, let me ask for clarification.
          Does the current status of this ticket imply that it still requires action, or is it ready to be merged into TRUNK?

          Eran Kutner added a comment -

          Sorry for missing all the action, I was offline for a couple of days.
          Thanks Ted and everyone else for pushing this forward.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12519023/3996-v4.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 166 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1230//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1230//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1230//console

          This message is automatically generated.

          Ted Yu added a comment -

          Latest patch from review board.

          Ted Yu added a comment -

          Eran might be busy.

          I created https://reviews.apache.org/r/4411/ for people to review.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12518989/3996-v3.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 166 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1227//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1227//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1227//console

          This message is automatically generated.

          Ted Yu added a comment -

          Patch v3 compiles

          I reformatted some of the new code.

          Ted Yu added a comment -

          Adding 0.94 according to Lars' feedback.

          Lars Hofhansl added a comment -

          I am not opposed to having this in 0.94. Seems quite useful and performance related

          Ted Yu added a comment -

          I think it would be better for Eran to create the review request.

          @Eran:
          For the new public classes such as MultiTableInputFormatBase, please add @InterfaceAudience annotation.

          For TestMultiTableInputFormat, please label it:

          @Category(LargeTests.class)
          

          Thanks

          Lars Hofhansl added a comment -

          Even v2 has a bunch of formatting/whitespace changes. While those are good changes, they kinda obscure the interesting changes.
          Any chance to put this up on RB (in that case with the whitespace fixes)?

          Ted Yu added a comment -

          Patch v2 is smaller than Eran's patch because I didn't apply trivial formatting changes.

          Ted Yu added a comment -

          There is one compilation error:

          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project hbase: Compilation failure
          [ERROR] /Users/zhihyu/trunk-hbase/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java:[85,7] cannot find symbol
          [ERROR] symbol  : method init()
          [ERROR] location: class org.apache.hadoop.hbase.mapreduce.TableRecordReader
          

          If I change the init() call to the following:

          trr.initialize(split, context);
          

          compiler complains that InterruptedException isn't handled.
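One conventional way to satisfy the compiler here, sketched with a hypothetical Initializer stand-in for TableRecordReader.initialize(split, context) (this is not the committed fix, just the usual pattern), is to restore the interrupt flag and rethrow as an InterruptedIOException:

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class InitHelper {
  /** Hypothetical stand-in for TableRecordReader.initialize(split, context). */
  interface Initializer {
    void initialize() throws IOException, InterruptedException;
  }

  // Catch the checked InterruptedException without swallowing the interrupt:
  // restore the thread's interrupt status, then surface it as an IOException
  // subtype so callers that only declare IOException still compile.
  public static void initOrFail(Initializer init) throws IOException {
    try {
      init.initialize();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve interrupt status
      InterruptedIOException iioe =
          new InterruptedIOException("Interrupted during initialize");
      iioe.initCause(e);
      throw iioe;
    }
  }
}
```

Swallowing the exception outright would hide cancellation; rethrowing as InterruptedIOException keeps the signal while fitting the IOException-only signature.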

          @Eran:
          Please address the above based on modified patch and make sure TestMultiTableInputFormat passes.

          Ted Yu added a comment -

          @Eran:
          There was unnecessary formatting in your patch:

          +     if (jars.isEmpty())
          +       return;
          

          The convention is to put 'return' on the same line or use curly braces otherwise.

          Ted Yu added a comment -

          I am manually resolving the conflicts since the patch is one month old.

          @Eran:
          HbaseObjectWritable supports Scan. Is there a special reason why the String form of Scan is stored in TableSplit.java?
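For what it's worth, one reason a String form can be convenient is that Base64-encoded bytes drop into a Writable string field or a job Configuration value without escaping issues; HBase's TableMapReduceUtil takes a similar approach when handing a Scan to tasks. A stand-alone sketch (the helper names here are illustrative, not the actual HBase API):

```java
import java.util.Arrays;
import java.util.Base64;

// Illustrative codec: arbitrary serialized bytes <-> a plain-text String
// that is safe to store in string-typed fields and configuration values.
public class ScanCodec {
  /** Encode the serialized form of a Scan as a storable String. */
  public static String encode(byte[] serializedScan) {
    return Base64.getEncoder().encodeToString(serializedScan);
  }

  /** Recover the serialized bytes from the stored String. */
  public static byte[] decode(String stored) {
    return Base64.getDecoder().decode(stored);
  }

  /** True if encode/decode reproduce the input bytes exactly. */
  public static boolean roundTrips(byte[] bytes) {
    return Arrays.equals(decode(encode(bytes)), bytes);
  }
}
```

The trade-off is size (Base64 inflates bytes by ~33%) versus the convenience of treating the payload as opaque text everywhere it is stored.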

          Bohdan Mushkevych added a comment -

          It is so unfortunate that such useful functionality is gathering dust on shelves.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514793/HBase-3996.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1222//console

          This message is automatically generated.

          Bohdan Mushkevych added a comment -

          Eran - thank you for updated patch.
          Zhihong Yu - have you had a chance to apply it on TRUNK?

          Eran Kutner added a comment -

          I now remember this was a patch file I tried to manipulate manually to remove some extra stuff that was included and that Stack didn't like.
          I regenerated the patch file from TRUNK, but it still has some unnecessary stuff in it.

          Eran Kutner added a comment -

          It was merging fine when I posted it about 7 months ago. I assume a lot has changed in TRUNK since.
          I'll take a look at it but can't promise an ETA.

          Ted Yu added a comment -

          I couldn't apply the patch against TRUNK:

          zhihyu$ p0 MultiTableInputFormat.patch 
          patching file src/main/java/org/apache/hadoop/hbase/mapreduce/TableSplit.java
          patch: **** malformed patch at line 217: @@ -179,18 +219,21 @@
          

          @Eran:
          Can you combine the patch and test?

          Thanks

          Bohdan Mushkevych added a comment -

          Proposed functionality will be very useful.
          Particularly for HBase table JOIN operations.

          When can we expect this patch to be in any major release?
          Has it been incorporated into TRUNK?

          Eran Kutner added a comment -

          Cleaned up the patch as much as I can, hopefully I didn't mess it up.

          stack added a comment -

          Check out the patch yourself Eran. The bulk is reformatting like the following:

          -   * @param out  The output to write to.
          -   * @throws IOException When writing the values to the output fails.
          +   * 
          +   * @param out
          +   *          The output to write to.
          +   * @throws IOException
          +   *           When writing the values to the output fails.
          

          or

          -    return regionLocation + ":" +
          -      Bytes.toStringBinary(startRow) + "," + Bytes.toStringBinary(endRow);
          +    return regionLocation + ":" + Bytes.toStringBinary(startRow) + ","
          +        + Bytes.toStringBinary(endRow);
          

          Over in HBASE-3678 is a formatter for eclipse. Would that help?

          Eran Kutner added a comment -

          Thanks stack.

          I hope I finally got Eclipse to properly manage the tabs and line lengths (I'm not really a Java developer so this is all new to me).

          In TableSplit you create an HTable instance. Do you need to? And when you create it, though I believe it will be less of a problem going forward, can you use the constructor that takes a Configuration and table name? Is there a close in Split interface? If so, you might want to call close of your HTable in there. (Where is it used? Each split needs its own HTable?) Use the constructor that takes a Configuration here too...

          There are actually two issues here. I added the configuration and closed the table in getSplits(); that's the easy one.
          An HTable per split is needed because it is used for reading the data from the split by the cluster nodes when the job is running. However, in order to support passing the configuration, I moved the HTable creation out of TableSplit and into MultiTableInputFormatBase. I also modified TableRecordReaderImpl to close the table after reading all the records in the split. I believe this is OK, and the tests are passing fine, but it wasn't like that in the existing, single-table implementation, so I hope I'm not missing (and messing) anything.
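The close-after-last-record pattern described above can be sketched generically, without HBase types (the names here are hypothetical; in the real code it is the per-split HTable being closed inside TableRecordReaderImpl):

```java
import java.util.Iterator;

// Illustrative reader that releases its underlying resource as soon as the
// last record has been handed out, rather than waiting for an explicit close.
public class SelfClosingReader<T> {
  private final Iterator<T> records;
  private final Runnable closeAction;  // e.g. table::close in the real code
  private boolean closed = false;

  public SelfClosingReader(Iterator<T> records, Runnable closeAction) {
    this.records = records;
    this.closeAction = closeAction;
  }

  /** Returns the next record, or null once exhausted; closes eagerly. */
  public T nextRecord() {
    if (!closed && records.hasNext()) {
      T record = records.next();
      if (!records.hasNext()) {
        close();  // last record: don't hold the resource open any longer
      }
      return record;
    }
    close();
    return null;
  }

  private void close() {
    if (!closed) {
      closed = true;
      closeAction.run();
    }
  }
}
```

Closing on the last record matters in a multi-table job because each split opens its own table handle, and leaving all of them open until job teardown would leak connections.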

          You don't need the e.printStackTrace in below

          Right, removed and fixed the spelling in the warning.

          By any chance is the code here in MultiTableInputFormatBase where we are checking start and end rows copied from elsewhere?

          It's copied from TableInputFormatBase, as I said my code is closely based on the single table code.

          You remove the hashCode in TableSplit. Should it have one?

          I actually don't know if it needs one or not (it does seem to work fine without it) but I didn't remove it intentionally. I wrote my original code based on the 0.90.3 branch and when I copied to trunk I missed this change. It's back now.

          Otherwise patch looks great. Test too.

          Thanks!

          Hope that's it.

          stack added a comment - - edited

          FYI, the patch has a bunch of tabs in it instead of two spaces per tab, and some lines > 80 chars, but no biggie – I can fix that on commit. Here are a few comments.

          In TableSplit you create an HTable instance. Do you need to? And when you create it, though I believe it will be less of a problem going forward, can you use the constructor that takes a Configuration and table name? Is there a close in Split interface? If so, you might want to call close of your HTable in there. (Where is it used? Each split needs its own HTable?) Use the constructor that takes a Configuration here too...

           +    HTable table = new HTable(tic.getTableName());$
          

          You don't need the e.printStackTrace in below

          +    Log.warn("Failed to convert Scan to Strting", e);$
          +    e.printStackTrace();$
          

          Nice javadoc.

          By any chance is the code here in MultiTableInputFormatBase where we are checking start and end rows copied from elsewhere?

          Otherwise patch looks great. Test too.

          The line above it will output the stack trace (spelling too!).

          You remove the hashCode in TableSplit. Should it have one?

          Eran Kutner added a comment -

          Should be better now.
          Cleaned up the javadocs and added a unit test based on the original TableInputFormat test.
          Let me know if there is anything I missed.

          stack added a comment -

          It's a new feature so it'll go into TRUNK, not onto 0.90. That's how we generally do it. I'd guess the branch patch will probably apply cleanly to trunk since it's all new files anyway.

          Eran Kutner added a comment -

          Thanks for the feedback @stack !

          Will take some time to get to it but I definitely will.
          As for a patch, I actually used the 0.90.3 sources, not through svn, but I'll try to make a patch out of it. Is it OK to create the patch against a branch or do you want it for trunk?

          stack added a comment -

          @Eran Looks great.

          Some comments.

          Notice how other hbase classes use two spaces for tab and line lengths are 80 chars or less (generally).

          Check your javadoc. Some of it suffers from copy-and-paste-isms (I think).

          Can you add the above usage into the class comment or, if that doesn't make sense, into the package javadoc. It would be a tragedy if this sweet new functionality went unused just because it could not be found.

          Any chance of a basic unit test? I know spinning up an MR cluster, an HBase cluster, and an HDFS all in the one JVM is profane, but it does help features persevere through changes that don't seem related; because the unit test fails, the connection shows at unit test time rather than at deploy.

          (You know how to make a patch? If you svn add the new classes or git add and git commit, then you can make a patch file with new classes).

          Thanks Eran.
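          For the record, a basic test of the shape suggested above might be structured like this (pseudocode only — the mini-cluster utility is the one HBase tests typically use, and the row counts and helper steps are illustrative, not part of the actual patch):

          {code}
          utility = new HBaseTestingUtility()   // spins up HDFS + HBase + MR in one JVM
          utility.startMiniCluster()
          create table1 and table2; load N1 and N2 rows into the scanned ranges
          build a MultiTableInputCollection with one Scan per table
          run a counting mapper job via TableMapReduceUtil.initTableMapperJob(...)
          assert the job succeeded
          assert mapped row count == N1 + N2
          utility.shutdownMiniCluster()
          {code}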

          Eran Kutner added a comment -

          I've added three new classes:
          MultiTableInputCollection is a collection of table+scanner pairs to be used as input to the mapper.
          MultiTableInputFormatBase and MultiTableInputFormat are closely based on the "non-multi" versions, with the required adaptations.
          I've also updated TableMapReduceUtil and TableSplit to support these new classes.

          Usage example:
          {code}
          Scan scan1 = new Scan();
          scan1.setStartRow(start1);
          scan1.setStopRow(end1);

          Scan scan2 = new Scan();
          scan2.setStartRow(start2);
          scan2.setStopRow(end2);

          MultiTableInputCollection mtic = new MultiTableInputCollection();
          mtic.Add(tableName1, scan1);
          mtic.Add(tableName2, scan2);

          TableMapReduceUtil.initTableMapperJob(mtic, TestTableMapper.class, Text.class, IntWritable.class, job1);
          {code}
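          Conceptually, the collection above is just an ordered list of (table name, scan) pairs. A self-contained sketch of that idea in plain Java (illustrative only — ScanSpec is a stand-in for org.apache.hadoop.hbase.client.Scan so the example runs on its own; the real class is the one in the patch):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: models MultiTableInputCollection as an ordered
// list of (tableName, scan) pairs. ScanSpec stands in for the real
// org.apache.hadoop.hbase.client.Scan.
public class MultiTableInputCollectionSketch {

  static final class ScanSpec {
    final String startRow;
    final String stopRow;
    ScanSpec(String startRow, String stopRow) {
      this.startRow = startRow;
      this.stopRow = stopRow;
    }
  }

  static final class Entry {
    final String tableName;
    final ScanSpec scan;
    Entry(String tableName, ScanSpec scan) {
      this.tableName = tableName;
      this.scan = scan;
    }
  }

  private final List<Entry> entries = new ArrayList<Entry>();

  // Mirrors the Add(tableName, scan) call in the usage example above.
  public void add(String tableName, ScanSpec scan) {
    entries.add(new Entry(tableName, scan));
  }

  public int size() {
    return entries.size();
  }

  public Entry get(int i) {
    return entries.get(i);
  }

  public static void main(String[] args) {
    MultiTableInputCollectionSketch mtic = new MultiTableInputCollectionSketch();
    mtic.add("table1", new ScanSpec("start1", "end1"));
    mtic.add("table2", new ScanSpec("start2", "end2"));
    System.out.println(mtic.size());              // prints 2
    System.out.println(mtic.get(0).tableName);    // prints table1
    System.out.println(mtic.get(1).scan.stopRow); // prints end2
  }
}
```

          Each entry would later become one or more map/reduce input splits, one per region intersecting the scan range.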


            People

            • Assignee: Bryan Baugher
            • Reporter: Eran Kutner
            • Votes: 10
            • Watchers: 16