HBASE-5564: Bulkload is discarding duplicate records

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.95.0
    • Component/s: mapreduce
    • Labels:
    • Environment:

      HBase 0.92

    • Hadoop Flags:
      Reviewed
    • Release Note:
      1) Provision for using the existing timestamp (HBASE_TS_KEY)
      2) Bug fix to use the same timestamp across mappers.
    • Tags:
      bulkload, mapreduce, importtsv

      Description

      Duplicate records are getting discarded when they exist in the same input file, and more specifically when they exist in the same split.
      Duplicate records are retained only if they come from different splits.

      Version under test: HBase 0.92

      Attachments

      1. HBASE-5564_1.patch
        16 kB
        ramkrishna.s.vasudevan
      2. HBASE-5564.patch
        16 kB
        ramkrishna.s.vasudevan
      3. 5564v5.txt
        19 kB
        stack
      4. HBASE-5564_trunk.4_final.patch
        18 kB
        Laxman
      5. HBASE-5564_trunk.3.patch
        14 kB
        Laxman
      6. HBASE-5564_trunk.2.patch
        13 kB
        Laxman
      7. 5564.lint
        10 kB
        Ted Yu
      8. HBASE-5564_trunk.1.patch
        14 kB
        Laxman
      9. HBASE-5564_trunk.1.patch
        14 kB
        Laxman
      10. HBASE-5564_trunk.patch
        14 kB
        Laxman

        Activity

        Laxman added a comment - edited

        I think this is a bug and it's not intentional behavior.

        The usage of TreeSet in the code snippet below is causing the issue.

        PutSortReducer.reduce()
        ======================

              TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
              long curSize = 0;
              // stop at the end or the RAM threshold
              while (iter.hasNext() && curSize < threshold) {
                Put p = iter.next();
                for (List<KeyValue> kvs : p.getFamilyMap().values()) {
                  for (KeyValue kv : kvs) {
                    map.add(kv);
                    curSize += kv.getLength();
                  }
                }
        

        Changing this back to List and then sort explicitly will solve the issue.
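
        A minimal sketch of that change (same reduce() context assumed; an illustration only, not a tested patch):

            List<KeyValue> kvList = new ArrayList<KeyValue>();
            long curSize = 0;
            // stop at the end or the RAM threshold
            while (iter.hasNext() && curSize < threshold) {
              Put p = iter.next();
              for (List<KeyValue> kvs : p.getFamilyMap().values()) {
                for (KeyValue kv : kvs) {
                  kvList.add(kv); // a List keeps comparator-equal duplicates
                  curSize += kv.getLength();
                }
              }
            }
            // sort explicitly instead of relying on TreeSet semantics
            Collections.sort(kvList, KeyValue.COMPARATOR);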

        Laxman added a comment -

        I tested again with the proposed patch.
        > > Changing this back to List and then sort explicitly will solve the issue.

        The same problem still persists, making this issue a bit more complicated.
        I think the use of the same timestamp for all records in a split is causing the issue.

        Currently in code,
        a) If configured, we are using static timestamp for all mappers.
        b) If not configured, we are using current system time generated for each split.

        TsvImporterMapper.doSetup
        ====================

        ts = conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, System.currentTimeMillis());
        

        Should we think of an approach to generate a unique sequence number and use it as a timestamp?

        Any other thoughts?

        Jesse Yates added a comment -

        Hmm, I think you're right that this is a problem. It would be totally reasonable to change

               KeyValue kv = new KeyValue(
                    lineBytes, parsed.getRowKeyOffset(), parsed.getRowKeyLength(),
                    parser.getFamily(i), 0, parser.getFamily(i).length,
                    parser.getQualifier(i), 0, parser.getQualifier(i).length,
                    ts,
                    KeyValue.Type.Put,
                    lineBytes, parsed.getColumnOffset(i), parsed.getColumnLength(i));
        

        to use something like:

        ts++
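
        In context, that would look something like this (a sketch over the snippet above; everything else unchanged):

            KeyValue kv = new KeyValue(
                lineBytes, parsed.getRowKeyOffset(), parsed.getRowKeyLength(),
                parser.getFamily(i), 0, parser.getFamily(i).length,
                parser.getQualifier(i), 0, parser.getQualifier(i).length,
                ts++, // bump per KeyValue so identical entries get distinct versions
                KeyValue.Type.Put,
                lineBytes, parsed.getColumnOffset(i), parsed.getColumnLength(i));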

        The question is, if you have a TSV file with the same row key, which value should be considered the most recent version? Should any of them be? Maybe that is actually a problem and we want to have a warning/error when it occurs?

        stack added a comment -

        The TreeSet is what's going to be used once the edits make it into the server, so losing them in the reducer is probably optimal? The Jesse ts++, or ts--, could be an option?

        Todd Lipcon added a comment -

        I think it's a feature, not a bug, that the timestamps are all identical. The whole point is that, in a bulk-load-only workflow, you can identify each bulk load exactly, and correlate it to the MR job that inserted it. If you want to use custom timestamps, you should specify a timestamp column in your data, or write your own MR job (ImportTsv is just an example which is useful for some cases, but for anything advanced I would expect users to write their own code).

        Lars Hofhansl added a comment -

        So this is only about ImportTsv? Should change the title in that case.

        I agree with Todd, at least for ImportTsv.
        Import/Export should not (and hopefully do not) exhibit this behavior (since we want to be able to import/export KVs with multiple versions).

        Laxman added a comment -

        ts++, or ts--, could be an option?

        ts++ or ts-- will not solve this problem: each mapper spawns a new JVM, so ts will be reset to its initial value and there is still a chance of a ts collision.

        that the timestamps are all identical. The whole point is that, in a bulk-load-only workflow, you can identify each bulk load exactly, and correlate it to the MR job that inserted it.

        No, Todd. At the least, the implementation is buggy and does not match this expected behavior.
        A new timestamp is generated for each map task (i.e., for each split) in TsvImporterMapper.doSetup.
        Please check my previous comments.

        So this is only about ImportTsv? Should change the title in that case.

        I'm not aware of what other tools come under bulkload. The bulkload documentation talks only about importtsv.
        http://hbase.apache.org/bulk-loads.html

        But if you feel we should change the title, feel free to modify the title.

        If you want to use custom timestamps, you should specify a timestamp column in your data, or write your own MR job (ImportTsv is just an example which is useful for some cases, but for anything advanced I would expect users to write their own code).

        I think we can provide a provision to specify the timestamp column (like the ROWKEY column) in the arguments.
        Example : importtsv.columns='HBASE_ROW_KEY, HBASE_TS_KEY, emp:name,emp:sal,dept:code'
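
        A rough sketch of how the mapper could pick up such a column (a hypothetical helper for illustration; the names and placement are assumptions, not the actual patch):

            // Parse the HBASE_TS_KEY column from the raw line bytes; fall back to
            // the job-wide timestamp if the value is missing or non-numeric
            // (or count it as a bad line, per the skip.bad.lines setting).
            private long parseTimestamp(byte[] lineBytes, int offset, int length,
                long fallbackTs) {
              try {
                return Long.parseLong(new String(lineBytes, offset, length,
                    java.nio.charset.Charset.forName("UTF-8")));
              } catch (NumberFormatException e) {
                return fallbackTs;
              }
            }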

        This makes importtsv more usable. Otherwise, the user has to copy-paste the entire importtsv code just to make this minor modification.

        Please let me know your suggestions on this.

        Ted Yu added a comment -

        I think we can provide the provision to specify the timestamp column (Like ROWKEY column) as arguments.

        The above is reasonable.

        Lars Hofhansl added a comment -

        @Laxman: so what you have in your CSV file is entries like:
        rowA, colA, val1
        rowA, colA, val2

        And the expectation is that HBase should create two versions:
        (rowA, colA, ts1) -> val1
        (rowA, colA, ts2) -> val2
        ?

        Seems like a pretty contrived case to me.
        How would you know ahead of time how many versions you'd need to configure for your column family? 3 is the default, but what if you have 100 versions of the same row/col combo in your CSV file?

        But anyway, having an option to specify a column for the TS is a good idea.
        Do you want to take a stab at it Laxman?

        Laxman added a comment -

        Scope of this issue.

        1) Avoid the behavioral inconsistency with timestamp parameter.

        Currently in code,
        a) If the timestamp parameter is configured, duplicate records will be overwritten.
        b) If not configured, some duplicate records are maintained as different versions.
        

        This fix should be in line with the expectation Todd has mentioned.

        The whole point is that, in a bulk-load-only workflow, you can identify each bulk load exactly, and correlate it to the MR job that inserted it.

        2) Provide an option to look up timestamp column value from input data. (Like ROWKEY column)
        Example : importtsv.columns='HBASE_ROW_KEY, HBASE_TS_KEY, emp:name,emp:sal,dept:code'

        I will submit the patch with the above mentioned approach.

        Any other addons?

        Laxman added a comment -

        While testing the patch locally, I'm getting the following error on trunk.
        Any hints on this please?

        java.lang.RuntimeException: java.io.IOException: Call to localhost/127.0.0.1:0 failed on local exception: java.net.BindException: Cannot assign requested address: no further information
        	at org.apache.hadoop.mapred.MiniMRCluster.waitUntilIdle(MiniMRCluster.java:323)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:524)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:462)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:454)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:446)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:436)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:426)
        	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:417)
        	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniMapReduceCluster(HBaseTestingUtility.java:1269)
        	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniMapReduceCluster(HBaseTestingUtility.java:1255)
        	at org.apache.hadoop.hbase.mapreduce.TestImportTsv.doMROnTableTest(TestImportTsv.java:189)
        	at org.apache.hadoop.hbase.mapreduce.TestImportTsv.testMROnTable(TestImportTsv.java:162)
        
        stack added a comment -

        Googling it, it's either that something is already listening on the port, or your 127.0.0.1 has been removed? See http://www-01.ibm.com/support/docview.wss?uid=swg21233733

        Laxman added a comment -

        Thanks, Stack. Let me give it a try.

        Laxman added a comment -

        Initial patch on trunk for review.

        Ted Yu added a comment -
        +    public int getTimestapKeyColumnIndex() {
        

        Please fix typo in the above method name.

        +      "  -D" + TIMESTAMP_CONF_KEY + "=currentTimeAsLong - use the specified timestamp for the import. This option is ignored if HBASE_TS_KEY is specfied in 'importtsv.columns'\n" +
        

        Please wrap the long line above.

        +    // Should never get 0.
        +    ts = conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0);
        

        Please explain why 0 wouldn't be returned.

        +      if (parser.getTimestapKeyColumnIndex() != -1)
        +        ts = parsed.getTimestamp();
        

        Please use curly braces around the assignment.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12519017/HBASE-5564_trunk.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 165 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1229//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1229//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1229//console

        This message is automatically generated.

        Anoop Sam John added a comment -

        @Laxman
        ImportTsv

        +    // If timestamp option is not specified, use current system time.
        +    long timstamp = conf.getLong(TIMESTAMP_CONF_KEY, System.currentTimeMillis());
        +
        +    // Set it back to replace invalid timestamp (non-numeric) with current system time
        +    conf.setLong(TIMESTAMP_CONF_KEY, timstamp);
        

        Doing this will use the same TS across all the mappers. Is this the intention of this change? So in TsvImporterMapper, conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0) will always have a value to get from conf.

        Laxman added a comment -

        Ted, thanks for your review. Attached the patch after fixing the review comments.

        Laxman added a comment -

        Doing this will use the same TS across all the mappers. Is this the intention of this change? So in TsvImporterMapper, conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0) will always have a value to get from conf.

        Yes, Anoop. We should have the same timestamp for all mappers.
        Please check my previous comments on the scope of the issue.

        https://issues.apache.org/jira/browse/HBASE-5564?focusedCommentId=13228297&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13228297

        Laxman added a comment -

        The QA bot didn't pick up the previous patch, so resubmitting...

        Laxman added a comment -

        Any idea why the QA bot is not testing this patch?
        Can someone trigger this explicitly?

        Anoop Sam John added a comment -

        Comment from Jesse Yates

        The question is, if you have a TSV file with the same row key, which value should be considered the most recent version? Should any of them be? Maybe that is actually a problem and we want to have a warning/error when it occurs?

        Do we need to handle this? The issue is the TreeSet used by PutSortReducer and KeyValueSortReducer, as mentioned by Laxman.
        In normal data insertion using Puts, all the duplicate values will go into the memstore (and finally to HFiles), and on scan the last entered one will be retrieved. In this bulk load case only the first value will get inserted, as the data structure avoids duplicates. Is this a behaviour mismatch? But this depends on which entry in the TSV file should be considered the most recent version. If we say that the last entry in the file is the most recent version...
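
        A tiny standalone illustration of the TreeSet behaviour in question (plain Java, not HBase code): when two elements are comparator-equal, add() returns false and the first element is kept.

            import java.util.TreeSet;

            public class DupDemo {
              public static void main(String[] args) {
                // Comparator-equal duplicates: TreeSet keeps the first, drops the rest.
                TreeSet<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
                System.out.println(set.add("val1")); // true  - inserted
                System.out.println(set.add("VAL1")); // false - equal per comparator, dropped
                System.out.println(set);             // [val1]
              }
            }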

        Ted Yu added a comment -

        @Laxman:
        Please take a look at https://builds.apache.org/job/PreCommit-HBASE-Build/1229/console and see which test timed out.

        I have sent an email to builds@apache.org, informing them of the issue for Hadoop QA.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12519054/HBASE-5564_trunk.1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 165 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1231//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1231//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1231//console

        This message is automatically generated.

        Laxman added a comment -

        All MR tests seem to be failing. The failures are not because of the patch.
        I will check these failures.

        @anoop
        In bulkload, if multiple records have the same timestamp, then only the last KV entry processed by the reducer will be persisted (TreeSet in the reducer). I don't see this as a behavior inconsistency. Bulkload can't judge which KV entry should be retained (considering that duplicate records may exist across input splits/files). So, in this case, the user can develop a custom MR job to achieve this functionality.

        Laxman added a comment -

        These tests are passing in my dev environment.

        Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
        Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 168.578 sec
        
        Results :
        
        Tests run: 9, Failures: 0, Errors: 0, Skipped: 0
        
        [INFO]
        [INFO] --- maven-surefire-plugin:2.12-TRUNK-HBASE-2:test (secondPartTestsExecution) @ hbase ---
        [INFO] Tests are skipped.
        [INFO] ------------------------------------------------------------------------
        [INFO] BUILD SUCCESS
        [INFO] ------------------------------------------------------------------------
        

        Also, I can see these MR tests failing in previous builds as well (HBASE-5529).

        Will check more.

        ramkrishna.s.vasudevan added a comment -

        The test cases that fail are common in HadoopQA. As your patch is changing the ImportTsv part, people will be worried.
        But as you have run them locally and ensured that they pass, the main build should be able to pass.

        Laxman added a comment -

        Thanks for the info, Ram.

        I spent some time analyzing these failures but couldn't find a clue.
        Filed a separate JIRA, HBASE-5608, to fix these test failures.

        As mentioned earlier, all these tests are passing in my local environment.

        Should we wait for HBASE-5608 or proceed with review & commit?

        Ted Yu added a comment -

        @Laxman:
        5564.lint contains the warnings 'arc lint' found w.r.t. your patch.

        Laxman added a comment -

        Ted, all these comments are related to line wrapping.
        IMO, an 80-character line length is too low, and it makes the code a bit ugly.

        If you strongly feel we need to stick to the 80-character limit, I will fix these comments.

        Ted Yu added a comment -

        We have been using 80 characters as line length for a while.

        At Google, line length is enforced, though the limit is a bit longer.

        Feel free to start discussion on dev@hbase about the acceptable limit.

        Laxman added a comment -

        Thanks, Ted, for taking the pains to get the lint comments.
        As you suggested, I will start a discussion on dev@hbase.

        I just wanted to quote one example from this patch here.

            long timstamp = conf.getLong(TIMESTAMP_CONF_KEY, System.currentTimeMillis());
        

        After formatting, the above snippet turned into:

            long timstamp = conf
                .getLong(TIMESTAMP_CONF_KEY, System.currentTimeMillis());
        
        Anoop Sam John added a comment -

        In bulkload, if multiple records have the same timestamp, then only the last KV entry processed by the reducer will be persisted (TreeSet in the reducer)

        The 1st KV processed by the Reducer, right...

        Yes, I agree with you: which one is the latest might not be predictable on the reducer side...

        stack added a comment -

        Patch seems reasonable.

        Add curlies here:

        +      if (parser.getTimestampKeyColumnIndex() != -1)
        +        ts = parsed.getTimestamp();
        

        Convention is you can do w/o curlies if all in one line (as you do later in this file) but if not on one line, need curlies.

        Can you confirm that current behavior – setting ts to System.currentTimeMillis – is default? It seems to be ... we set System.currentTimeMillis as time to use setting up the job.

        A define for NO_TIMESTAMP_KEYCOLUMN_INDEX instead of using -1 directly might help for timestampKeyColumnIndex == -1? Or put this test into a method whose name makes it obvious what the test is about ... e.g. hasTimeStampColumn()....
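
        For instance (a sketch using the names above; a timestampKeyColumnIndex field is assumed):

            public static final int NO_TIMESTAMP_KEYCOLUMN_INDEX = -1;

            public boolean hasTimestampKeyColumn() {
              return timestampKeyColumnIndex != NO_TIMESTAMP_KEYCOLUMN_INDEX;
            }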

        Patch adds nice usage commentary explaining new facility.

        Looks good.

        Laxman added a comment -

        @Anoop, thanks for clarification.

        @Stack, thanks for the review. I will update the patch.

        need curlies

        NO_TIMESTAMP_KEYCOLUMN_INDEX

        I will update the patch for above 2 comments.

        Can you confirm that current behavior – setting ts to System.currentTimeMillis – is default? It seems to be ... we set System.currentTimeMillis as time to use setting up the job.

        Before the patch, we set ts to System.currentTimeMillis in TsvImporterMapper.doSetup. This setup method is called for each mapper, i.e., for each input split. That means it uses a new timestamp for each map task.

        After the patch, we set ts from conf.getLong, which is the same in all map tasks.

        Hope, I understood your question correctly.

        Laxman added a comment -

        @Stack, updated the patch after fixing your comments. Thanks for the review.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12519960/HBASE-5564_trunk.2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1305//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1305//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1305//console

        This message is automatically generated.

        Laxman added a comment -

        The Findbugs warnings reported by the QA bot are about usage of the default encoding. This behavior is in line with the existing code.

        bug #1

        TEST 	Unknown bug pattern DM_DEFAULT_ENCODING in org.apache.hadoop.hbase.mapreduce.ImportTsv$TsvParser$ParsedLine.getTimestamp()
        

        bug #2

        TEST 	Unknown bug pattern DM_DEFAULT_ENCODING in org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(Configuration, String[])
        

        bug #2 already exists in the code; it was just included in the patch file with no changes.

        And the test case failures are not because of this patch; they will be addressed as part of HBASE-5608.

        Laxman added a comment -

        Final patch for commit to trunk.
        Changes from the previous patch:
        1) Minor improvements to getTimestamp (readability).
        2) Findbugs - default encoding - corrected using the Base64 utility.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520096/HBASE-5564_trunk.3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1319//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1319//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1319//console

        This message is automatically generated.

        stack added a comment -

        Patch looks good. Is this right:

        +        return Long.parseLong(Base64.encodeBytes(lineBytes,
        +            getColumnOffset(timestampKeyColumnIndex), getColumnLength(timestampKeyColumnIndex)));
        

        As I read it, this encodes the passed bytes into a base64 String and then tries to parse it as a long (it doesn't look like parseLong can interpret base64'd longs)? Am I reading it wrong?

        I was going to mark this an incompatible change but thinking on it, setting timestamp for the MR job once rather than per mapper seems like a bug fix.

        Please write a bit of a release note at least explaining the changed behavior.

        If the above is right and I'm just reading it wrong, will commit. Let me know. Thanks Laxman.

        Laxman added a comment -

        It's my mistake, stack. While fixing the findbug, I overlooked the Base64 behavior. I was expecting UTF-8 encoding from this utility. I will fix this.
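
        The intended parse was presumably along these lines (a sketch reusing the method names from the quoted snippet; it assumes org.apache.hadoop.hbase.util.Bytes, which the later v5 patch reportedly switches to):

            // Decode the column bytes as a String, then parse the long
            // (rather than Base64-encoding them, which parseLong cannot read).
            return Long.parseLong(Bytes.toString(lineBytes,
                getColumnOffset(timestampKeyColumnIndex),
                getColumnLength(timestampKeyColumnIndex)));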

        Will also add some unit tests for parsing the timestamps properly.
        Thanks stack for pointing out the problem.

        Laxman added a comment -

        Another problem found in my testing: an invalid timestamp does not respect the skip.bad.lines configuration.
        I will update the patch for this as well. Adding some unit tests too.

        Laxman added a comment -

        Attached the final patch for review and commit.

        Changes from the previous patch:
        1) Encoding issue fixed.
        2) Proper handling of bad records (with invalid timestamps).
        3) New unit tests for the parser (with valid & invalid timestamps).

        Note: QA may report 2 new findbugs warnings. As explained earlier, these findings are due to the usage of the default encoding (String.getBytes, new String), which is in line with the existing behavior.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520255/HBASE-5564_trunk.4_final.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1326//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1326//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1326//console

        This message is automatically generated.

        stack added a comment -

        Same as v4 but uses Bytes to try and get rid of the findbug warnings (Laxman, you have probably noticed our new 'sensitivity' to the findbug output... you did not introduce these warnings, they were in the original code – but let me try and get rid of them w/ this v5 ... thanks).

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520371/5564v5.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1336//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1336//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1336//console

        This message is automatically generated.

        Hide
        ramkrishna.s.vasudevan added a comment -

        +1 on v5.

Thanks for the patch Laxman and Stack.
@Stack
So any new patches we submit should not introduce findbugs warnings, even in old existing code? OK, I will take care of this and ensure people submitting patches here also do that. Thanks.

        Laxman added a comment -

@stack, thanks for your review and for clearing the findbugs warnings.
I had avoided those changes since they are unrelated to this JIRA.

        @ram, thanks for reviewing the patch.

        Uma Maheswara Rao G added a comment -

Also don't forget to update the count in test-patch.properties to the new total if we fix any existing findbugs warnings.

        +Uma
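As a concrete illustration: in the Hadoop-derived precommit setup of that era, the baseline warning counts lived in dev-support/test-patch.properties, and the findbugs check failed when the measured total exceeded the baseline. The key names and values below are assumptions for illustration, not the file's verbatim contents.

    # dev-support/test-patch.properties (illustrative; keys and values assumed)
    # Precommit compares the current findbugs total against this baseline,
    # so fixing pre-existing warnings means lowering the number here.
    OK_FINDBUGS_WARNINGS=520
    OK_RELEASEAUDIT_WARNINGS=0
    OK_JAVADOC_WARNINGS=0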

        stack added a comment -

Committed to trunk. Thanks for the patch Laxman. Thanks for the reminder on updating the count, Uma. It seems my minor addition only stopped the count from rising, so I didn't have to change the findbugs count (the test build was reporting two new findbugs warnings when in fact there were none; a variable name change was making it think the count had gone up).

        Jonathan Hsieh added a comment -

        maybe not worry about find bugs for normal patches? (ideally it does go up though) the find bugs number isn't the focus of this patch.

        Jonathan Hsieh added a comment -

I meant to say "ideally it does not go up". I think stack's action (he didn't lower the findbugs number on a normal patch) captured the same idea.

        Hudson added a comment -

        Integrated in HBase-TRUNK-security #154 (See https://builds.apache.org/job/HBase-TRUNK-security/154/)
        HBASE-5564 Bulkload is discarding duplicate records (Revision 1306907)

        Result = FAILURE
        stack :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Laxman added a comment -

        Thanks for the commit stack.

        Hudson added a comment -

        Integrated in HBase-TRUNK #2698 (See https://builds.apache.org/job/HBase-TRUNK/2698/)
        HBASE-5564 Bulkload is discarding duplicate records (Revision 1306907)

        Result = FAILURE
        stack :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Ted Yu added a comment -

        By reverting the patch applied to trunk, TestImportTsv#testMROnTableWithCustomMapper passes.

        stack added a comment -

        Thanks Ted. I reverted the patch for now. Laxman, mind taking a looksee at the failures Ted found in TestImportTsv#testMROnTableWithCustomMapper?

        Hudson added a comment -

        Integrated in HBase-TRUNK #2701 (See https://builds.apache.org/job/HBase-TRUNK/2701/)
        HBASE-5564 Bulkload is discarding duplicate records (Revision 1307629)

        Result = SUCCESS
        stack :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #155 (See https://builds.apache.org/job/HBase-TRUNK-security/155/)
        HBASE-5564 Bulkload is discarding duplicate records (Revision 1307629)

        Result = SUCCESS
        stack :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Laxman added a comment -

Yes Stack, I will take a look. The changes in this patch are in the default mapper; IMO they shouldn't cause failures in the custom mapper test.

        stack added a comment -

        @Laxman Any luck?

        ramkrishna.s.vasudevan added a comment -

We found the problem. It was caused by a space introduced in the latest patch in the test case argument
'" = org.apache.hadoop.hbase.mapreduce.TsvImporterCustomTestMapper",'. There should not be any space before or after '='.

        Will rebase the patch so that it can be recommitted.
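To make the failure mode concrete: the test passes the custom mapper to ImportTsv as a '-Dkey=value' style argument, and whitespace around '=' becomes part of the key and the value, so the lookup for importtsv.mapper.class never matches and the custom mapper is silently ignored. A minimal sketch of the argument strings involved (the surrounding test harness is assumed):

    // Illustration only; the real strings live in TestImportTsv.
    public class ArgSpacingSketch {
      public static void main(String[] args) {
        // Broken: the spaces are kept, so the key gets a trailing space and
        // the value a leading one; neither matches what the job looks up.
        String broken =
            "-Dimporttsv.mapper.class = org.apache.hadoop.hbase.mapreduce.TsvImporterCustomTestMapper";
        // Fixed: no whitespace before or after '='.
        String fixed =
            "-Dimporttsv.mapper.class=org.apache.hadoop.hbase.mapreduce.TsvImporterCustomTestMapper";
        System.out.println(broken);
        System.out.println(fixed);
      }
    }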

        ramkrishna.s.vasudevan added a comment -

New patch for trunk. This time the test cases should run. Please review and provide your comments.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12531698/HBASE-5564.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2136//console

        This message is automatically generated.

        ramkrishna.s.vasudevan added a comment -

All the tests are passing. Will integrate tomorrow if there are no objections.

        Ted Yu added a comment -

        Minor comment:

        +          throw new BadTsvLineException("Invalid timestamp");
        

Can the timestamp string be included?
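A minimal sketch of what the suggestion amounts to, assuming a helper-method shape and variable names (the real parsing lives in ImportTsv's TsvParser, and BadTsvLineException is its existing exception type):

    // Sketch only: carry the offending value in the message so a failed
    // import can be diagnosed from the log alone.
    static long parseTimestamp(String timestampStr) throws BadTsvLineException {
      try {
        return Long.parseLong(timestampStr);
      } catch (NumberFormatException nfe) {
        throw new BadTsvLineException("Invalid timestamp " + timestampStr);
      }
    }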

        ramkrishna.s.vasudevan added a comment -

OK, I will make that change and re-upload the patch. Thanks Ted.

        ramkrishna.s.vasudevan added a comment -

Updated patch addressing Ted's comments. This is what I am planning to commit if there are no objections.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12531854/HBASE-5564_1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.coprocessor.TestClassLoading

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2153//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2153//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2153//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2153//console

        This message is automatically generated.

        stack added a comment -

        +1 on commit.

        ramkrishna.s.vasudevan added a comment -

        Committed to trunk.
        Thanks for the patch Laxman.
        Thanks for the review Stack, Ted, Lars, Todd, Jesse and Anoop.

        Hudson added a comment -

        Integrated in HBase-TRUNK #3030 (See https://builds.apache.org/job/HBase-TRUNK/3030/)
        HBASE-5564 Bulkload is discarding duplicate records

        Submitted by:Laxman
        Reviewed by:iStack, Ted, Ram (Revision 1350691)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #55 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/55/)
        HBASE-5564 Bulkload is discarding duplicate records

        Submitted by:Laxman
        Reviewed by:iStack, Ted, Ram (Revision 1350691)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Hudson added a comment -

        Integrated in HBase-0.94 #834 (See https://builds.apache.org/job/HBase-0.94/834/)
        HBASE-7793 Port HBASE-5564 Bulkload is discarding duplicate records to 0.94 (Ted Yu) (Revision 1443842)

        Result = ABORTED
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Hudson added a comment -

        Integrated in HBase-0.94-security #109 (See https://builds.apache.org/job/HBase-0.94-security/109/)
        HBASE-7793 Port HBASE-5564 Bulkload is discarding duplicate records to 0.94 (Ted Yu) (Revision 1443842)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Hudson added a comment -

        Integrated in HBase-0.94 #835 (See https://builds.apache.org/job/HBase-0.94/835/)
        HBASE-7793 Port HBASE-5564 Bulkload is discarding duplicate records to 0.94 (Ted Yu) (Revision 1443842)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        Hudson added a comment -

        Integrated in HBase-0.94-security-on-Hadoop-23 #12 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/12/)
        HBASE-7793 Port HBASE-5564 Bulkload is discarding duplicate records to 0.94 (Ted Yu) (Revision 1443842)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
        stack added a comment -

        Marking closed.


  People

  • Assignee:
    Laxman
  • Reporter:
    Laxman
  • Votes:
    0
  • Watchers:
    14
