HADOOP-15850: CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

    Description

      I was investigating a test failure of TestIncrementalBackupWithBulkLoad from HBase against Hadoop 3.1.1.

      HBase's MapReduceBackupCopyJob$BackupDistCp creates the listing file:

              LOG.debug("creating input listing " + listing + " , totalRecords=" + totalRecords);
              cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
              cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, totalRecords);
      

      For the test case, two bulk-loaded hfiles are in the listing:

      2018-10-13 14:09:24,123 DEBUG [Time-limited test] mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
      2018-10-13 14:09:24,125 DEBUG [Time-limited test] mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
      2018-10-13 14:09:24,125 DEBUG [Time-limited test] mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 2 files of 10242
      

      Later on, CopyCommitter#concatFileChunks throws the following exception:

      2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): job_local1795473782_0004
      java.io.IOException: Inconsistent sequence file: current chunk file org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_ length = 5100 aclEntries = null, xAttrs = null} doesnt match prior entry org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_ length = 5142 aclEntries = null, xAttrs = null}
        at org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
        at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
      

      The above warning shouldn't happen: the two bulk-loaded hfiles are independent files, not chunks of a single split file.
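      A minimal sketch of the check suggested by this issue's title, assuming the fix lands in CopyCommitter (the helper class below is hypothetical, not the committed patch): chunk concatenation should only be attempted when DistCp was actually run with -blocksperchunk, i.e. when the configured blocks-per-chunk value is greater than 0.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.tools.DistCpOptionSwitch;

        // Hypothetical helper illustrating the guard; commitJob would call
        // concatFileChunks() only when this returns true.
        public class ConcatGuard {
          static boolean shouldConcatChunks(Configuration conf) {
            // 0 (the default) means -blocksperchunk was not used, so no file
            // was split into chunks and there is nothing to concatenate.
            int blocksPerChunk = conf.getInt(
                DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
            return blocksPerChunk > 0;
          }
        }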

      From the contents of the two CopyListingFileStatus instances, we can see that isSplit() returns false for both. Otherwise, the following from toString() would have been logged:

          if (isSplit()) {
            sb.append(", chunkOffset = ").append(this.getChunkOffset());
            sb.append(", chunkLength = ").append(this.getChunkLength());
          }
      

      On the HBase side, we could specify one bulk-loaded hfile per job, as sketched below, but that defeats the purpose of using DistCp.
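      For illustration only, that per-job workaround would look roughly like the sketch below (the class and loop are hypothetical; DistCp, DistCpOptions.Builder and execute() are the existing Hadoop tools API). Submitting one DistCp job per bulk-loaded hfile keeps independent files out of the same listing, but gives up the batching that makes DistCp worthwhile.

        import java.util.Collections;
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.tools.DistCp;
        import org.apache.hadoop.tools.DistCpOptions;

        // Hypothetical sketch of the workaround: one DistCp job per hfile.
        public class PerFileCopy {
          static void copyEach(Configuration conf, Path[] hfiles, Path target) throws Exception {
            for (Path hfile : hfiles) {
              DistCpOptions options = new DistCpOptions.Builder(
                  Collections.singletonList(hfile), target).build();
              // Runs a separate MapReduce job for every file: correct, but slow.
              new DistCp(conf, options).execute();
            }
          }
        }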

      Attachments

        1. testIncrementalBackupWithBulkLoad-output.txt (1.09 MB) - Ted Yu
        2. HADOOP-15850.v2.patch (1 kB) - Ted Yu
        3. HADOOP-15850.v3.patch (1 kB) - Ted Yu
        4. HADOOP-15850.v4.patch (1 kB) - Ted Yu
        5. HADOOP-15850.v5.patch (2 kB) - Ted Yu
        6. HADOOP-15850.v6.patch (2 kB) - Ted Yu
        7. HADOOP-15850.branch-3.0.patch (2 kB) - Wei-Chiu Chuang

            People

              Assignee: Ted Yu (yuzhihong@gmail.com)
              Reporter: Ted Yu (yuzhihong@gmail.com)
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
                Resolved: