HBase / HBASE-10615

Make LoadIncrementalHFiles skip reference files

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.96.0
    • Fix Version/s: 0.99.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      There are use cases where the source of HFiles for LoadIncrementalHFiles is a FileSystem copy-out/backup of an HBase table or of archived HFiles. For example:
      1. Copy-out of hbase.rootdir, table dir, region dir (after disable) or archive dir.
      2. ExportSnapshot

      In these cases the family dir may contain reference files.
      We have such use cases, and when we try to load the files back into HBase we get:

      Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://HDFS-AMR/tmp/restoreTemp/117182adfe861c5d2b607da91d60aa8a/info/aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd
              at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:570)
              at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:594)
              at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:636)
              at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:472)
              at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:393)
              at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:391)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
              at java.util.concurrent.FutureTask.run(FutureTask.java:149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
              at java.lang.Thread.run(Thread.java:738)
      Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 16715777 (expected to be between 2 and 2)
              at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:927)
              at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:426)
              at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:568)
      

      It is desirable and safe to skip these reference files, since they don't contain any real data for bulk-load purposes.
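A minimal sketch of the proposed skip, for illustration only and not necessarily what the attached patches do. It assumes a reference file is named `<hfile name>.<encoded parent region name>` with both parts in lowercase hex, which matches the file in the trace above. The class `ReferenceFileFilter` and its methods are hypothetical names. Real code would more likely call HBase's own `StoreFileInfo.isReference(Path)` instead of a hand-written pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase.
final class ReferenceFileFilter {
    // Assumption: reference files are named "<hex hfile name>.<hex parent region name>".
    // This is a simplification of the naming rules HBase itself applies.
    private static final Pattern REFERENCE_NAME =
        Pattern.compile("^[0-9a-f]+\\.[0-9a-f]+$");

    // True if the file name looks like a reference file under the assumption above.
    static boolean isReferenceName(String fileName) {
        return REFERENCE_NAME.matcher(fileName).matches();
    }

    // Returns the names from a family-dir listing that should be queued for
    // bulk load, logging and skipping anything that looks like a reference file.
    static List<String> loadableFiles(List<String> familyDirListing) {
        List<String> keep = new ArrayList<>();
        for (String name : familyDirListing) {
            if (isReferenceName(name)) {
                System.err.println("Skipping reference file " + name);
                continue;
            }
            keep.add(name);
        }
        return keep;
    }

    public static void main(String[] args) {
        // File names taken from the stack trace in the description.
        List<String> listing = Arrays.asList(
            "aed3d01648384b31b29e5bad4cd80bec",
            "aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd");
        System.out.println(loadableFiles(listing));
    }
}
```

Filtering at discovery time, before any file is opened, means a reference file never reaches the HFile reader, so the trailer check that fails in the trace above is not triggered.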

      1. HBASE-10615-trunk-v3.patch
        3 kB
        Jerry He
      2. HBASE-10615-trunk-v2.patch
        3 kB
        Jerry He
      3. HBASE-10615-trunk.patch
        2 kB
        Jerry He

          People

          • Assignee:
            Jerry He
            Reporter:
            Jerry He
          • Votes:
            0
            Watchers:
            7
