  1. HBase
  2. HBASE-21098

Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 1.4.8, 2.1.1
    • Fix Version/s: 3.0.0, 2.2.0, 1.4.9
    • Component/s: None
    • Labels:
    • Release Note:
      It is recommended to place the working directory on-cluster on HDFS as doing so has shown a strong performance increase due to data locality. It is important to note that the working directory should not overlap with any existing directories as the working directory will be cleaned out during the snapshot process. Beyond that, any well-named directory on HDFS should be sufficient.
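
      The working directory described above is configurable. The sketch below assumes the property name `hbase.snapshot.working.dir` introduced by this patch; the NameNode address and path are hypothetical examples:

      ```xml
      <!-- hbase-site.xml: place the snapshot working directory on HDFS
           while hbase.rootdir remains on S3. The path below is a
           hypothetical example; per the release note, any well-named
           HDFS directory that does not overlap existing directories
           will do, since the working directory is cleaned out during
           the snapshot process. -->
      <property>
        <name>hbase.snapshot.working.dir</name>
        <value>hdfs://namenode:8020/hbase-snapshot-working-dir</value>
      </property>
      ```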

      Description

      When using Apache HBase, the snapshot feature can be used for point-in-time recovery. To do this, HBase creates a manifest of all the files in all of the regions so that those files can be referenced again when a user restores a snapshot. With HBase's S3 storage mode, developers can store their data off-cluster on Amazon S3. However, S3 is inefficient as a file system for some operations, most notably renames. Most Hadoop ecosystem applications use an atomic rename as a method of committing data; on S3, however, a rename is a separate copy and then a delete of every file, which is not atomic and, in fact, quite costly. In addition, puts and deletes on S3 incur latency that traditional filesystems do not when manipulating the per-region snapshot manifests to consolidate them into a single manifest. When HBase-on-S3 users have a significant number of regions, the puts, deletes, and renames (the final commit stage of the snapshot) become the bottleneck, causing snapshots to take many minutes or even hours to complete.

      The purpose of this patch is to increase the overall performance of snapshots while utilizing HBase on S3 through the use of a temporary directory for the snapshots that exists on a traditional filesystem like HDFS to circumvent the bottlenecks.
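
      From the client's perspective the workflow is unchanged: only the final commit moves the completed snapshot from the HDFS working directory to the S3 root. The commands below use the standard HBase shell; the table and snapshot names are hypothetical:

      ```shell
      # Hypothetical example: snapshot a table while hbase.rootdir is on S3.
      # With the working directory on HDFS, region manifests are written and
      # consolidated on HDFS, and only the finished snapshot is moved to S3.
      hbase shell <<'EOF'
      snapshot 'my_table', 'my_table_snapshot'
      list_snapshots
      EOF
      ```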

        Attachments

        1. HBASE-21098.branch-1.001.patch
          84 kB
          Zach York
        2. HBASE-21098.branch-1.002.patch
          85 kB
          Zach York
        3. HBASE-21098.master.001.patch
          76 kB
          Tyler Mi
        4. HBASE-21098.master.002.patch
          76 kB
          Tyler Mi
        5. HBASE-21098.master.003.patch
          76 kB
          Tyler Mi
        6. HBASE-21098.master.004.patch
          75 kB
          Tyler Mi
        7. HBASE-21098.master.005.patch
          75 kB
          Tyler Mi
        8. HBASE-21098.master.006.patch
          75 kB
          Tyler Mi
        9. HBASE-21098.master.007.patch
          78 kB
          Tyler Mi
        10. HBASE-21098.master.008.patch
          77 kB
          Tyler Mi
        11. HBASE-21098.master.009.patch
          77 kB
          Tyler Mi
        12. HBASE-21098.master.010.patch
          79 kB
          Tyler Mi
        13. HBASE-21098.master.011.patch
          78 kB
          Tyler Mi
        14. HBASE-21098.master.012.patch
          79 kB
          Tyler Mi
        15. HBASE-21098.master.013.patch
          79 kB
          Tyler Mi

              People

              • Assignee:
                mtylr Tyler Mi
                Reporter:
                mtylr Tyler Mi
              • Votes: 0
              • Watchers: 9
