  HBase / HBASE-15482

Provide an option to skip calculating block locations for SnapshotInputFormat

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-beta-1, 2.0.0
    • Component/s: mapreduce
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When a MapReduce job reads from SnapshotInputFormat, it calculates the splits based on the block locations in order to get the best locality. However, this process can take a long time for large snapshots.

      In some setups, the computing layer (Spark, Hive, or Presto) runs outside the HBase cluster. In these scenarios, block locality doesn't matter. It would therefore be great to have an option to skip calculating the block locations for every job. That would be very useful for the Hive/Presto/Spark connectors.
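
      The requested behavior can be illustrated with a minimal, self-contained sketch. It is not the patch attached to this issue and does not use HBase or Hadoop classes. The class name, the configuration key, and the locator function are all hypothetical stand-ins; the actual key is whatever the committed patch defines. The sketch assumes the option defaults to "enabled" so that existing jobs keep their current behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: a plain-Java model of split calculation with an
// option to skip the block-location lookup. Not HBase code.
final class SnapshotSplitSketch {
    // Hypothetical key, used here purely for illustration.
    static final String LOCALITY_ENABLED_KEY = "sketch.snapshot.inputformat.locality.enabled";

    // One split per region, with the hosts that are preferred for reading it.
    static final class Split {
        final String region;
        final List<String> hosts;

        Split(String region, List<String> hosts) {
            this.region = region;
            this.hosts = hosts;
        }
    }

    // locator stands in for the expensive per-region block-location calculation.
    static List<Split> computeSplits(List<String> regions,
                                     Map<String, String> conf,
                                     Function<String, List<String>> locator) {
        // Assumed default: locality stays enabled unless explicitly turned off.
        boolean localityEnabled =
            Boolean.parseBoolean(conf.getOrDefault(LOCALITY_ENABLED_KEY, "true"));

        List<Split> splits = new ArrayList<>();
        for (String region : regions) {
            // When locality is disabled, the locator is never invoked and the
            // split carries no preferred hosts.
            List<String> hosts = localityEnabled
                ? locator.apply(region)
                : Collections.<String>emptyList();
            splits.add(new Split(region, hosts));
        }
        return splits;
    }
}
```

      In this model, a connector running outside the HBase cluster would set the key to "false" once in its job configuration. Every split then comes back with an empty host list, and the per-region lookup cost is avoided entirely.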

      Attachments

        1. 15482.v3.txt
          16 kB
          Ted Yu
        2. HBASE-15482.master.000.patch
          13 kB
          Xiang Li
        3. HBASE-15482.master.001.patch
          17 kB
          Xiang Li
        4. HBASE-15482.master.002.patch
          17 kB
          Xiang Li
        5. HBASE-15482.master.003.patch
          17 kB
          Xiang Li

          People

            Assignee: Xiang Li (xiangli)
            Reporter: Liyin Tang (liyin)
            Votes: 0
            Watchers: 11
