HBase / HBASE-11409

Add more flexibility for input directory structure to LoadIncrementalHFiles

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1
    • Fix Version/s: 1.4.1, 2.0.0-beta-1, 2.0.0
    • Component/s: None
    • Labels: None
    • Release Note:
      Allows users to bulk load entire tables from HDFS by specifying the -loadTable parameter. With it, you can pass in a table-level directory and have the column families of all regions bulk loaded. If you do not specify -loadTable, LoadIncrementalHFiles works as before. Note: the table must already exist when you run with -loadTable; the tool will not create one for you.

      Description

      Use case:

      We were trying to combine two very large tables into a single table. To do that, we ran jobs in one datacenter that populated certain column families and jobs in another datacenter that populated other column families. We then took snapshots and exported them to their respective datacenters. We wanted to simply take the snapshot restored on HDFS and use LoadIncrementalHFiles to merge the data.

      It would be nice if LoadIncrementalHFiles could run on a directory where the store files sit at a depth other than two, which is the only layout the current behavior supports.

      For snapshots, it would be nice to pass the directory of a snapshot restored to HDFS and have the tool run on it.
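
To make the idea of store file depth concrete, here is a minimal Java sketch. It is not the code from the attached patches: the class name StoreFileDepthSketch, the method groupByFamily, and the example paths are hypothetical, and it works on plain path strings rather than on HDFS. It assumes that depth counts path components below the base directory, so family/hfile is depth two and region/family/hfile (a table-level directory) is depth three.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only; a hypothetical helper, not part of HBase. */
public class StoreFileDepthSketch {

    /**
     * Groups file paths by column family, keeping only files that sit exactly
     * `depth` path components below baseDir. The family name is assumed to be
     * the file's parent directory.
     */
    public static Map<String, List<String>> groupByFamily(
            String baseDir, List<String> files, int depth) {
        if (depth < 2) {
            throw new IllegalArgumentException("depth must be at least 2 (family/hfile)");
        }
        String prefix = baseDir.endsWith("/") ? baseDir : baseDir + "/";
        Map<String, List<String>> byFamily = new TreeMap<>();
        for (String file : files) {
            if (!file.startsWith(prefix)) {
                continue; // not under the base directory
            }
            String[] parts = file.substring(prefix.length()).split("/");
            if (parts.length != depth) {
                continue; // file is not at the requested depth
            }
            String family = parts[depth - 2]; // parent directory of the file
            byFamily.computeIfAbsent(family, k -> new ArrayList<>()).add(file);
        }
        return byFamily;
    }

    public static void main(String[] args) {
        // Hypothetical table-level layout: table/region/family/hfile.
        List<String> files = Arrays.asList(
                "/restored/t1/region-a/cf1/hfile-1",
                "/restored/t1/region-a/cf2/hfile-2",
                "/restored/t1/region-b/cf1/hfile-3");
        // Depth 3 picks up store files across all regions, grouped by family.
        System.out.println(groupByFamily("/restored/t1", files, 3));
        // Depth 2 (family/hfile) finds nothing in this layout.
        System.out.println(groupByFamily("/restored/t1", files, 2));
    }
}
```

With a depth of two, the same call only matches files directly under a family directory, which is why a table-level directory yields nothing unless the depth is configurable.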

      I am attaching a patch where I parameterize the bulkLoad timeout as well as the default store file depth.

        Attachments

        1. HBASE-11409.v1.patch
          10 kB
          churro morales
        2. HBASE-11409.v2.patch
          11 kB
          churro morales
        3. HBASE-11409.v3.patch
          11 kB
          churro morales
        4. HBASE-11409.v4.patch
          12 kB
          churro morales
        5. HBASE-11409.v5.patch
          13 kB
          churro morales
        6. HBASE-11409.v6.branch-1.patch
          14 kB
          churro morales

              People

              • Assignee: churro morales (churromorales)
              • Reporter: churro morales (churromorales)
              • Votes: 0
              • Watchers: 9