Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.99.0
    • Component/s: mapreduce
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      With this fix, Import supports:

      a) Row filtering through Filter#filterRowKey(byte[] buffer, int offset, int length).

      b) A durability parameter (for example, -Dimport.wal.durability=SKIP_WAL) that is applied while importing data into HBase. If the data does not need to be replicated to the DR cluster, or if the same import job will also be run on the DR cluster, consider SKIP_WAL durability for better performance.
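
      As an illustration of the durability parameter, the sketch below maps the configured string to a durability level. It is not the patch's code: the class ImportDurabilitySketch, the resolve helper, and the fallback to USE_DEFAULT are assumptions made for this example, and the local enum only mirrors the level names of HBase's Durability enum so that the sketch runs without HBase on the classpath.

```java
import java.util.Locale;

// Hypothetical stand-in for the import job's durability handling; not HBase code.
class ImportDurabilitySketch {

    // Local mirror of the level names in HBase's Durability enum.
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    // Property name taken from the release note.
    static final String WAL_DURABILITY = "import.wal.durability";

    // Resolve the configured value. Falling back to USE_DEFAULT when the
    // property is unset or blank is an assumption of this sketch.
    static Durability resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return Durability.USE_DEFAULT;
        }
        // valueOf throws IllegalArgumentException for an unknown level name.
        return Durability.valueOf(configured.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Simplification: reads the JVM system property. A real job would
        // read the value from its Hadoop Configuration instead.
        Durability durability = resolve(System.getProperty(WAL_DURABILITY));
        System.out.println("Durability for imported mutations: " + durability);
    }
}
```

      Running the sketch with -Dimport.wal.durability=SKIP_WAL selects SKIP_WAL. A misspelled level fails fast with an IllegalArgumentException rather than silently falling back to the default.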

      Description

      The following improvements can be made to the Import logic:

      a) Make the import extensible: change the filter from a static member of Import to an instance variable of the mapper, and make the mappers and the variables of interest protected.

      b) Make sure that Import calls the filter's filterRowKey method. This is useful when filtering an organization's data by row key, or when using filters such as PrefixFilter, which filter the data in filterRowKey rather than in filterKeyValue. The existing test case TestImportExport#testWithFilter relies on this behavior, but it has passed so far only because a single row is inserted into the table.

      c) Provide an option to specify the durability during the import. Specifying SKIP_WAL as the durability would considerably improve the performance of a restore. Lars Hofhansl suggested that this should be a parameter to the import.

      d) Do some minor refactoring to avoid building a comma-separated string for the filter args.
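
      Points a) and b) can be sketched as follows. This is a minimal illustration, not the attached patch: RowKeyFilterSketch, RowFilter, and PrefixRowFilter are hypothetical stand-ins written so the example runs without HBase. Only the filterRowKey(byte[] buffer, int offset, int length) signature comes from the issue, and the sketch assumes that a return value of true means the row is excluded. The filter is held in an instance field, and the row key is checked once per row before any per-cell filtering would happen.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; none of these types are HBase classes.
class RowKeyFilterSketch {

    // Mirrors the signature named in the issue; true means "exclude this row".
    interface RowFilter {
        boolean filterRowKey(byte[] buffer, int offset, int length);
    }

    // Simplified prefix filter that decides purely on the row key.
    static final class PrefixRowFilter implements RowFilter {
        private final byte[] prefix;

        PrefixRowFilter(byte[] prefix) {
            this.prefix = prefix.clone();
        }

        @Override
        public boolean filterRowKey(byte[] buffer, int offset, int length) {
            if (length < prefix.length) {
                return true; // key is shorter than the prefix, so it cannot match
            }
            for (int i = 0; i < prefix.length; i++) {
                if (buffer[offset + i] != prefix[i]) {
                    return true; // mismatch: exclude the row
                }
            }
            return false; // prefix matches: keep the row
        }
    }

    // Instance field rather than a static member, so each importer owns its filter.
    protected final RowFilter filter;

    RowKeyFilterSketch(RowFilter filter) {
        this.filter = filter;
    }

    // Returns the row keys that survive the row-key check.
    List<String> importRows(List<String> rowKeys) {
        List<String> kept = new ArrayList<>();
        for (String rowKey : rowKeys) {
            byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
            // Check the row key first; a per-cell filter would never see excluded rows.
            if (filter != null && filter.filterRowKey(key, 0, key.length)) {
                continue;
            }
            kept.add(rowKey);
        }
        return kept;
    }

    public static void main(String[] args) {
        RowFilter orgOne = new PrefixRowFilter("org1-".getBytes(StandardCharsets.UTF_8));
        List<String> rows = Arrays.asList("org1-row1", "org2-row1", "org1-row2");
        // prints [org1-row1, org1-row2]
        System.out.println(new RowKeyFilterSketch(orgOne).importRows(rows));
    }
}
```

      The example uses several rows on purpose: with a single matching row, an importer that skipped the row-key check would produce the same result, which is how a one-row test can hide the problem.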

        Attachments

        1. HBASE-10416-rev1.patch
          20 kB
          Vasu Mariyala
        2. HBASE-10416.patch
          19 kB
          Vasu Mariyala


            People

            • Assignee:
              vmariyala Vasu Mariyala
              Reporter:
              vasu.mariyala@gmail.com Vasu Mariyala