  1. HBase
  2. HBASE-8084 Sundry mapreduce improvements
  3. HBASE-8074

Consolidate map-side features across mapreduce tools into a single place

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: mapreduce, Usability
    • Labels: None

    Description

      The mapreduce tools support a similar but divergent set of features for mapping over KeyValue data:

      • Export supports specifying a version-range window, application of a rowkey regex or prefix filter, and a "raw mode" that includes delete markers.
      • Import can apply an arbitrary filter and can also apply a "transform", renaming column families in the emitted KeyValues.
      • CopyTable allows specifying a version-range window, limiting to a fixed number of versions, a "raw mode", and column family transformation.
      • WALPlayer supports reading a time-range.
      • ImportTsv could incorporate a number of these features, especially the filter and transform capabilities. That would let a user avoid implementing a custom mapper when the existing parser is sufficient apart from minor adjustments to the data.

      The proposal is to create a single implementation of these features with a single configuration interface. Ideally, such an implementation would also be exposed via the common utility classes, such as IdentityTableMapper.
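
To make the proposal concrete, here is a minimal sketch of what one shared options object might look like. It is not taken from HBase: `MapSideOptions` and `SimpleCell` are hypothetical names, `SimpleCell` is a plain stand-in for a KeyValue, and the half-open time range is an assumption. It covers only the time-range window, row-prefix filter, raw mode, and column family rename described above; version limits and arbitrary filters are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a KeyValue; not an HBase class. */
final class SimpleCell {
    final String row;
    final String family;
    final String qualifier;
    final long timestamp;
    final boolean deleteMarker;

    SimpleCell(String row, String family, String qualifier,
               long timestamp, boolean deleteMarker) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.deleteMarker = deleteMarker;
    }
}

/** Hypothetical single configuration point for map-side features. */
final class MapSideOptions {
    long minTimestamp = 0L;              // inclusive (assumed convention)
    long maxTimestamp = Long.MAX_VALUE;  // exclusive (assumed convention)
    boolean rawMode = false;             // when true, delete markers pass through
    String rowPrefix = null;             // null means no prefix filter
    final Map<String, String> familyRenames = new HashMap<>();

    /** Returns the (possibly renamed) cell, or null if it is filtered out. */
    SimpleCell apply(SimpleCell cell) {
        if (cell.timestamp < minTimestamp || cell.timestamp >= maxTimestamp) {
            return null; // outside the time-range window
        }
        if (rowPrefix != null && !cell.row.startsWith(rowPrefix)) {
            return null; // row key does not match the prefix filter
        }
        if (cell.deleteMarker && !rawMode) {
            return null; // delete markers are only emitted in raw mode
        }
        // The "transform": rename the column family if a mapping exists.
        String family = familyRenames.getOrDefault(cell.family, cell.family);
        return new SimpleCell(cell.row, family, cell.qualifier,
                              cell.timestamp, cell.deleteMarker);
    }

    /** Applies the same options to a batch, dropping filtered cells. */
    List<SimpleCell> applyAll(List<SimpleCell> cells) {
        List<SimpleCell> out = new ArrayList<>();
        for (SimpleCell cell : cells) {
            SimpleCell result = apply(cell);
            if (result != null) {
                out.add(result);
            }
        }
        return out;
    }
}
```

Under this sketch, each tool's mapper would build one `MapSideOptions` from the job configuration and call `apply` on every cell, so a feature added once becomes available to Export, Import, CopyTable, and the others at the same time.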

            People

              Assignee: Unassigned
              Reporter: Nick Dimiduk (ndimiduk)
              Votes: 0
              Watchers: 4