Uploaded image for project: 'Flume'
  1. Flume
  2. FLUME-2062

make it possible for HBase sink to deposit event headers into corresponding column qualifiers

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.1
    • Fix Version/s: 1.4.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      It would be nice for the HBase sink to parse the event headers and deposit them into the corresponding column qualifiers of a given column family.

        Activity

        Hide
        hudson Hudson added a comment -

        Integrated in flume-trunk #417 (See https://builds.apache.org/job/flume-trunk/417/)
        FLUME-2062. Make it possible for HBase sink to deposit event headers into corresponding column qualifiers (Revision eefefa941a60c0982f0957804be0cafb4d83e46e)

        Result = SUCCESS
        hshreedharan : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=eefefa941a60c0982f0957804be0cafb4d83e46e
        Files :

        • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.java
        • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.java
        Show
        hudson Hudson added a comment - Integrated in flume-trunk #417 (See https://builds.apache.org/job/flume-trunk/417/ ) FLUME-2062 . Make it possible for HBase sink to deposit event headers into corresponding column qualifiers (Revision eefefa941a60c0982f0957804be0cafb4d83e46e) Result = SUCCESS hshreedharan : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=eefefa941a60c0982f0957804be0cafb4d83e46e Files : flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/RegexHbaseEventSerializer.java flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestRegexHbaseEventSerializer.java
        Hide
        hshreedharan Hari Shreedharan added a comment -

        I updated the configuration param to depositHeaders to match the current camel case format for config params and committed this, rev: eefefa941a60c0982f0957804be0cafb4d83e46e. Thanks Roman!

        Show
        hshreedharan Hari Shreedharan added a comment - I updated the configuration param to depositHeaders to match the current camel case format for config params and committed this, rev: eefefa941a60c0982f0957804be0cafb4d83e46e. Thanks Roman!
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Seems like I was wrong. An empty map seems to be initialized anyway, it is replaced only if headers is not null. So we are good.

        so +1 on the patch. Running tests now.

        Show
        hshreedharan Hari Shreedharan added a comment - Seems like I was wrong. An empty map seems to be initialized anyway, it is replaced only if headers is not null. So we are good. so +1 on the patch. Running tests now.
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Roman - sorry for not catching this one earlier, but it looks like you need to test whether headers is null before trying to copy the headers into the put. Flume does allow the headers map to be null (see EventBuilder#withBody methods), so this code:

              if (depositHeaders) {
                for (Map.Entry<String, String> entry : headers.entrySet()) {
                  put.add(cf, entry.getKey().getBytes(Charsets.UTF_8), entry.getValue().getBytes(Charsets.UTF_8));
                }
              }
        

        should be something like:

              if (depositHeaders && headers != null) {
                for (Map.Entry<String, String> entry : headers.entrySet()) {
                  put.add(cf, entry.getKey().getBytes(Charsets.UTF_8), entry.getValue().getBytes(Charsets.UTF_8));
                }
              }
        
        Show
        hshreedharan Hari Shreedharan added a comment - Roman - sorry for not catching this one earlier, but it looks like you need to test whether headers is null before trying to copy the headers into the put. Flume does allow the headers map to be null (see EventBuilder#withBody methods), so this code: if (depositHeaders) { for (Map.Entry< String , String > entry : headers.entrySet()) { put.add(cf, entry.getKey().getBytes(Charsets.UTF_8), entry.getValue().getBytes(Charsets.UTF_8)); } } should be something like: if (depositHeaders && headers != null ) { for (Map.Entry< String , String > entry : headers.entrySet()) { put.add(cf, entry.getKey().getBytes(Charsets.UTF_8), entry.getValue().getBytes(Charsets.UTF_8)); } }
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Yep, that makes sense. So I will go ahead and run tests and commit this for now, and we can follow up with another jira as you suggest.

        Show
        hshreedharan Hari Shreedharan added a comment - Yep, that makes sense. So I will go ahead and run tests and commit this for now, and we can follow up with another jira as you suggest.
        Hide
        rvs Roman Shaposhnik added a comment -

        Hari Shreedharan good point – I've noticed that the existing code hardcodes UTF8 and took it from there.

        However, now that you've suggested it – how about we make it tweakable not just for the patch I've provided, but in general for everything in RegexHbaseEventSerializer (with the default being Charset.defaultCharset()) ? If that sounds like a good idea – I'd be more than happy to file a new JIRA and provide a pretty trivial patch ASAP.

        Show
        rvs Roman Shaposhnik added a comment - Hari Shreedharan good point – I've noticed that the existing code hardcodes UTF8 and took it from there. However, now that you've suggested it – how about we make it tweakable not just for the patch I've provided, but in general for everything in RegexHbaseEventSerializer (with the default being Charset.defaultCharset()) ? If that sounds like a good idea – I'd be more than happy to file a new JIRA and provide a pretty trivial patch ASAP.
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Hi Roman,

        This looks good. One question though - do you think it makes sense to make the charset configurable?

        Show
        hshreedharan Hari Shreedharan added a comment - Hi Roman, This looks good. One question though - do you think it makes sense to make the charset configurable?
        Hide
        rvs Roman Shaposhnik added a comment -

        Attaching a pretty trivial patch complete with a unit test.

        Show
        rvs Roman Shaposhnik added a comment - Attaching a pretty trivial patch complete with a unit test.

          People

          • Assignee:
            rvs Roman Shaposhnik
            Reporter:
            rvs Roman Shaposhnik
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development