Accumulo
  1. Accumulo
  2. ACCUMULO-2539

SummingCombiner does not adhere to Combiner#COLUMNS_OPTION

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.4, 1.5.1
    • Fix Version/s: 1.4.5, 1.5.2, 1.6.0
    • Component/s: client
    • Labels:
      None

      Description

      In testing out the SummingCombiner, I set the Combiner#COLUMNS_OPTION which led me to believe that the SummingCombiner would limit columns to the set that is provided as an option through the IteratorSetting.

      In practice, this was not the case as Combiner#findTop isn't used, and thus that column list is not adhered to. I can get the expected functionality out of SummingCombiner by calling ScannerBase#fetchColumn or ScannerBase#fetchColumnFamily.

        Activity

        Hide
        ASF subversion and git services added a comment -

        Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/master from Christopher Tubbs
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ]

        Fix recent warnings introduced by several tickets

        ACCUMULO-2005
        ACCUMULO-2182
        ACCUMULO-2224
        ACCUMULO-2234
        ACCUMULO-2539

        Show
        ASF subversion and git services added a comment - Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/master from Christopher Tubbs [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ] Fix recent warnings introduced by several tickets ACCUMULO-2005 ACCUMULO-2182 ACCUMULO-2224 ACCUMULO-2234 ACCUMULO-2539
        Hide
        ASF subversion and git services added a comment -

        Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/1.6.0-SNAPSHOT from Christopher Tubbs
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ]

        Fix recent warnings introduced by several tickets

        ACCUMULO-2005
        ACCUMULO-2182
        ACCUMULO-2224
        ACCUMULO-2234
        ACCUMULO-2539

        Show
        ASF subversion and git services added a comment - Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/1.6.0-SNAPSHOT from Christopher Tubbs [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ] Fix recent warnings introduced by several tickets ACCUMULO-2005 ACCUMULO-2182 ACCUMULO-2224 ACCUMULO-2234 ACCUMULO-2539
        Hide
        ASF subversion and git services added a comment -

        Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/1.5.2-SNAPSHOT from Christopher Tubbs
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ]

        Fix recent warnings introduced by several tickets

        ACCUMULO-2005
        ACCUMULO-2182
        ACCUMULO-2224
        ACCUMULO-2234
        ACCUMULO-2539

        Show
        ASF subversion and git services added a comment - Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/1.5.2-SNAPSHOT from Christopher Tubbs [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ] Fix recent warnings introduced by several tickets ACCUMULO-2005 ACCUMULO-2182 ACCUMULO-2224 ACCUMULO-2234 ACCUMULO-2539
        Hide
        ASF subversion and git services added a comment -

        Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/1.4.5-SNAPSHOT from Christopher Tubbs
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ]

        Fix recent warnings introduced by several tickets

        ACCUMULO-2005
        ACCUMULO-2182
        ACCUMULO-2224
        ACCUMULO-2234
        ACCUMULO-2539

        Show
        ASF subversion and git services added a comment - Commit 4ee0f6494ea0d2265d930bce962ebcc3c7dbfe5c in accumulo's branch refs/heads/1.4.5-SNAPSHOT from Christopher Tubbs [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4ee0f64 ] Fix recent warnings introduced by several tickets ACCUMULO-2005 ACCUMULO-2182 ACCUMULO-2224 ACCUMULO-2234 ACCUMULO-2539
        Hide
        ASF subversion and git services added a comment -

        Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ]

        ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation

        Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns
        are only combined within a row. Only multiple versions for otherwise equal Keys are combined.

        Show
        ASF subversion and git services added a comment - Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ] ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns are only combined within a row. Only multiple versions for otherwise equal Keys are combined.
        Hide
        ASF subversion and git services added a comment -

        Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/1.6.0-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ]

        ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation

        Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns
        are only combined within a row. Only multiple versions for otherwise equal Keys are combined.

        Show
        ASF subversion and git services added a comment - Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/1.6.0-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ] ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns are only combined within a row. Only multiple versions for otherwise equal Keys are combined.
        Hide
        ASF subversion and git services added a comment -

        Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/1.5.2-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ]

        ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation

        Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns
        are only combined within a row. Only multiple versions for otherwise equal Keys are combined.

        Show
        ASF subversion and git services added a comment - Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/1.5.2-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ] ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns are only combined within a row. Only multiple versions for otherwise equal Keys are combined.
        Hide
        ASF subversion and git services added a comment -

        Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/1.4.5-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ]

        ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation

        Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns
        are only combined within a row. Only multiple versions for otherwise equal Keys are combined.

        Show
        ASF subversion and git services added a comment - Commit 88761e05bce7e71c89b1a6033ba74edfcd8db9a6 in accumulo's branch refs/heads/1.4.5-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=88761e0 ] ACCUMULO-2539 Expand javadoc for Combiner to be more specific about combination implementation Explicitly note that, most times, fetchColumnFamily or fetchColumns is desired and that columns are only combined within a row. Only multiple versions for otherwise equal Keys are combined.
        Hide
        Josh Elser added a comment - - edited

        Thanks for the feedback, Billie Rinaldi. I'll fix that stuff before committing.

        Edit: Regarding the imports, the new imports are from the Javadoc references. Eclipse isn't showing me any compiler warnings.

        Show
        Josh Elser added a comment - - edited Thanks for the feedback, Billie Rinaldi . I'll fix that stuff before committing. Edit: Regarding the imports, the new imports are from the Javadoc references. Eclipse isn't showing me any compiler warnings.
        Hide
        Billie Rinaldi added a comment -

        Combiner never combines values across keys with different columns. It combines values for keys that differ in timestamp only. So, if you specify a column family, every qualifier in that family will be combined individually, they won't be combined together into a single value for the family. Same with the all columns option – it doesn't combine all the columns into a single value, it's just an easier way to configure the same combiner for each individual column in the table. (There are also a couple of unused imports in the patch.)

        Show
        Billie Rinaldi added a comment - Combiner never combines values across keys with different columns. It combines values for keys that differ in timestamp only. So, if you specify a column family, every qualifier in that family will be combined individually, they won't be combined together into a single value for the family. Same with the all columns option – it doesn't combine all the columns into a single value, it's just an easier way to configure the same combiner for each individual column in the table. (There are also a couple of unused imports in the patch.)
        Hide
        Josh Elser added a comment -

        Billie Rinaldi how do these updates look to you?

        Show
        Josh Elser added a comment - Billie Rinaldi how do these updates look to you?
        Hide
        Billie Rinaldi added a comment -

        maybe I can write some clearer documentation here then

        +1

        Show
        Billie Rinaldi added a comment - maybe I can write some clearer documentation here then +1
        Hide
        Josh Elser added a comment -

        Billie Rinaldi Oh, I think I see. That option controls which columns will get combined together, but the extra columns that are fetched would still be returned, non-aggregated. I saw the option and expected it to do something that it wasn't – maybe I can write some clearer documentation here then.

        Show
        Josh Elser added a comment - Billie Rinaldi Oh, I think I see. That option controls which columns will get combined together, but the extra columns that are fetched would still be returned, non-aggregated. I saw the option and expected it to do something that it wasn't – maybe I can write some clearer documentation here then.
        Hide
        Billie Rinaldi added a comment -

        which led me to believe that the SummingCombiner would limit columns to the set that is provided as an option through the IteratorSetting

        This is not the case. The columns option is independent of which columns you are fetching. If you fetch the columns specified in the columns option, those columns will be combined, while other fetched columns will not.

        Show
        Billie Rinaldi added a comment - which led me to believe that the SummingCombiner would limit columns to the set that is provided as an option through the IteratorSetting This is not the case. The columns option is independent of which columns you are fetching. If you fetch the columns specified in the columns option, those columns will be combined, while other fetched columns will not.
        Hide
        Josh Elser added a comment -

        Actually, looking at things, I'm not entirely sure where the bug is but something isn't quite right. The following does not work:

        BatchScanner bs = getConnector().createBatchScanner(getTableName(), new Authorizations(), 2);
        bs.setRanges(Collections.singleton(new Range()));
        IteratorSetting combiner = new IteratorSetting(50, "combiner", SummingCombiner.class);
        LongCombiner.setEncodingType(combiner, LongCombiner.Type.VARLEN);
        LongCombiner.setColumns(combiner, Collections.singletonList(new Column(new Text("count"))));
        bs.addScanIterator(combiner);
        

        Adding in the following will cause it to work:

        bs.fetchColumnFamily(new Text("count"));
        
        Show
        Josh Elser added a comment - Actually, looking at things, I'm not entirely sure where the bug is but something isn't quite right. The following does not work: BatchScanner bs = getConnector().createBatchScanner(getTableName(), new Authorizations(), 2); bs.setRanges(Collections.singleton( new Range())); IteratorSetting combiner = new IteratorSetting(50, "combiner" , SummingCombiner.class); LongCombiner.setEncodingType(combiner, LongCombiner.Type.VARLEN); LongCombiner.setColumns(combiner, Collections.singletonList( new Column( new Text( "count" )))); bs.addScanIterator(combiner); Adding in the following will cause it to work: bs.fetchColumnFamily( new Text( "count" ));
        Hide
        Sean Busbey added a comment -

        pushed to 1.6.1

        Show
        Sean Busbey added a comment - pushed to 1.6.1
        Hide
        Josh Elser added a comment -

        Marked this as fix for 1.6.0. I'm ok with pushing this off to 1.6.1 if that's desirable.

        Show
        Josh Elser added a comment - Marked this as fix for 1.6.0. I'm ok with pushing this off to 1.6.1 if that's desirable.

          People

          • Assignee:
            Josh Elser
            Reporter:
            Josh Elser
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development