HBase
  HBASE-4435

Add Group By functionality using Coprocessors

    Details

      Description

      Adds Group By-like functionality to HBase, using the Coprocessor framework.

      It provides the ability to group the result set on one or more columns (groupBy families). It computes statistics (max, min, sum, count, sum of squares, number missing) for a second column, called the stats column.
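A sketch of what such a per-group stats accumulator computes (plain Java, not the patch's actual class — field names are assumptions):

```java
public class StatsAccumulatorExample {
    /** Accumulates the statistics listed above for one group. */
    static class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0, sumOfSquares = 0;
        long count = 0, missing = 0;

        void accumulate(Double value) {
            if (value == null) { missing++; return; }  // stats column absent for this row
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            sumOfSquares += value * value;
            count++;
        }
    }

    public static void main(String[] args) {
        Stats s = new Stats();
        for (Double v : new Double[] {3.0, null, 4.0}) s.accumulate(v);
        System.out.printf("min=%.1f max=%.1f sum=%.1f sumsq=%.1f count=%d missing=%d%n",
            s.min, s.max, s.sum, s.sumOfSquares, s.count, s.missing);
    }
}
```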

      I've provided two implementations.

      1. In the first, you specify a single group-by column and a stats field:

      statsMap = gbc.getStats(tableName, scan, groupByFamily, groupByQualifier, statsFamily, statsQualifier, statsFieldColumnInterpreter);

      The result is a map from the Group By column value (as a String) to a GroupByStatsValues object. The GroupByStatsValues object has the max, min, sum, etc. of the stats column for that group.
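For illustration, a minimal sketch of consuming such a result map — `StatsValues` here is a hypothetical stand-in for the patch's GroupByStatsValues, just enough to show the shape of the result:

```java
import java.util.HashMap;
import java.util.Map;

public class GroupByResultExample {
    // Hypothetical stand-in for the patch's GroupByStatsValues.
    static class StatsValues {
        final double min, max, sum;
        final long count;
        StatsValues(double min, double max, double sum, long count) {
            this.min = min; this.max = max; this.sum = sum; this.count = count;
        }
        double mean() { return count == 0 ? Double.NaN : sum / count; }
    }

    public static void main(String[] args) {
        // The client call returns one entry per distinct group-by value.
        Map<String, StatsValues> statsMap = new HashMap<>();
        statsMap.put("us-east", new StatsValues(1.0, 9.0, 15.0, 3));
        statsMap.put("us-west", new StatsValues(2.0, 4.0, 6.0, 2));

        for (Map.Entry<String, StatsValues> e : statsMap.entrySet()) {
            System.out.printf("%s: min=%.1f max=%.1f mean=%.1f%n",
                e.getKey(), e.getValue().min, e.getValue().max, e.getValue().mean());
        }
    }
}
```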

      2. The second implementation allows you to specify a list of group-by columns and a stats field. The list of group-by columns is expected to contain {column family, qualifier} pairs.

      statsMap = gbc.getStats(tableName, scan, listOfGroupByColumns, statsFamily, statsQualifier, statsFieldColumnInterpreter);
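The list-of-columns argument can be built like this — a sketch in plain Java, with made-up column names; each pair is a `byte[][]` of length 2 holding {family, qualifier}:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class GroupByColumnsExample {
    /** Builds one {family, qualifier} pair as a byte[][] of length 2. */
    public static byte[][] pair(String family, String qualifier) {
        return new byte[][] {
            family.getBytes(StandardCharsets.UTF_8),
            qualifier.getBytes(StandardCharsets.UTF_8)
        };
    }

    public static void main(String[] args) {
        // Group by two columns: info:region and info:product (hypothetical names).
        List<byte[][]> groupByColumns = new ArrayList<>();
        groupByColumns.add(pair("info", "region"));
        groupByColumns.add(pair("info", "product"));
        System.out.println("group-by columns: " + groupByColumns.size());
    }
}
```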

      The GroupByStatsValues code is adapted from the Solr Stats component.

      1. HBase-4435.patch
        29 kB
        Nichole Treadway
      2. HBASE-4435-v2.patch
        52 kB
        Aaron Tokhy

        Issue Links

          Activity

          Aaron Tokhy added a comment -

          In addition to this one, I was working on other coprocessors: one performs reverse indexing for range queries, and there are filters that can be used to scan only a small portion of a region given a key list.

          I created a GitHub repository for this work, so that I could split it out into individual JIRA tickets.

          https://github.com/atokhy/secondary-index-coprocessor

          I'll be working with HBase 0.94.1 until I have a complete working implementation, and will eventually rewrite most of it to use Google's protobuf API in a later attempt.

          Anoop Sam John added a comment -

          It may be good to add this to the CP examples section. Is someone working on this?

          Aaron Tokhy added a comment -

          Thanks for the quick review and feedback! I'll soon update JIRA with a new patch based off of SVN trunk, though not at the moment. I'll also have to clean up some of the code.

          I may also change a few other things, such as using HashedBytes instead of Text to be able to perform roll-ups of types other than UTF-8 strings.

          Ted Yu added a comment -

          I didn't find any test in the patch. It would be difficult for a feature to be accepted without new tests.
          Should GroupByStatsValues be named GroupByStats (since stats imply some values)?

          + * Copyright 2012 The Apache Software Foundation
          

          The above line is no longer needed in license header.

          BigDecimalColumnInterpreter is covered in HBASE-6669. To make the workload reasonable for this JIRA, you can exclude it from patch.

          +public class CharacterColumnInterpreter implements ColumnInterpreter<Character, Character> {
          

          Add annotation for audience and stability for public classes.

          In GroupByClient.java, the following import can be removed:

          +import com.sun.istack.logging.Logger;
          
          +    Map<Text, GroupByStatsValues<T, S>> getStats(
          +      final byte[] tableName, final Scan scan, 
          +      final List<byte [][]> groupByTuples, final byte[][] statsTuple, 
          

          The @param for the above method doesn't match actual parameters - probably you changed API in later iteration.

          +    class RowNumCallback implements
          

          The above class can be made private.
          I think we should find a better name for the above class - it does aggregation.

          +        long bt = System.currentTimeMillis();
          

          Please use EnvironmentEdge instead.

          +    table.close();
          

          Please enclose the above in finally clause.
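The resource-handling fix being asked for looks roughly like this — a plain-Java sketch where `Table` is a stand-in for HTable (on Java 7+ a try-with-resources achieves the same):

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseInFinallyExample {
    // Stand-in for HTable, just to show the resource-handling pattern.
    static class Table implements Closeable {
        boolean closed = false;
        void scan() { /* work that may throw */ }
        @Override public void close() { closed = true; }
    }

    public static boolean useTable() throws IOException {
        Table table = new Table();
        try {
            table.scan();
        } finally {
            table.close();  // runs even if scan() throws
        }
        return table.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("table closed: " + useTable());
    }
}
```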

          Ted Yu added a comment -

          Thanks for the patch.
          Can you provide trunk patch following the example of:
          HBASE-6785 'Convert AggregateProtocol to protobuf defined coprocessor service'

          Will provide comments soon.

          For patch of this size, review board (https://reviews.apache.org) would help reviewers.

          Aaron Tokhy added a comment -

          I have a newer version of the patch:

          Improvements:

          1) Added implementations of ColumnInterpreter classes so both AggregationClient and GroupByClient could perform aggregations on Long, Short, Integer, Double, Float, Character (or unsigned short), and BigDecimal types.

          2) The GroupByStatsValues class is a Java generic constrained to types that extend the 'Number' class. This way the constraint is enforced for those types at compile time.

          3) Previously, a HashMap was returned at the end of each RPC call. HashMap is serialized via java.io.Serializable, which is relatively heavyweight. Switched to the Hadoop Writable interface, so all objects passed between clients and regionservers now use Writable.
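The Writable pattern — an object writes its fields to a DataOutput and reads them back from a DataInput — can be sketched without Hadoop on the classpath. This is a stand-in for what the patch's GroupByStatsValues serialization would look like; field names are assumptions:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableStyleExample {
    /** Minimal stats holder serialized in the Hadoop-Writable style. */
    static class StatsValues {
        double min, max, sum;
        long count;

        void write(DataOutput out) throws IOException {
            out.writeDouble(min);
            out.writeDouble(max);
            out.writeDouble(sum);
            out.writeLong(count);
        }

        void readFields(DataInput in) throws IOException {
            min = in.readDouble();
            max = in.readDouble();
            sum = in.readDouble();
            count = in.readLong();
        }
    }

    /** Serializes to a byte buffer and deserializes into a fresh object. */
    public static StatsValues roundTrip(StatsValues s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        s.write(new DataOutputStream(buf));
        StatsValues copy = new StatsValues();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        StatsValues s = new StatsValues();
        s.min = 1; s.max = 9; s.sum = 15; s.count = 3;
        System.out.println("round-tripped count: " + roundTrip(s).count);
    }
}
```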

          4) Fixed some validateParameter bugs in the previous patch which would allow selections of column qualifiers not found in the Scan object to go through.

          Caveats:

          1) This works well if your result set fits into memory, as group-by values are aggregated into a HashMap on the client. Therefore, if the cardinality of the aggregation table is too high, you may get an OOME.

          2) All aggregations are calculated by the 'GroupByStatsValues' container. Perhaps at object construction, a 'statsvalues' can be constructed to only perform some of the aggregations instead of all of them at the same time. However this operation is Scan (IO) bound, so improvements would be minimal here.

          3) Like all coprocessors that accept a Scan object, if the aggregation is performing a full table scan, this will run on all regionservers. Each region level coprocessor is loaded into an IPC handler (default of 10) on the regionserver. If the regionserver has more regions than IPC handlers, only 10 group by operations will run at a time.
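Caveat 1 follows from how the client combines per-region results: each region returns a partial map, and the client folds them into one HashMap keyed by group value, so memory grows with the number of distinct groups. A sketch, using plain doubles (a running sum) in place of the patch's generic stats objects:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ClientMergeExample {
    /** Folds per-region partial sums into one client-side map.
        Memory use grows with the number of distinct group values. */
    public static Map<String, Double> merge(List<Map<String, Double>> perRegion) {
        Map<String, Double> merged = new HashMap<>();
        for (Map<String, Double> partial : perRegion) {
            for (Map.Entry<String, Double> e : partial.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Double> region1 = Map.of("a", 1.0, "b", 2.0);
        Map<String, Double> region2 = Map.of("a", 3.0);
        System.out.println(merge(List.of(region1, region2)));
    }
}
```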

          Depending on your table schema, region size and blockCacheHitRatio, your mileage may vary. If data can be pre-aggregated for a group-by operation, this patch would be handy for aggregating a single column value projection of the original full table. A column-oriented representation of the original table would work well in this case, or possibly a client/coprocessor-managed secondary index.

          The patch applies cleanly onto HBase 0.92.1 and HBase 0.94.1.

          Ted Yu added a comment -

          @Nichole:
          The attached patch is half a year old. Do you have a newer version?

          lifeng added a comment -

          When can this patch be put into HBase?

          lifeng added a comment -

          60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel

          Jeff Hammerbacher added a comment -

          How does this approach compare to HBASE-1512?

          Nichole Treadway added a comment -

          My first patch included some additional unrelated changes to other parts of the code base that I did not want to include in this patch...sorry about that.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Nichole Treadway
            • Votes:
              1
            • Watchers:
              14
