Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0
    • Component/s: Coprocessors
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      A coprocessor to do basic aggregating; max, min, counts, etc.

      Description

      Chatting with jgray and holstad at the kitchen table about counts, sums, and other aggregating facility, facility generally where you want to calculate some meta info on your table, it seems like it wouldn't be too hard making a filter type that could run a function server-side and return the result ONLY of the aggregation or whatever.

      For example, say you just want to count rows, currently you scan, server returns all data to client and count is done by client counting up row keys. A bunch of time and resources have been wasted returning data that we're not interested in. With this new filter type, the counting would be done server-side and then it would make up a new result that was the count only (kinda like mysql when you ask it to count, it returns a 'table' with a count column whose value is count of rows). We could have it so the count was just done per region and return that. Or we could maybe make a small change in scanner too so that it aggregated the per-region counts.

      1. 1512.zip
        17 kB
        Himanshu Vashishtha
      2. patch-1512.txt
        25 kB
        Himanshu Vashishtha
      3. patch-1512-2.txt
        46 kB
        Himanshu Vashishtha
      4. AggregateCpProtocol.java
        4 kB
        Ted Yu
      5. AggregateProtocolImpl.java
        10 kB
        Ted Yu
      6. AggregationClient.java
        13 kB
        Ted Yu
      7. ColumnInterpreter.java
        2 kB
        Ted Yu
      8. patch-1512-3.txt
        60 kB
        Himanshu Vashishtha
      9. patch-1512-4.txt
        61 kB
        Himanshu Vashishtha
      10. patch-1512-5.txt
        61 kB
        Himanshu Vashishtha
      11. patch-1512-6.txt
        61 kB
        Himanshu Vashishtha
      12. patch-1512-7.txt
        61 kB
        Himanshu Vashishtha
      13. patch-1512-8.txt
        71 kB
        Himanshu Vashishtha
      14. patch-1512-9.txt
        61 kB
        Himanshu Vashishtha
      15. addendum_1512.txt
        2 kB
        Himanshu Vashishtha

        Issue Links

          Activity

          Hide
          Andrew Purtell added a comment -

          Current patch on HBASE-2001 implements a map reduce framework in the regionserver that can calculate aggregates. A client would kick off a map reduce process distributed over each region in the table and then do an in memory merge of the results. This is roughly what is proposed in OP but uses a different server side framework than filters.

          Show
          Andrew Purtell added a comment - Current patch on HBASE-2001 implements a map reduce framework in the regionserver that can calculate aggregates. A client would kick off a map reduce process distributed over each region in the table and then do an in memory merge of the results. This is roughly what is proposed in OP but uses a different server side framework than filters.
          Hide
          Himanshu Vashishtha added a comment -

          Hello Andrew,
          Can you please tell where should I start digging and changing stuff to make it work. I have seen your code under the patch 2001, and would like to work on coprocessors, starting from this jira.
          Will fetching codebase from github (as mentioned there) will give me the right code base...seems things have changed since then. What is the right way to do this.
          Thanks,
          Himanshu

          Show
          Himanshu Vashishtha added a comment - Hello Andrew, Can you please tell where should I start digging and changing stuff to make it work. I have seen your code under the patch 2001, and would like to work on coprocessors, starting from this jira. Will fetching codebase from github (as mentioned there) will give me the right code base...seems things have changed since then. What is the right way to do this. Thanks, Himanshu
          Hide
          Mingjie Lai added a comment -

          Hi Himanshu.

          Right now I'm doing some cleanups for coprocessor. Here is the code: http://github.com/mlai/hbase. Please use the branch – coprocessors_mlai.

          However our current objective is to utilize CP to implement role based access control(RBAC) toward 0.90. We only need Coprocessor, RegionObservor, CommandType interfaces for this purpose. So I didn't include the MapReduce and FilterInterface in the branch (neither for 0.90 I think).

          You can take a look at that branch. It can pass all HBase test cases, but we still need to improve it a little for exception handling.

          If you have interests for Mapreduce implementation, you can also refer the first patch of HBASE-2001.

          Thanks,
          Mingjie

          Show
          Mingjie Lai added a comment - Hi Himanshu. Right now I'm doing some cleanups for coprocessor. Here is the code: http://github.com/mlai/hbase . Please use the branch – coprocessors_mlai. However our current objective is to utilize CP to implement role based access control(RBAC) toward 0.90. We only need Coprocessor, RegionObservor, CommandType interfaces for this purpose. So I didn't include the MapReduce and FilterInterface in the branch (neither for 0.90 I think). You can take a look at that branch. It can pass all HBase test cases, but we still need to improve it a little for exception handling. If you have interests for Mapreduce implementation, you can also refer the first patch of HBASE-2001 . Thanks, Mingjie
          Hide
          Himanshu Vashishtha added a comment -

          agg functions poc

          Show
          Himanshu Vashishtha added a comment - agg functions poc
          Hide
          Himanshu Vashishtha added a comment -

          With the 2001 patch, the basic infrastructure required by these functions is available. I wrote a test class to cover some of these, but am confused about their degree of 'generic'-ness.

          Here, I assumed the user is aware of the table in context and the return types he is getting from the Coprocessor impls, and so the input/output types of these agg operations will also be the same. Therefore he builds agg function classes with those 'types'. I think it is kind of skewed assumption and seeks further clarification. What are the expectations from the 'end interface'?

          I have attached the new/modified classes (2/1).
          a) ProcessResultsFromCP: to be implemented by the agg functions (can be part of the Batch class).
          b) TestAggFunctions: has the test case using the agg functions
          c) HTable: one method to execute the aggregation functions.

          There is high probability that I have twisted the desired feature entirely, so please feel free to 'lambaste' the code and its underlying assumptions.

          PS: I was thinking to make this jira a sub item for jira 2469, but couldn't come up with some thing worth mentioning.

          Show
          Himanshu Vashishtha added a comment - With the 2001 patch, the basic infrastructure required by these functions is available. I wrote a test class to cover some of these, but am confused about their degree of 'generic'-ness. Here, I assumed the user is aware of the table in context and the return types he is getting from the Coprocessor impls, and so the input/output types of these agg operations will also be the same. Therefore he builds agg function classes with those 'types'. I think it is kind of skewed assumption and seeks further clarification. What are the expectations from the 'end interface'? I have attached the new/modified classes (2/1). a) ProcessResultsFromCP: to be implemented by the agg functions (can be part of the Batch class). b) TestAggFunctions: has the test case using the agg functions c) HTable: one method to execute the aggregation functions. There is high probability that I have twisted the desired feature entirely, so please feel free to 'lambaste' the code and its underlying assumptions. PS: I was thinking to make this jira a sub item for jira 2469, but couldn't come up with some thing worth mentioning.
          Hide
          Mingjie Lai added a comment -

          Himanshu.

          The patch looks good. But it doesn't provide the whole picture of the solution. There are still some important questions unanswered for this feature:

          1) what's the interface provided to end users? HTableInterface.sum(...), HTableInterface.min/max()? Do we need shell support?

          2) how to implement the interface? (by utilizing coprocessor)

          3) how to make sure the coprocessor loaded properly if the feature is available.

          You patch addresses part of (2). And it only provides max() and countRow() implementation.

          IMO I don't think ProcessResultsFromCP is necessary. It doesn't really provide any convenience for developers to reduce development effort.

          Thanks.

          Show
          Mingjie Lai added a comment - Himanshu. The patch looks good. But it doesn't provide the whole picture of the solution. There are still some important questions unanswered for this feature: 1) what's the interface provided to end users? HTableInterface.sum(...), HTableInterface.min/max()? Do we need shell support? 2) how to implement the interface? (by utilizing coprocessor) 3) how to make sure the coprocessor loaded properly if the feature is available. You patch addresses part of (2). And it only provides max() and countRow() implementation. IMO I don't think ProcessResultsFromCP is necessary. It doesn't really provide any convenience for developers to reduce development effort. Thanks.
          Hide
          Gary Helmling added a comment -

          Thanks for the patch Himanshu!

          For the scope of the functionality and what sort of aggregation functions you might cover, you might want to start with a comparison of common SQL functions (ex. http://dev.mysql.com/doc/refman/5.5/en/group-by-functions.html). I don't know if you really need to implement all of them, but a good start would probably be:

          • COUNT
          • AVG
          • MIN
          • MAX
          • STD
          • SUM

          (just my opinion of course). All of these would need some form of server side function, and in some cases the client/server coordination might be a little tricky when you have to span regions.

          The client side interface for these also has it's own needs. Does it make sense to be able to combine different client side aggregation functions with unmatched server side functions? Would you want to take a client side minimum of the per-region maximum values returned from the row range? As far as I can see, you would mostly want a single client function paired with a given server-side method.

          I do see that the "raw" HTable.coprocessorExec() interface is a bit clumsy for these types of operations. You really want to be able to return a single value, not a value per region. But I think you can create some client helper methods to hide that complexity.

          For the current client classes ProcessResultsFromCP seems to have a lot of overlap with Batch.Callback. The main difference being that HTable.processResultsFromCP() allows you to pass a list of instances (as opposed to a single Batch.Callback). If using a single Callback instance is limiting, we could allow use of a list of Callbacks, or provide a Batch.callbackList() factory method that allows chaining multiple instances together. But for the common cases here, it seems like you'll want a single client side function (min, max, etc) paired with a single server-side invocation (min, max, etc.), so the current Batch.Callback would probably suffice.

          So as an example on the client side, you could provide a client wrapper in the form:

          {{{
          public class Aggregations {
          private static class ClientSum implements Batch.Callback<Long> {
          private long sum;
          public void update(byte[] region, byte[] row, Long value)

          { sum += value; }

          public long getValue()

          { return sum; }

          }

          public static long sum(HTable table, byte[] start, byte[] end, byte[] family, byte[] col) {
          ClientSum sum = new ClientSum();
          table.coprocessorExec(AggFunctionProtocol.class, start, end,
          new Batch.Call<AggFunctionProtocol,Long>() {
          public Long call(AggFunctionProtocol instance)

          { return instance.sum(family, col); }

          }, sum);
          return sum.getValue();
          }
          }}}

          And so on for the other types of operations... Then clients can just call Aggregations.sum() with the right args.

          There may be better ways to do it, this is just an illustration.

          And, please, if you see ways that HTable.coprocessorExec() can be improved to make this easier, comment on HBASE-2002!

          Show
          Gary Helmling added a comment - Thanks for the patch Himanshu! For the scope of the functionality and what sort of aggregation functions you might cover, you might want to start with a comparison of common SQL functions (ex. http://dev.mysql.com/doc/refman/5.5/en/group-by-functions.html ). I don't know if you really need to implement all of them, but a good start would probably be: COUNT AVG MIN MAX STD SUM (just my opinion of course). All of these would need some form of server side function, and in some cases the client/server coordination might be a little tricky when you have to span regions. The client side interface for these also has it's own needs. Does it make sense to be able to combine different client side aggregation functions with unmatched server side functions? Would you want to take a client side minimum of the per-region maximum values returned from the row range? As far as I can see, you would mostly want a single client function paired with a given server-side method. I do see that the "raw" HTable.coprocessorExec() interface is a bit clumsy for these types of operations. You really want to be able to return a single value, not a value per region. But I think you can create some client helper methods to hide that complexity. For the current client classes ProcessResultsFromCP seems to have a lot of overlap with Batch.Callback. The main difference being that HTable.processResultsFromCP() allows you to pass a list of instances (as opposed to a single Batch.Callback). If using a single Callback instance is limiting, we could allow use of a list of Callbacks, or provide a Batch.callbackList() factory method that allows chaining multiple instances together. But for the common cases here, it seems like you'll want a single client side function (min, max, etc) paired with a single server-side invocation (min, max, etc.), so the current Batch.Callback would probably suffice. So as an example on the client side, you could provide a client wrapper in the form: {{{ public class Aggregations { private static class ClientSum implements Batch.Callback<Long> { private long sum; public void update(byte[] region, byte[] row, Long value) { sum += value; } public long getValue() { return sum; } } public static long sum(HTable table, byte[] start, byte[] end, byte[] family, byte[] col) { ClientSum sum = new ClientSum(); table.coprocessorExec(AggFunctionProtocol.class, start, end, new Batch.Call<AggFunctionProtocol,Long>() { public Long call(AggFunctionProtocol instance) { return instance.sum(family, col); } }, sum); return sum.getValue(); } }}} And so on for the other types of operations... Then clients can just call Aggregations.sum() with the right args. There may be better ways to do it, this is just an illustration. And, please, if you see ways that HTable.coprocessorExec() can be improved to make this easier, comment on HBASE-2002 !
          Hide
          Himanshu Vashishtha added a comment -

          a patch for initial agg functions. It has the functionalities for max, min, sum, avg, rowcount, std. Please suggest further improvements.

          I am looking for Top K & group by like queries. Gary suggested a scanner 'like' functionality for Coprocessor for such queries to reduce ipc, which seems very relevant.
          There are few question here: Is it like the proposed scanner should run on a precomputed resultset and its purpose is just to keep ipc in control by sending a fixed number of rows (cache limit set by client)
          OR
          it should process a fixed number of raw rows from the table (in the default access order) and send its result on the fly (processing means executing the coprocessor code)
          The current scanner functionality that a client uses (RegionScanner) registers itself at the region server level and keep its state there. Calling next() from the client (HTable) results invoking next() on the registered scanner. So, it uses the second option as it is navigating in the table as such.

          There are some common links between coprocessors and current scanner implementations like: with coprocessor, one can intercept the result after every call to next (preScannerNext postScannerNext) and a coprocessor impl can massage the data accordingly. But this is not the purpose of ROs, as it will break th abstraction of RO's like they will be invoked in every client call in that case (inputs from Gary on irc today: just mentioning here for reference).

          Still the above Q holds good: whether cp-scanner should navigate through a computed result set or through raw table rows & invoke the CP impl (essentially an EndPoint impl) there by.
          It can be use case specific. It needs more thought.

          Show
          Himanshu Vashishtha added a comment - a patch for initial agg functions. It has the functionalities for max, min, sum, avg, rowcount, std. Please suggest further improvements. I am looking for Top K & group by like queries. Gary suggested a scanner 'like' functionality for Coprocessor for such queries to reduce ipc, which seems very relevant. There are few question here: Is it like the proposed scanner should run on a precomputed resultset and its purpose is just to keep ipc in control by sending a fixed number of rows (cache limit set by client) OR it should process a fixed number of raw rows from the table (in the default access order) and send its result on the fly (processing means executing the coprocessor code) The current scanner functionality that a client uses (RegionScanner) registers itself at the region server level and keep its state there. Calling next() from the client (HTable) results invoking next() on the registered scanner. So, it uses the second option as it is navigating in the table as such. There are some common links between coprocessors and current scanner implementations like: with coprocessor, one can intercept the result after every call to next (preScannerNext postScannerNext) and a coprocessor impl can massage the data accordingly. But this is not the purpose of ROs, as it will break th abstraction of RO's like they will be invoked in every client call in that case (inputs from Gary on irc today: just mentioning here for reference). Still the above Q holds good: whether cp-scanner should navigate through a computed result set or through raw table rows & invoke the CP impl (essentially an EndPoint impl) there by. It can be use case specific. It needs more thought.
          Hide
          coofucoo zhang added a comment -

          Hi guys, Could you tell me when do you plan to add this feature in Hbase? I think it is a very good feature. We need it very much. Thank you.

          Show
          coofucoo zhang added a comment - Hi guys, Could you tell me when do you plan to add this feature in Hbase? I think it is a very good feature. We need it very much. Thank you.
          Hide
          Lianhui Wang added a comment -

          Great,this feature can resolve the real-time aggregate functions? now we use the redis to do the real-time aggregate in the memory,but when data increased,the memory cannot load the data.so we meet the bottleneck. i think this feature can give us some hope, or others have any suggestions? thank you.

          Show
          Lianhui Wang added a comment - Great,this feature can resolve the real-time aggregate functions? now we use the redis to do the real-time aggregate in the memory,but when data increased,the memory cannot load the data.so we meet the bottleneck. i think this feature can give us some hope, or others have any suggestions? thank you.
          Hide
          stack added a comment -

          FYI, lines should be 80 characters or less.

          What you want here Himanshu?

          +    // sleep here is an ugly hack to allow region transitions to finish
          +    Thread.sleep(5000);
          +    for (JVMClusterUtil.RegionServerThread t : cluster.getRegionServerThreads()) {
          +      for (HRegionInfo r : t.getRegionServer().getOnlineRegions()) {
          +        t.getRegionServer().getOnlineRegion(r.getRegionName()).
          +          getCoprocessorHost().
          +          load(AggregateProtocolImpl.class, //TestAggFunctions.AggFunctionsCT.class,
          +              Coprocessor.Priority.USER);
          +        
          +      }
          +    }
          

          I think there is probably better means of waiting on HRS startup than a timer (FYI, a delay will always fail up on the Apache build server – it has a special knack for doing the unexpected).

          Our convention is spaces around operators. Not...

          +    for(int i=0;i<n;i++) {
          

          ... but

          +    for(int i = 0; i < n; i++) {
          

          See the rest of the code base for examples. These are not biggies. I'm just pointing them out.

          This CP is very cool.

          Would suggest you add more examples to your test. You just test one instance of each aggregate. Shove in some edge cases if you can think of them (Debugging a unit tests is a million times better than trying to debug it out on live cluster).

          Do you think you should be a big more defensive here when scanning? You assume a long:

          +        temp = Bytes.toLong(val.getValue());
          

          Do you think you should check the cell length? If > or < long length, then something is off and WARN?

          This log message is going to drive folks crazy:

          +        log.debug("val read in the region is: "+temp);
          

          In any region of any decent size, there'll almost be a log line per cell?

          The below should be inside a finally:

          +      scanner.close();
          

          Just throw I'd say, don't bother logging:

          +      log.error("Some error occurred. Aborting the computation"+ ie.getMessage());
          +      throw new IOException("Aborting the Max aggregate computation");
          

          Be careful w/ your formatting:

          +    }while(done);
          +    scanner.close();
          +    }catch (IOException ie){ 
          

          Try to be consistent.

          Here is another example:

          +   /**
          +    * For a given column family and column Qualifier for a table, it gives its sum of all its values at the region level.
          +    */
          +  
          +  @Override
          

          Why a Long object instead of just a long primitive type?

          +    Long sum = 0l;
          

          This is messy here formatting-wise:

          +      KeyValue val = results.get(0);
          +      if(val != null) counter++; // TODO: Or shall it only caters to the row, and ignore whether a specific column is null or not. 
          +                                    // Or is it like a val can't be null. Need to look in to all possible values of keyval!
          +    }while(done);
          +    }finally{
          +      scanner.close();  
          +    }
          
          

          Do you think we'll always be acting on a column only? What if someone wants to act on a whole column family; i.e. no qualifier?

          Be careful w/ your white space. For instance in the interface you have this:

          +  List<LongWritable> getAvg(byte[] colFamily, byte[] colQualifier) throws IOException;
          +  List<LongWritable> getStd(byte[] colFamily, byte[] colQualifier) throws IOException;
          +  
          +  
          + 
          +}
          

          Clear those empty lines.

          You should put the javadoc in the Interface on the Interface methods, rather than out in the implementations. Thats how its usually done. The implementations inherit the interface javadoc.

          Fill out the javadoc in your client, in particular description of the return in each case.

          So is stuff being averaged, and summed in the client? Hows that done?

          Patch looks cool.

          Show
          stack added a comment - FYI, lines should be 80 characters or less. What you want here Himanshu? + // sleep here is an ugly hack to allow region transitions to finish + Thread .sleep(5000); + for (JVMClusterUtil.RegionServerThread t : cluster.getRegionServerThreads()) { + for (HRegionInfo r : t.getRegionServer().getOnlineRegions()) { + t.getRegionServer().getOnlineRegion(r.getRegionName()). + getCoprocessorHost(). + load(AggregateProtocolImpl.class, //TestAggFunctions.AggFunctionsCT.class, + Coprocessor.Priority.USER); + + } + } I think there is probably better means of waiting on HRS startup than a timer (FYI, a delay will always fail up on the Apache build server – it has a special knack for doing the unexpected). Our convention is spaces around operators. Not... + for ( int i=0;i<n;i++) { ... but + for ( int i = 0; i < n; i++) { See the rest of the code base for examples. These are not biggies. I'm just pointing them out. This CP is very cool. Would suggest you add more examples to your test. You just test one instance of each aggregate. Shove in some edge cases if you can think of them (Debugging a unit tests is a million times better than trying to debug it out on live cluster). Do you think you should be a big more defensive here when scanning? You assume a long: + temp = Bytes.toLong(val.getValue()); Do you think you should check the cell length? If > or < long length, then something is off and WARN? This log message is going to drive folks crazy: + log.debug( "val read in the region is: " +temp); In any region of any decent size, there'll almost be a log line per cell? The below should be inside a finally: + scanner.close(); Just throw I'd say, don't bother logging: + log.error( "Some error occurred. Aborting the computation" + ie.getMessage()); + throw new IOException( "Aborting the Max aggregate computation" ); Be careful w/ your formatting: + } while (done); + scanner.close(); + } catch (IOException ie){ Try to be consistent. Here is another example: + /** + * For a given column family and column Qualifier for a table, it gives its sum of all its values at the region level. + */ + + @Override Why a Long object instead of just a long primitive type? + Long sum = 0l; This is messy here formatting-wise: + KeyValue val = results.get(0); + if (val != null ) counter++; // TODO: Or shall it only caters to the row, and ignore whether a specific column is null or not. + // Or is it like a val can't be null . Need to look in to all possible values of keyval! + } while (done); + } finally { + scanner.close(); + } Do you think we'll always be acting on a column only? What if someone wants to act on a whole column family; i.e. no qualifier? Be careful w/ your white space. For instance in the interface you have this: + List<LongWritable> getAvg( byte [] colFamily, byte [] colQualifier) throws IOException; + List<LongWritable> getStd( byte [] colFamily, byte [] colQualifier) throws IOException; + + + +} Clear those empty lines. You should put the javadoc in the Interface on the Interface methods, rather than out in the implementations. Thats how its usually done. The implementations inherit the interface javadoc. Fill out the javadoc in your client, in particular description of the return in each case. So is stuff being averaged, and summed in the client? Hows that done? Patch looks cool.
          Hide
          Himanshu Vashishtha added a comment -

          Stack: Thanks for the review.

          I have revamped the patch and also incorporated your suggestions. There were bunch of discrepancies regarding the boundary conditions you mentioned in the previous version, where at the region level there was no knowledge of the exact start/stop rows as given by the user. To achieve this, I modified the agg functions signatures to include start/stop rows at the region level.

          Following are some key aspects for this version:
          a) startEow < endRow is an essential condition now (other than when one is doing a full table scan, where startRow and endRow both are empty byte array). This helps in handling boundary conditions where the start row provided by the user is start row of a region (the default scanner impl returns null because it is a non-get query). Moreover, it is also aligned with the logic of these functions, where one is finding max, min, row count etc.

          b) For all computations like avg, sum, max etc, it is assumed the cell value is a long value (8 bytes); if this is not the case, that cell value is skipped from the computation

          c) For all functions, column family is essential (if it is null, an ioe is returned).
          For max, min, avg, sum,std, when no column qualifier is provided, I aggregate all the values in that family. So, a sum for such a case is group sum of all CQ's values for one row key. I think it is a right approach. Please advice here.

          d) Now in case of rowcount, one can use FirstKeyValueFilter for optimisation. But it may give wrong result in case user has also provided a column qualifier. In such a case, the first value returned by the scanner might belong to other qualifier, but the FirstKeyValueFilter will set its flag to skip to next row, but that value is filtered out from the result set. Its overall effect is that row is not counted and scanner moves to the next row. I used this only when there is no column qualifier. ( I confirmed this during my testing, but will be good to have some comments here).

          d) As suggested, I have added bunch of boundary test cases for each of the six agg functions. Please let me know in case some more are to be added.

          e) Yes, its the client (here AggregationtClient), that will perform the "reduce phase", where individual results from all the target regions are received and accumulated.

          Show
          Himanshu Vashishtha added a comment - Stack: Thanks for the review. I have revamped the patch and also incorporated your suggestions. There were bunch of discrepancies regarding the boundary conditions you mentioned in the previous version, where at the region level there was no knowledge of the exact start/stop rows as given by the user. To achieve this, I modified the agg functions signatures to include start/stop rows at the region level. Following are some key aspects for this version: a) startEow < endRow is an essential condition now (other than when one is doing a full table scan, where startRow and endRow both are empty byte array). This helps in handling boundary conditions where the start row provided by the user is start row of a region (the default scanner impl returns null because it is a non-get query). Moreover, it is also aligned with the logic of these functions, where one is finding max, min, row count etc. b) For all computations like avg, sum, max etc, it is assumed the cell value is a long value (8 bytes); if this is not the case, that cell value is skipped from the computation c) For all functions, column family is essential (if it is null, an ioe is returned). For max, min, avg, sum,std, when no column qualifier is provided, I aggregate all the values in that family. So, a sum for such a case is group sum of all CQ's values for one row key. I think it is a right approach. Please advice here. d) Now in case of rowcount, one can use FirstKeyValueFilter for optimisation. But it may give wrong result in case user has also provided a column qualifier. In such a case, the first value returned by the scanner might belong to other qualifier, but the FirstKeyValueFilter will set its flag to skip to next row, but that value is filtered out from the result set. Its overall effect is that row is not counted and scanner moves to the next row. I used this only when there is no column qualifier. ( I confirmed this during my testing, but will be good to have some comments here). d) As suggested, I have added bunch of boundary test cases for each of the six agg functions. Please let me know in case some more are to be added. e) Yes, its the client (here AggregationtClient), that will perform the "reduce phase", where individual results from all the target regions are received and accumulated.
          Hide
          Himanshu Vashishtha added a comment -

          revised patch

          Show
          Himanshu Vashishtha added a comment - revised patch
          Hide
          Ted Yu added a comment -

          This feature is very useful.
          Is it possible to pass some class to AggregateProtocolImpl which can interpret the type of value based on colFamily:colQualifier ?

          I tried adding type parameter (for type of value) to AggregateCpProtocol but encountered various compilation errors.

          Show
          Ted Yu added a comment - This feature is very useful. Is it possible to pass some class to AggregateProtocolImpl which can interpret the type of value based on colFamily:colQualifier ? I tried adding type parameter (for type of value) to AggregateCpProtocol but encountered various compilation errors.
          Hide
          Ted Yu added a comment -

          I think AggregationClient should have a ctor which accepts Configuration and saves it.
          Then Configuration can be used to point to a table in remote cluster:

              HTable table = new HTable(conf, tableName);
          
          Show
          Ted Yu added a comment - I think AggregationClient should have a ctor which accepts Configuration and saves it. Then Configuration can be used to point to a table in remote cluster: HTable table = new HTable(conf, tableName);
          Hide
          Himanshu Vashishtha added a comment -

          Thanks for reviewing it Ted.

          I will add the constructor.

          yes, I was thinking about this dependency of having a long variable for all these methods. But flexibility of using any data type (by converting it to byte array) for even a specific column family: column qualifier makes it a bit tricky to go for a data type argument. I can have varying number of data types even for one CF:CQ combination. Rather I was considering the option to have one additional check for int type (4 bytes). But that is just me, will be great what others say on it.

          For adding the type parameter to the AggregateCpProtocol methods, there will be dependency with AggregationClient. Did you try adding it there too (apart from its impl).

          Show
          Himanshu Vashishtha added a comment - Thanks for reviewing it Ted. I will add the constructor. yes, I was thinking about this dependency of having a long variable for all these methods. But flexibility of using any data type (by converting it to byte array) for even a specific column family: column qualifier makes it a bit tricky to go for a data type argument. I can have varying number of data types even for one CF:CQ combination. Rather I was considering the option to have one additional check for int type (4 bytes). But that is just me, will be great what others say on it. For adding the type parameter to the AggregateCpProtocol methods, there will be dependency with AggregationClient. Did you try adding it there too (apart from its impl).
          Hide
          Ted Yu added a comment -

          A 4 byte value can represent float. 8 byte value can represent double.

          As for the return type, Long, I tried to make AggregateCpProtocol generic but wasn't successful.
          e.g. AggregateCpProtocol<Long>.class wouldn't compile. Since AggregateCpProtocol is interface, I cannot instantiate and obtain class afterward.

          Show
          Ted Yu added a comment - A 4 byte value can represent float. 8 byte value can represent double. As for the return type, Long, I tried to make AggregateCpProtocol generic but wasn't successful. e.g. AggregateCpProtocol<Long>.class wouldn't compile. Since AggregateCpProtocol is interface, I cannot instantiate and obtain class afterward.
          Hide
          Ted Yu added a comment -

          AggregationClient with ctor accepting Configuration.

          Show
          Ted Yu added a comment - AggregationClient with ctor accepting Configuration.
          Hide
          Ted Yu added a comment -

          Reattaching.
          My attempt is to allow client to pass column interpreter to region server.
          Still more work to be done to make the return type of column interpreter generic.
          Also, column interpreter doesn't have access to table name currently. This may be minor issue because user can enforce uniform naming convention to map column name to actual type.
          TestAggFunctions passes.

          Show
          Ted Yu added a comment - Reattaching. My attempt is to allow client to pass column interpreter to region server. Still more work to be done to make the return type of column interpreter generic. Also, column interpreter doesn't have access to table name currently. This may be minor issue because user can enforce uniform naming convention to map column name to actual type. TestAggFunctions passes.
          Hide
          Ted Yu added a comment -

          Pardon me for attaching source files directly. svn diff doesn't recognize the changes I made on top of patch-1512-2.txt

          Show
          Ted Yu added a comment - Pardon me for attaching source files directly. svn diff doesn't recognize the changes I made on top of patch-1512-2.txt
          Hide
          Ted Yu added a comment -

          Table name can be embedded in LongColumnInterpreter.
          This version of LongColumnInterpreter is simpler.

          Show
          Ted Yu added a comment - Table name can be embedded in LongColumnInterpreter. This version of LongColumnInterpreter is simpler.
          Hide
          Ted Yu added a comment -

          Attaching generic implementation.

          Show
          Ted Yu added a comment - Attaching generic implementation.
          Show
          Ted Yu added a comment - See my thoughts on http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html
          Hide
          Ted Yu added a comment -

          In AggregateProtocolImpl, I think the boolean done should be renamed. It actually indicates whether more rows exist after the current one.
          The following loop condition may confuse someone:

                } while (done);
          
          Show
          Ted Yu added a comment - In AggregateProtocolImpl, I think the boolean done should be renamed. It actually indicates whether more rows exist after the current one. The following loop condition may confuse someone: } while (done);
          Hide
          Himanshu Vashishtha added a comment -

          Thanks for the suggestions Ted.

          a) Added generics functionality to the AggregationClient. As suggested by Ted, there should be a ColumnInterpreter thing to give the client a chance to describe the cell value type. I made this thing generic, in the sense that now client is supposed to give the column interpreter object along with the agg function calls. AggregationClient has such a implementation where client says that its cell value is a long. Other cell values can be used with a similar approach.

          b) While client can define the cell value type by implementing ColumnInterpreter,I still think the average and Standard deviation will be a double value. So, I added a wrapper on these methods to support the generic functionality. Please refer to AggreagationClient.getStdParams & getAvgParams. Let me know if it is "un-intuitive". I think it is right though

          c) Added a filter to each of the agg functions. They are just passed along with the call, and are stuffed in the Scan object at the region level during scanning. In case of row count, if client provides a filter, that one will be used. If neither a filter nor a qualifier is provided, FirstKeyValueFilter is used.

          d) Added more test cases for testing filter use cases (44 in total ).

          e) refactored the "done" variable as suggested by Ted.

          Show
          Himanshu Vashishtha added a comment - Thanks for the suggestions Ted. a) Added generics functionality to the AggregationClient. As suggested by Ted, there should be a ColumnInterpreter thing to give the client a chance to describe the cell value type. I made this thing generic, in the sense that now client is supposed to give the column interpreter object along with the agg function calls. AggregationClient has such a implementation where client says that its cell value is a long. Other cell values can be used with a similar approach. b) While client can define the cell value type by implementing ColumnInterpreter,I still think the average and Standard deviation will be a double value. So, I added a wrapper on these methods to support the generic functionality. Please refer to AggreagationClient.getStdParams & getAvgParams. Let me know if it is "un-intuitive". I think it is right though c) Added a filter to each of the agg functions. They are just passed along with the call, and are stuffed in the Scan object at the region level during scanning. In case of row count, if client provides a filter, that one will be used. If neither a filter nor a qualifier is provided, FirstKeyValueFilter is used. d) Added more test cases for testing filter use cases (44 in total ). e) refactored the "done" variable as suggested by Ted.
          Hide
          Ted Yu added a comment -

          For LongColumnInterpreter.divide(), if l2 is null, I think we should return Double.NaN
          I would write:

                if (l2 == null)
                  return Double.NaN;
                if (l1 == null)
                  return 0;
          

          I think the following method can be named getAvgArgs (argument in place of parameter):

            private <R> List<R> getAvgParams(final byte[] tableName,
          

          But I don't have strong opinion here.

          getAvgParamsAsArray() of AvgCallBack can be named getAvgParams() because its return type is List<>.

          Overall, version 3 is great.

          Show
          Ted Yu added a comment - For LongColumnInterpreter.divide(), if l2 is null, I think we should return Double.NaN I would write: if (l2 == null ) return Double .NaN; if (l1 == null ) return 0; I think the following method can be named getAvgArgs (argument in place of parameter): private <R> List<R> getAvgParams( final byte [] tableName, But I don't have strong opinion here. getAvgParamsAsArray() of AvgCallBack can be named getAvgParams() because its return type is List<>. Overall, version 3 is great.
          Hide
          Ted Yu added a comment -

          Also, I think it is time to move LongColumnInterpreter out into its own file under src/main/java/org/apache/hadoop/hbase/client/coprocessor/.

          Show
          Ted Yu added a comment - Also, I think it is time to move LongColumnInterpreter out into its own file under src/main/java/org/apache/hadoop/hbase/client/coprocessor/.
          Hide
          Himanshu Vashishtha added a comment -

          Thanks for review Ted.

          What I think in the divide method is returning Double.NaN if either of operand is null. Any operation on null should give null.

          Ok for the name refactoring.

          I don't have any strong feeling for making a separate class out if it at this point of time, as it doesn't add much on its own. But will do it if you say so.

          Show
          Himanshu Vashishtha added a comment - Thanks for review Ted. What I think in the divide method is returning Double.NaN if either of operand is null. Any operation on null should give null. Ok for the name refactoring. I don't have any strong feeling for making a separate class out if it at this point of time, as it doesn't add much on its own. But will do it if you say so.
          Hide
          Ted Yu added a comment -

          I think returning Double.NaN is fine. Normally either of operand being null would lead to NPE.
          For making a separate class, it would be easier for users to produce other ColumnInterpreter classes based on LongColumnInterpreter.

          Show
          Ted Yu added a comment - I think returning Double.NaN is fine. Normally either of operand being null would lead to NPE. For making a separate class, it would be easier for users to produce other ColumnInterpreter classes based on LongColumnInterpreter.
          Hide
          Ted Yu added a comment -

          Version 4 is awesome.

          Show
          Ted Yu added a comment - Version 4 is awesome.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          Review request for hbase and Gary Helmling.

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs


          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review429
          -----------------------------------------------------------

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java
          <https://reviews.apache.org/r/585/#comment817>

          While working on a different jira, I saw that I am using wrong (old) key for registering the CP. It was working because in the code that follows this, Agg CP is loaded explicitly (line #102-106).
          One can update this either using the Region CP specific key: CoprocessorHost.REGION_COPROCESSOR_CONF_KEY, and remove the explicit loading below (and remove the explicit loading code below), OR entirely delete this statement.

          • himanshu

          On 2011-04-12 02:28:10, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-12 02:28:10)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review429 ----------------------------------------------------------- /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java < https://reviews.apache.org/r/585/#comment817 > While working on a different jira, I saw that I am using wrong (old) key for registering the CP. It was working because in the code that follows this, Agg CP is loaded explicitly (line #102-106). One can update this either using the Region CP specific key: CoprocessorHost.REGION_COPROCESSOR_CONF_KEY, and remove the explicit loading below (and remove the explicit loading code below), OR entirely delete this statement. himanshu On 2011-04-12 02:28:10, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-12 02:28:10) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          (Updated 2011-04-12 04:28:32.453024)

          Review request for hbase and Gary Helmling.

          Changes
          -------

          Switched to CoprocessorHost.REGION_COPROCESSOR_CONF_KEY and removed the manual loading of CPs.

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-12 04:28:32.453024) Review request for hbase and Gary Helmling. Changes ------- Switched to CoprocessorHost.REGION_COPROCESSOR_CONF_KEY and removed the manual loading of CPs. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          (Updated 2011-04-12 04:41:49.068986)

          Review request for hbase and Gary Helmling.

          Changes
          -------

          Switched to CoprocessorHost.REGION_COPROCESSOR_CONF_KEY and removed the manual loading of CPs.

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-12 04:41:49.068986) Review request for hbase and Gary Helmling. Changes ------- Switched to CoprocessorHost.REGION_COPROCESSOR_CONF_KEY and removed the manual loading of CPs. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review436
          -----------------------------------------------------------

          I read half the patch. Will finish in morning. Comments below. This utility looks really great. Hurry up and finish it!

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment828>

          Its 2011!

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment829>

          There is xtra white space here and elsewhere in this block.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment830>

          should be 'handler'

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment831>

          Do you want to make this actual javadoc link; e.g.

          {@link Aggr....}

          Is AggrationClient misspelled?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment832>

          Is this comment still right? Says 8 byte long (Ted's blog seems to indicate this is not longer the case)

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment833>

          Nice javadoc.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment834>

          Why this constructor? We'll have a null conf? Will that be dangerous later? NPEs?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment835>

          White space

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment836>

          Looks like this comment is no longer true? The method has been genericized?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment837>

          Should you reuse the passed configuration else you are making a new COnnection per invocation.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment838>

          Whats this? The return?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment839>

          Reuse passed conf?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment840>

          Whats this? Xtra white space.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment841>

          Reuse conf creating HTable.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment842>

          Whats this? This prob. is in all subsequent methods... the xtra white space too.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment843>

          This needs to be passed the conf.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment844>

          2011

          • Michael

          On 2011-04-12 04:41:49, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-12 04:41:49)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review436 ----------------------------------------------------------- I read half the patch. Will finish in morning. Comments below. This utility looks really great. Hurry up and finish it! /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment828 > Its 2011! /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment829 > There is xtra white space here and elsewhere in this block. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment830 > should be 'handler' /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment831 > Do you want to make this actual javadoc link; e.g. {@link Aggr....} Is AggrationClient misspelled? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment832 > Is this comment still right? Says 8 byte long (Ted's blog seems to indicate this is not longer the case) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment833 > Nice javadoc. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment834 > Why this constructor? We'll have a null conf? Will that be dangerous later? NPEs? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment835 > White space /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment836 > Looks like this comment is no longer true? The method has been genericized? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment837 > Should you reuse the passed configuration else you are making a new COnnection per invocation. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment838 > Whats this? The return? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment839 > Reuse passed conf? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment840 > Whats this? Xtra white space. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment841 > Reuse conf creating HTable. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment842 > Whats this? This prob. is in all subsequent methods... the xtra white space too. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment843 > This needs to be passed the conf. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment844 > 2011 Michael On 2011-04-12 04:41:49, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-12 04:41:49) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review438
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment855>

          ill effects of copy-paste. will change.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment854>

          Removed it.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment852>

          I use Eclipse formatter (which says it is using Apache's standard, and it is inserting these spaces. I tried to edit the setting to make it work, but couldn't find the way for these extra spaces between doc and arg list. I removed them manually, but want to know the standard approach.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment853>

          Yes, will do it. Thanks.

          • himanshu

          On 2011-04-12 04:41:49, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-12 04:41:49)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review438 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment855 > ill effects of copy-paste. will change. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment854 > Removed it. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment852 > I use Eclipse formatter (which says it is using Apache's standard, and it is inserting these spaces. I tried to edit the setting to make it work, but couldn't find the way for these extra spaces between doc and arg list. I removed them manually, but want to know the standard approach. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment853 > Yes, will do it. Thanks. himanshu On 2011-04-12 04:41:49, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-12 04:41:49) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          Himanshu Vashishtha added a comment -

          Stack reviewed it half on rb (https://reviews.apache.org/r/585/). Incorporated his suggestions and uploading the new patch here, as rb request is initiated by Ted (and I don't think I can upload this under the same rb request).
          Ted, please post this version on rb for further reviews.

          Thanks.

          Show
          Himanshu Vashishtha added a comment - Stack reviewed it half on rb ( https://reviews.apache.org/r/585/ ). Incorporated his suggestions and uploading the new patch here, as rb request is initiated by Ted (and I don't think I can upload this under the same rb request). Ted, please post this version on rb for further reviews. Thanks.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review440
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment863>

          This is the type parameter for return value.

          • Ted

          On 2011-04-12 04:41:49, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-12 04:41:49)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review440 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment863 > This is the type parameter for return value. Ted On 2011-04-12 04:41:49, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-12 04:41:49) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14.182698)

          Review request for hbase and Gary Helmling.

          Changes
          -------

          Himanshu updated the patch according to Stack's suggestions.

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14.182698) Review request for hbase and Gary Helmling. Changes ------- Himanshu updated the patch according to Stack's suggestions. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-13 08:35:42, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 143

          > <https://reviews.apache.org/r/585/diff/3/?file=15640#file15640line143>

          >

          > This is the type parameter for return value.

          ok

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review440
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-13 08:35:42, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 143 > < https://reviews.apache.org/r/585/diff/3/?file=15640#file15640line143 > > > This is the type parameter for return value. ok Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review440 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-13 06:23:56, himanshu vashishtha wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 81

          > <https://reviews.apache.org/r/585/diff/3/?file=15640#file15640line81>

          >

          > I use Eclipse formatter (which says it is using Apache's standard, and it is inserting these spaces. I tried to edit the setting to make it work, but couldn't find the way for these extra spaces between doc and arg list. I removed them manually, but want to know the standard approach.

          None of this kinda of white space is tolerated (well, I'm not too bad about it but others are watching the codebase and will complain loudly if they see these)

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review438
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-13 06:23:56, himanshu vashishtha wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 81 > < https://reviews.apache.org/r/585/diff/3/?file=15640#file15640line81 > > > I use Eclipse formatter (which says it is using Apache's standard, and it is inserting these spaces. I tried to edit the setting to make it work, but couldn't find the way for these extra spaces between doc and arg list. I removed them manually, but want to know the standard approach. None of this kinda of white space is tolerated (well, I'm not too bad about it but others are watching the codebase and will complain loudly if they see these) Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review438 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review466
          -----------------------------------------------------------

          This is review of diff between v3 and v4.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment894>

          Good

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment895>

          Yeah, method doesn't take a <T>, it returns it

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java
          <https://reviews.apache.org/r/585/#comment896>

          This is good.

          • Michael

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review466 ----------------------------------------------------------- This is review of diff between v3 and v4. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment894 > Good /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment895 > Yeah, method doesn't take a <T>, it returns it /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java < https://reviews.apache.org/r/585/#comment896 > This is good. Michael On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review467
          -----------------------------------------------------------

          Here is more. Submitting now in case I lose it.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment897>

          You have double the class comment here.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment898>

          Use Bytes.SIZEOF_LONG instead of '8'

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment899>

          You could autobox and just return Bytes.toLong... I tihnk that'll work (not important). If you are going to use Long, you might use Long.value of because JVM can cache often used Long instances: http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Long.html#valueOf(long)

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment900>

          Is there one instance of this class per thread? Will it be accessed concurrently? Should the base Long be an AtomicLong? Or we need to synchronize updates on the KV?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment901>

          I'd name this multiply rather than 'mult'

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment902>

          You don't need to call through to the super for it to serialize the Writable? And again for the write?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment903>

          Say 'Defines' rather than 'It defines'

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment904>

          Good

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment905>

          Again here say 'Gives' rather than 'It gives..'

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment906>

          Good doc.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment907>

          Whats a region level? Try and have your max doc similar to your min doc with min/max the only diff. Makes it easier on the reader.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment909>

          Why capital 'Q' on qualifier?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment910>

          Return is <T>?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment908>

          Why this empty line?

          • Michael

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review467 ----------------------------------------------------------- Here is more. Submitting now in case I lose it. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment897 > You have double the class comment here. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment898 > Use Bytes.SIZEOF_LONG instead of '8' /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment899 > You could autobox and just return Bytes.toLong... I tihnk that'll work (not important). If you are going to use Long, you might use Long.value of because JVM can cache often used Long instances: http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Long.html#valueOf(long ) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment900 > Is there one instance of this class per thread? Will it be accessed concurrently? Should the base Long be an AtomicLong? Or we need to synchronize updates on the KV? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment901 > I'd name this multiply rather than 'mult' /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment902 > You don't need to call through to the super for it to serialize the Writable? And again for the write? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment903 > Say 'Defines' rather than 'It defines' /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment904 > Good /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment905 > Again here say 'Gives' rather than 'It gives..' /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment906 > Good doc. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment907 > Whats a region level? Try and have your max doc similar to your min doc with min/max the only diff. Makes it easier on the reader. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment909 > Why capital 'Q' on qualifier? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment910 > Return is <T>? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment908 > Why this empty line? Michael On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review469
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment914>

          sure

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment913>

          As per the current usage, it is one instance per thread. This method is called from the concrete coprocessor implementation deployed at region level. Though this instance is a singleton, but its a stateless, hence threadsafe.
          I can change it to AtomicLong if you say so.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment919>

          As this class implements Writable, it is handled by HBaseObjectWritable such that it writes its full class name onto the stream (and goes while reading it at server side). Since this is a stateless, I don't have anything to read write as such.
          No need to call super.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment922>

          Ok. Will do all the formatting changes.

          • himanshu

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review469 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment914 > sure /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment913 > As per the current usage, it is one instance per thread. This method is called from the concrete coprocessor implementation deployed at region level. Though this instance is a singleton, but its a stateless, hence threadsafe. I can change it to AtomicLong if you say so. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment919 > As this class implements Writable, it is handled by HBaseObjectWritable such that it writes its full class name onto the stream (and goes while reading it at server side). Since this is a stateless, I don't have anything to read write as such. No need to call super. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment922 > Ok. Will do all the formatting changes. himanshu On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 06:10:48, himanshu vashishtha wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 66

          > <https://reviews.apache.org/r/585/diff/4/?file=15695#file15695line66>

          >

          > As per the current usage, it is one instance per thread. This method is called from the concrete coprocessor implementation deployed at region level. Though this instance is a singleton, but its a stateless, hence threadsafe.

          > I can change it to AtomicLong if you say so.

          just to clarify, I meant the CP instance is a singleton (pardon my not so good English).

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review469
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 06:10:48, himanshu vashishtha wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 66 > < https://reviews.apache.org/r/585/diff/4/?file=15695#file15695line66 > > > As per the current usage, it is one instance per thread. This method is called from the concrete coprocessor implementation deployed at region level. Though this instance is a singleton, but its a stateless, hence threadsafe. > I can change it to AtomicLong if you say so. just to clarify, I meant the CP instance is a singleton (pardon my not so good English). himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review469 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review468
          -----------------------------------------------------------

          A few comments in the below. See what you think. This is close to commit I'd say.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment916>

          I'd say change the name of this class to AggregateProtocol. Leave off the "Cp' since its in the package name already.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment911>

          'Gives' rather than 'It gives'. Are you repeating yourself i the javadoc here?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment912>

          Good

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment917>

          Call this class AggregateImplementation? It'll implement AggregateProtocol.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment918>

          Class comment explaining what this class does?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment923>

          Why this? When we just made an empty one? And whats the '//' on end of the line.

          Oh, you did this each time through loop.... so you only work on one return at a time....

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment924>

          FYI there is an 'equals' in Bytes so you don't have to do compareTo...0

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment929>

          hash code is what? Can you print out encodedName? Thats better for identifying regions.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment930>

          Its nice that this all genericized.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment939>

          This three part test is used in all methods? Might be big enough to move out to a method (Not important)

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment940>

          Missing period.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment941>

          Missing period.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment942>

          Should you say ColumnInterpreter for AggregateProtocol?

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java
          <https://reviews.apache.org/r/585/#comment943>

          You should call it TestAggregateProtocol or TestAggregateCoprocessor... it should be name of class under test with a Test prefix.

          • Michael

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review468 ----------------------------------------------------------- A few comments in the below. See what you think. This is close to commit I'd say. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment916 > I'd say change the name of this class to AggregateProtocol. Leave off the "Cp' since its in the package name already. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment911 > 'Gives' rather than 'It gives'. Are you repeating yourself i the javadoc here? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment912 > Good /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment917 > Call this class AggregateImplementation? It'll implement AggregateProtocol. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment918 > Class comment explaining what this class does? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment923 > Why this? When we just made an empty one? And whats the '//' on end of the line. Oh, you did this each time through loop.... so you only work on one return at a time.... /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment924 > FYI there is an 'equals' in Bytes so you don't have to do compareTo...0 /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment929 > hash code is what? Can you print out encodedName? Thats better for identifying regions. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment930 > Its nice that this all genericized. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment939 > This three part test is used in all methods? Might be big enough to move out to a method (Not important) /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java < https://reviews.apache.org/r/585/#comment940 > Missing period. /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java < https://reviews.apache.org/r/585/#comment941 > Missing period. /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java < https://reviews.apache.org/r/585/#comment942 > Should you say ColumnInterpreter for AggregateProtocol? /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java < https://reviews.apache.org/r/585/#comment943 > You should call it TestAggregateProtocol or TestAggregateCoprocessor... it should be name of class under test with a Test prefix. Michael On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review470
          -----------------------------------------------------------

          Looks like this is coming along nicely. Some overall comments:

          • A fair amount of naming should be cleaned up. For the client facing methods in AggregationClient, I would go for the simplest names: max() instead of getMaximum(), min() instead of getMinimum(), etc. But that is a matter of personal preference.
          • Think about providing simpler overloaded versions of the methods. Seven parameters is a lot if you don't always care about all of them.
          • Look more closely at the parameterization of some of the methods. I'm not sure it's sufficient for getSum(), getAvg(), getStd(), where the return types may differ from the actual column value types.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment920>

          Don't abbreviate in the javadoc comments. This is part of the end user documentation so you need to spell it all out:

          agg -> aggregation
          RS -> region server
          cp impls -> name the actual coprocessor

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment921>

          agg -> aggregation

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment925>

          Remove trailing whitespace

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment931>

          Overload this with some briefer versions? This is a real mouthful if you don't actually need all 7 parameters.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment932>

          Again add some overloaded simpler versions of this. Do you always need a filter? What about column qualifier? Maybe in most cases you do, just seeing if simplify usage in some cases.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment927>

          What is a row num? Is this the number of rows? How about using "row count" instead? It's more consistent with other HBase tools.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment928>

          name getRowCount() instead?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment934>

          Remove trailing whitespace

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment935>

          Maybe a bit more description of the actual usage here. The client needs to pass an instance of this in AggregationClient methods right? Javadoc should make clear it's purpose and how to use it.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment936>

          Drop the "Cp" from the name, it's extraneous.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment937>

          Is it correct that sum should always return the same type as the individual values? If the values are Integers, you would want to return Long, right? Otherwise you risk overflowing the max value.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment938>

          Is the type correlation correct here as well? Individual values may be Integers, but you may want a double for the average.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment944>

          Same as with getAvg(), wouldn't you want this to possibly return a different type than the individual column values? Like return a double even if the column values are ints?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment945>

          I don't think you need the word "Protocol" in here.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment946>

          Can you just set start and end rows on the scanner instead of checking each row?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment948>

          If there are no results from the scanner would this return the value from ci.getMinValue()? Shouldn't it return null instead?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment947>

          If there are no results from the scanner would this return the value from ci.getMaxValue()?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment949>

          Does this need to be parameterized? Would we ever want the returned row count to not be a Long?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment950>

          Returning a two entry list here is a bit unclear. Why not make it a Pair<Long,Long> or a specifically typed object?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment951>

          Again, I would prefer a more clearly typed simple object here instead of a list with encoded meanings based on index.

          • Gary

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review470 ----------------------------------------------------------- Looks like this is coming along nicely. Some overall comments: A fair amount of naming should be cleaned up. For the client facing methods in AggregationClient, I would go for the simplest names: max() instead of getMaximum(), min() instead of getMinimum(), etc. But that is a matter of personal preference. Think about providing simpler overloaded versions of the methods. Seven parameters is a lot if you don't always care about all of them. Look more closely at the parameterization of some of the methods. I'm not sure it's sufficient for getSum(), getAvg(), getStd(), where the return types may differ from the actual column value types. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment920 > Don't abbreviate in the javadoc comments. This is part of the end user documentation so you need to spell it all out: agg -> aggregation RS -> region server cp impls -> name the actual coprocessor /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment921 > agg -> aggregation /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment925 > Remove trailing whitespace /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment931 > Overload this with some briefer versions? This is a real mouthful if you don't actually need all 7 parameters. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment932 > Again add some overloaded simpler versions of this. Do you always need a filter? What about column qualifier? Maybe in most cases you do, just seeing if simplify usage in some cases. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment927 > What is a row num? Is this the number of rows? How about using "row count" instead? It's more consistent with other HBase tools. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment928 > name getRowCount() instead? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment934 > Remove trailing whitespace /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment935 > Maybe a bit more description of the actual usage here. The client needs to pass an instance of this in AggregationClient methods right? Javadoc should make clear it's purpose and how to use it. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment936 > Drop the "Cp" from the name, it's extraneous. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment937 > Is it correct that sum should always return the same type as the individual values? If the values are Integers, you would want to return Long, right? Otherwise you risk overflowing the max value. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment938 > Is the type correlation correct here as well? Individual values may be Integers, but you may want a double for the average. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment944 > Same as with getAvg(), wouldn't you want this to possibly return a different type than the individual column values? Like return a double even if the column values are ints? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment945 > I don't think you need the word "Protocol" in here. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment946 > Can you just set start and end rows on the scanner instead of checking each row? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment948 > If there are no results from the scanner would this return the value from ci.getMinValue()? Shouldn't it return null instead? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment947 > If there are no results from the scanner would this return the value from ci.getMaxValue()? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment949 > Does this need to be parameterized? Would we ever want the returned row count to not be a Long? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment950 > Returning a two entry list here is a bit unclear. Why not make it a Pair<Long,Long> or a specifically typed object? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment951 > Again, I would prefer a more clearly typed simple object here instead of a list with encoded meanings based on index. Gary On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review474
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment958>

          I think returning ci.getMinValue() is fine because AggregationClient would further consolidate partial results.
          If change is really needed, it should be made in AggregationClient.

          • Ted

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review474 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment958 > I think returning ci.getMinValue() is fine because AggregationClient would further consolidate partial results. If change is really needed, it should be made in AggregationClient. Ted On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 11:50:08, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java, line 73

          > <https://reviews.apache.org/r/585/diff/4/?file=15697#file15697line73>

          >

          > I think returning ci.getMinValue() is fine because AggregationClient would further consolidate partial results.

          > If change is really needed, it should be made in AggregationClient.

          I don't agree. This leaves no way to distinguish between a valid result of Long.MIN_VALUE and no result. What happens for an empty table? I think returning Long.MIN_VALUE (or whatever might be the case) for an empty table is broken.

          • Gary

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review474
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 11:50:08, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java, line 73 > < https://reviews.apache.org/r/585/diff/4/?file=15697#file15697line73 > > > I think returning ci.getMinValue() is fine because AggregationClient would further consolidate partial results. > If change is really needed, it should be made in AggregationClient. I don't agree. This leaves no way to distinguish between a valid result of Long.MIN_VALUE and no result. What happens for an empty table? I think returning Long.MIN_VALUE (or whatever might be the case) for an empty table is broken. Gary ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review474 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review476
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment960>

          w.r.t. Gary's comment, we need another boolean flag in MaxCallBack so that we can distinguish whether MaxCallBack.update() has been called.
          Currently ci.getMinValue() would be returned if there is no qualifying row (possibly due to the effect of Filter).

          • Ted

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review476 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment960 > w.r.t. Gary's comment, we need another boolean flag in MaxCallBack so that we can distinguish whether MaxCallBack.update() has been called. Currently ci.getMinValue() would be returned if there is no qualifying row (possibly due to the effect of Filter). Ted On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 18:02:26, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 98

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line98>

          >

          > w.r.t. Gary's comment, we need another boolean flag in MaxCallBack so that we can distinguish whether MaxCallBack.update() has been called.

          > Currently ci.getMinValue() would be returned if there is no qualifying row (possibly due to the effect of Filter).

          MaxCallBack.update() will still be called once per region, even if no rows matched. It will just return the initial value that was set. This is why I think the initial value should be null. So when update() is called with null, it can be handled appropriately.

          In the same vein, I think "max" in MaxCallBack should initialize to null as well.

          • Gary

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review476
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 18:02:26, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 98 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line98 > > > w.r.t. Gary's comment, we need another boolean flag in MaxCallBack so that we can distinguish whether MaxCallBack.update() has been called. > Currently ci.getMinValue() would be returned if there is no qualifying row (possibly due to the effect of Filter). MaxCallBack.update() will still be called once per region, even if no rows matched. It will just return the initial value that was set. This is why I think the initial value should be null. So when update() is called with null, it can be handled appropriately. In the same vein, I think "max" in MaxCallBack should initialize to null as well. Gary ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review476 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review478
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment962>

          The following code would produce NPE:
          Long l = null;
          if (l < Long.MIN_VALUE) {

          • Ted

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review478 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment962 > The following code would produce NPE: Long l = null; if (l < Long.MIN_VALUE) { Ted On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 18:18:57, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 98

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line98>

          >

          > The following code would produce NPE:

          > Long l = null;

          > if (l < Long.MIN_VALUE) {

          >

          Yes, all this code needs to handle nulls. I think that goes without saying.

          • Gary

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review478
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 18:18:57, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 98 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line98 > > > The following code would produce NPE: > Long l = null; > if (l < Long.MIN_VALUE) { > Yes, all this code needs to handle nulls. I think that goes without saying. Gary ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review478 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review481
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment965>

          Of these, I think only two are optional. colQualifier & filer. OK?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment966>

          Agreed it should be initialized null to handle null resultset.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment967>

          same as the max one above. Yes, in case of a null qualifier, it computes the value for the overall family

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment968>

          ok

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment969>

          ok

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment970>

          Yes, more description should be added.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment971>

          hmm. the return type can be different. I will make it more generic to have a different return type.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment972>

          I thought about it. I return a list but I see its not a right one to pass as one element contains the sum and other contains the rowCount. So, it should be like a Pair as you said. I will look into it.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment973>

          its in the Interface? Shall it be repeated here too?
          Ok, will do the name refactoring.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment974>

          ok, will use the equals method.
          I thought since it is an internal scanner (local to a region), it should not cross out the boundaries while setting start-end rows. Will check it (should also improve performance).

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment975>

          right. a null is more pertinent here. will do it.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment976>

          yes, the current one does return min value. But as you said, it will return null now.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment977>

          I thought about it and then just left it as its only three line of code and a separate call will be kind of over-refactoring. But once I set the start-end row as suggested by Gary, it should become more light.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment978>

          yes indeed. It occurred to me while I saw Stack's review last night and here you are . I will do it.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment979>

          ok. And what if I need to send more than 2 parameters as in case of Standard deviation?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment980>

          you mean a pojo with these many attributes. Is there exists such an object that i can reuse (should be rpc compatible--> like implementing writable).

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java
          <https://reviews.apache.org/r/585/#comment981>

          So yes, will do all the name/space/grammar refactorings as suggested.

          Thanks a lot to all of you folks for this wonderful review.

          • himanshu

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review481 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment965 > Of these, I think only two are optional. colQualifier & filer. OK? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment966 > Agreed it should be initialized null to handle null resultset. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment967 > same as the max one above. Yes, in case of a null qualifier, it computes the value for the overall family /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment968 > ok /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment969 > ok /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment970 > Yes, more description should be added. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment971 > hmm. the return type can be different. I will make it more generic to have a different return type. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment972 > I thought about it. I return a list but I see its not a right one to pass as one element contains the sum and other contains the rowCount. So, it should be like a Pair as you said. I will look into it. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment973 > its in the Interface? Shall it be repeated here too? Ok, will do the name refactoring. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment974 > ok, will use the equals method. I thought since it is an internal scanner (local to a region), it should not cross out the boundaries while setting start-end rows. Will check it (should also improve performance). /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment975 > right. a null is more pertinent here. will do it. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment976 > yes, the current one does return min value. But as you said, it will return null now. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment977 > I thought about it and then just left it as its only three line of code and a separate call will be kind of over-refactoring. But once I set the start-end row as suggested by Gary, it should become more light. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment978 > yes indeed. It occurred to me while I saw Stack's review last night and here you are . I will do it. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment979 > ok. And what if I need to send more than 2 parameters as in case of Standard deviation? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment980 > you mean a pojo with these many attributes. Is there exists such an object that i can reuse (should be rpc compatible--> like implementing writable). /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java < https://reviews.apache.org/r/585/#comment981 > So yes, will do all the name/space/grammar refactorings as suggested. Thanks a lot to all of you folks for this wonderful review. himanshu On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review483
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment984>

          I think the startKey and endKey can be optional as well.
          Basically that means scanning the whole region.

          • Ted

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review483 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment984 > I think the startKey and endKey can be optional as well. Basically that means scanning the whole region. Ted On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 19:06:58, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > I think the startKey and endKey can be optional as well.

          > Basically that means scanning the whole region.

          These start-end keys are used to locate the interested regions. Do you mean whole table? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays.

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review483
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 19:06:58, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > I think the startKey and endKey can be optional as well. > Basically that means scanning the whole region. These start-end keys are used to locate the interested regions. Do you mean whole table ? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays. himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review483 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 19:06:58, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > I think the startKey and endKey can be optional as well.

          > Basically that means scanning the whole region.

          himanshu vashishtha wrote:

          These start-end keys are used to locate the interested regions. Do you mean whole table? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays.

          This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter?

          It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think?

          • Gary

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review483
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 19:06:58, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > I think the startKey and endKey can be optional as well. > Basically that means scanning the whole region. himanshu vashishtha wrote: These start-end keys are used to locate the interested regions. Do you mean whole table ? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays. This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter? It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think? Gary ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review483 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 19:06:58, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > I think the startKey and endKey can be optional as well.

          > Basically that means scanning the whole region.

          himanshu vashishtha wrote:

          These start-end keys are used to locate the interested regions. Do you mean whole table? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays.

          Gary Helmling wrote:

          This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter?

          It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think?

          Yes, am wondering why it didn't occur to me before! As a matter of fact, we are creating a Scan object at region level. So, with passing the Scan object to the Aggregation client, it will call the appropriate HTable method (the existing one), but the CP's method will take the Scan object as a parameter, and let the client have its liberty. But it needs some code changes, like in validation stuff for one.
          (I was thinking that it was good to go and now there is so much room for improvement. Good stuff).

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review483
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 19:06:58, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > I think the startKey and endKey can be optional as well. > Basically that means scanning the whole region. himanshu vashishtha wrote: These start-end keys are used to locate the interested regions. Do you mean whole table ? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays. Gary Helmling wrote: This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter? It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think? Yes, am wondering why it didn't occur to me before! As a matter of fact, we are creating a Scan object at region level. So, with passing the Scan object to the Aggregation client, it will call the appropriate HTable method (the existing one), but the CP's method will take the Scan object as a parameter, and let the client have its liberty. But it needs some code changes, like in validation stuff for one. (I was thinking that it was good to go and now there is so much room for improvement. Good stuff). himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review483 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 19:06:58, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > I think the startKey and endKey can be optional as well.

          > Basically that means scanning the whole region.

          himanshu vashishtha wrote:

          These start-end keys are used to locate the interested regions. Do you mean whole table? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays.

          Gary Helmling wrote:

          This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter?

          It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think?

          himanshu vashishtha wrote:

          Yes, am wondering why it didn't occur to me before! As a matter of fact, we are creating a Scan object at region level. So, with passing the Scan object to the Aggregation client, it will call the appropriate HTable method (the existing one), but the CP's method will take the Scan object as a parameter, and let the client have its liberty. But it needs some code changes, like in validation stuff for one.

          (I was thinking that it was good to go and now there is so much room for improvement. Good stuff).

          In continuation of what I earlier said, in the current design we assume that client is interested in one family only. Shall this needs to be change too.
          I am refactoring these methods to let the client pass a Scan object to the AggregationClient class, but a scan object as such can have multi families in it. Shall we need to change this assumption. I don't see any issue with it as such, but this is something I didn't plan originally and it needs change in test cases. Please comment.

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review483
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 19:06:58, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > I think the startKey and endKey can be optional as well. > Basically that means scanning the whole region. himanshu vashishtha wrote: These start-end keys are used to locate the interested regions. Do you mean whole table ? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays. Gary Helmling wrote: This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter? It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think? himanshu vashishtha wrote: Yes, am wondering why it didn't occur to me before! As a matter of fact, we are creating a Scan object at region level. So, with passing the Scan object to the Aggregation client, it will call the appropriate HTable method (the existing one), but the CP's method will take the Scan object as a parameter, and let the client have its liberty. But it needs some code changes, like in validation stuff for one. (I was thinking that it was good to go and now there is so much room for improvement. Good stuff). In continuation of what I earlier said, in the current design we assume that client is interested in one family only. Shall this needs to be change too. I am refactoring these methods to let the client pass a Scan object to the AggregationClient class, but a scan object as such can have multi families in it. Shall we need to change this assumption. I don't see any issue with it as such, but this is something I didn't plan originally and it needs change in test cases. Please comment. himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review483 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review488
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment989>

          This is the first code review that evolves into a design session in my career - exciting.
          I think we should relax the initial assumption.

          • Ted

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review488 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment989 > This is the first code review that evolves into a design session in my career - exciting. I think we should relax the initial assumption. Ted On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 20:21:01, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > This is the first code review that evolves into a design session in my career - exciting.

          > I think we should relax the initial assumption.

          I still think that I would go with one family, as the families are quite separate entities as such(HTable design wise), and I don't see any usage of doing aggregates on accumulated column families. If that is what is needed probably suggests some schema design rethinking. The point I raised was that the object we are now riding upon supports multiple families (which is very relevant for scanning a table), but we don't need it as per real usage. So, shall we support or not, this is the point of consideration.
          Moreover, as the requirements are evolving (and I guess they will continue to do so), it might change again. I am happy as long as it is moving in the right direction.

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review488
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 20:21:01, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > This is the first code review that evolves into a design session in my career - exciting. > I think we should relax the initial assumption. I still think that I would go with one family, as the families are quite separate entities as such(HTable design wise), and I don't see any usage of doing aggregates on accumulated column families. If that is what is needed probably suggests some schema design rethinking. The point I raised was that the object we are now riding upon supports multiple families (which is very relevant for scanning a table), but we don't need it as per real usage. So, shall we support or not, this is the point of consideration. Moreover, as the requirements are evolving (and I guess they will continue to do so), it might change again. I am happy as long as it is moving in the right direction. himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review488 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review490
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment993>

          One family is fine for the moment.

          • Ted

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review490 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment993 > One family is fine for the moment. Ted On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 19:06:58, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > I think the startKey and endKey can be optional as well.

          > Basically that means scanning the whole region.

          himanshu vashishtha wrote:

          These start-end keys are used to locate the interested regions. Do you mean whole table? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays.

          Gary Helmling wrote:

          This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter?

          It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think?

          himanshu vashishtha wrote:

          Yes, am wondering why it didn't occur to me before! As a matter of fact, we are creating a Scan object at region level. So, with passing the Scan object to the Aggregation client, it will call the appropriate HTable method (the existing one), but the CP's method will take the Scan object as a parameter, and let the client have its liberty. But it needs some code changes, like in validation stuff for one.

          (I was thinking that it was good to go and now there is so much room for improvement. Good stuff).

          himanshu vashishtha wrote:

          In continuation of what I earlier said, in the current design we assume that client is interested in one family only. Shall this needs to be change too.

          I am refactoring these methods to let the client pass a Scan object to the AggregationClient class, but a scan object as such can have multi families in it. Shall we need to change this assumption. I don't see any issue with it as such, but this is something I didn't plan originally and it needs change in test cases. Please comment.

          I refactored a agg method 1512 as per today's review (using scan object plus others) and its working fine (test passes for the method that i change). May be I need to add more boundary conditions to test the scan object.
          I have some stuff for tonight/tomorrow, so will complete this by tomorrow night or by Sunday. I hope that should be ok

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review483
          -----------------------------------------------------------

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 19:06:58, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > I think the startKey and endKey can be optional as well. > Basically that means scanning the whole region. himanshu vashishtha wrote: These start-end keys are used to locate the interested regions. Do you mean whole table ? If so, it will be like setting HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays. Gary Helmling wrote: This would be a bigger change, but maybe it would make sense to have the client pass a Scan object? Then you could specify start/end row, time range, multiple column qualifiers, filter? It's starting to look like we're duplicating most of these arguments when there's already a good way of passing them. What do you think? himanshu vashishtha wrote: Yes, am wondering why it didn't occur to me before! As a matter of fact, we are creating a Scan object at region level. So, with passing the Scan object to the Aggregation client, it will call the appropriate HTable method (the existing one), but the CP's method will take the Scan object as a parameter, and let the client have its liberty. But it needs some code changes, like in validation stuff for one. (I was thinking that it was good to go and now there is so much room for improvement. Good stuff). himanshu vashishtha wrote: In continuation of what I earlier said, in the current design we assume that client is interested in one family only. Shall this needs to be change too. I am refactoring these methods to let the client pass a Scan object to the AggregationClient class, but a scan object as such can have multi families in it. Shall we need to change this assumption. I don't see any issue with it as such, but this is something I didn't plan originally and it needs change in test cases. Please comment. I refactored a agg method 1512 as per today's review (using scan object plus others) and its working fine (test passes for the method that i change). May be I need to add more boundary conditions to test the scan object. I have some stuff for tonight/tomorrow, so will complete this by tomorrow night or by Sunday. I hope that should be ok himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review483 ----------------------------------------------------------- On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review493
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1002>

          done

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1003>

          done

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1004>

          done

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1005>

          done

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1006>

          ok, using only Bytes.toLong now.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1007>

          done

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1008>

          done

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1009>

          removed the repeatition in the doc.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1010>

          Column interpreter is more genericised now. It supports HBase cell data type and its promoted data type. For doing these computations, we will use this promoted data type. So, in case a cell value is int, we will be using a long type while computing sum to handle overflow. In case of finding max and min, we will still use the cell data type.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1011>

          coprocessor implementation returns a over all sum and row count, so no need to use a double/float in the return type. It is used at the Client side (AggregationClient).

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1012>

          done

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1013>

          added class java doc.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1014>

          refactored it

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1015>

          setting start/end rows does it. So, no need of checking now.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1016>

          done

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1017>

          now, current version returns a Pair<List<S>, Long> where the list contains the sum and sum of squares and Long is the row count. I can have a more specific object, but it seems it has to be added in the rpc stack (implementing Writable). Please comment if that is the right way.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1001>

          You mean to rename the class or just the javadoc (sorry I missed this review statement initially)

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1018>

          This class is more genericised now. It defines two parameters type <T,Sbq. , where T is the cell value type and S is the promoted data type. S is used for doing arithmetic computations, T is used for finding min, max operation.

          • himanshu

          On 2011-04-13 08:37:14, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-13 08:37:14)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review493 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1002 > done /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1003 > done /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1004 > done /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1005 > done /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1006 > ok, using only Bytes.toLong now. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1007 > done /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1008 > done /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1009 > removed the repeatition in the doc. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1010 > Column interpreter is more genericised now. It supports HBase cell data type and its promoted data type. For doing these computations, we will use this promoted data type. So, in case a cell value is int, we will be using a long type while computing sum to handle overflow. In case of finding max and min, we will still use the cell data type. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1011 > coprocessor implementation returns a over all sum and row count, so no need to use a double/float in the return type. It is used at the Client side (AggregationClient). /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1012 > done /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1013 > added class java doc. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1014 > refactored it /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1015 > setting start/end rows does it. So, no need of checking now. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1016 > done /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1017 > now, current version returns a Pair<List<S>, Long> where the list contains the sum and sum of squares and Long is the row count. I can have a more specific object, but it seems it has to be added in the rpc stack (implementing Writable). Please comment if that is the right way. /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1001 > You mean to rename the class or just the javadoc (sorry I missed this review statement initially) /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1018 > This class is more genericised now. It defines two parameters type <T,Sbq. , where T is the cell value type and S is the promoted data type. S is used for doing arithmetic computations, T is used for finding min, max operation. himanshu On 2011-04-13 08:37:14, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-13 08:37:14) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          Himanshu Vashishtha added a comment -

          Changes done as per review by Gary and Stack.

          Show
          Himanshu Vashishtha added a comment - Changes done as per review by Gary and Stack.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          (Updated 2011-04-18 17:14:30.424344)

          Review request for hbase and Gary Helmling.

          Changes
          -------

          Latest update from Himanshu.

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-18 17:14:30.424344) Review request for hbase and Gary Helmling. Changes ------- Latest update from Himanshu. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-15 21:16:05, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84

          > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>

          >

          > One family is fine for the moment.

          FYI, more families can easily be added as client is passing the Scan object as such. It boils down to fetching the value of a cell from the columndescripter's getValue which takes family, qualifier and kv value; And then grouping it with other family values, but again I believe it is not the purpose of this jira ==> aggregating over multiple families. Please correct me if I am wrong.

          • himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review490
          -----------------------------------------------------------

          On 2011-04-18 17:14:30, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-18 17:14:30)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-15 21:16:05, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java, line 84 > < https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84 > > > One family is fine for the moment. FYI, more families can easily be added as client is passing the Scan object as such. It boils down to fetching the value of a cell from the columndescripter's getValue which takes family, qualifier and kv value; And then grouping it with other family values, but again I believe it is not the purpose of this jira ==> aggregating over multiple families. Please correct me if I am wrong. himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review490 ----------------------------------------------------------- On 2011-04-18 17:14:30, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-18 17:14:30) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          Himanshu Vashishtha added a comment -

          removed an unnecessary import statement from LongColumnInterpreter. Thanks for pointing this Ted.

          Show
          Himanshu Vashishtha added a comment - removed an unnecessary import statement from LongColumnInterpreter. Thanks for pointing this Ted.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37.773505)

          Review request for hbase and Gary Helmling.

          Changes
          -------

          import statement of TestAggregateProtocol is removed in LongColumnInterpreter

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37.773505) Review request for hbase and Gary Helmling. Changes ------- import statement of TestAggregateProtocol is removed in LongColumnInterpreter Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review535
          -----------------------------------------------------------

          I think its almost there. This patch won't compile (see below for why). I'd be game for applying the next version. This patch has come on a long way. Lets make new issues after applying it for issues found in it (This patch does include a nice set of tests).

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1116>

          I agree w/ the review that suggested we spell out 'agg' rather than use the abbreviation, especially in javadoc.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1117>

          if should be 'where'. Should we throw an exception if multiple families supplied so users are not surprised when they don't get answers for multiple families?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1118>

          I'd say leave implementation details out of the public javadoc (the bit about calling private methods)

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1119>

          Does Scan do this test? Internally? (I'm not sure)

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1120>

          'should' or 'does'? I think you want to say the latter?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1121>

          Why this javadoc? Don't we inherit javadoc from the Interface?

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1122>

          Whats this? We do nothing on serialization? Is that right? It could be. It just strikes me as a little odd. Maybe put a comment in here to say 'nothing to serialize'?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1123>

          Do we agree that AggregateCpProtocol was not a good name, that rather it should be AggregateProtocol since cp is in the package name?

          I see you have a AP later in this patch. Let me go look at it.

          I think I see whats going on... you didn't mean to include this in the patch?

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1124>

          Otherwise, this Interface looks good.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
          <https://reviews.apache.org/r/585/#comment1125>

          Yeah, this class shouldn't be included either.

          • Michael

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review535 ----------------------------------------------------------- I think its almost there. This patch won't compile (see below for why). I'd be game for applying the next version. This patch has come on a long way. Lets make new issues after applying it for issues found in it (This patch does include a nice set of tests). /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1116 > I agree w/ the review that suggested we spell out 'agg' rather than use the abbreviation, especially in javadoc. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1117 > if should be 'where'. Should we throw an exception if multiple families supplied so users are not surprised when they don't get answers for multiple families? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1118 > I'd say leave implementation details out of the public javadoc (the bit about calling private methods) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1119 > Does Scan do this test? Internally? (I'm not sure) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1120 > 'should' or 'does'? I think you want to say the latter? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1121 > Why this javadoc? Don't we inherit javadoc from the Interface? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1122 > Whats this? We do nothing on serialization? Is that right? It could be. It just strikes me as a little odd. Maybe put a comment in here to say 'nothing to serialize'? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1123 > Do we agree that AggregateCpProtocol was not a good name, that rather it should be AggregateProtocol since cp is in the package name? I see you have a AP later in this patch. Let me go look at it. I think I see whats going on... you didn't mean to include this in the patch? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1124 > Otherwise, this Interface looks good. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java < https://reviews.apache.org/r/585/#comment1125 > Yeah, this class shouldn't be included either. Michael On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review536
          -----------------------------------------------------------

          I think its almost there. This patch won't compile (see below for why). I'd be game for applying the next version. This patch has come on a long way. Lets make new issues after applying it for issues found in it (This patch does include a nice set of tests).

          • Michael

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review536 ----------------------------------------------------------- I think its almost there. This patch won't compile (see below for why). I'd be game for applying the next version. This patch has come on a long way. Lets make new issues after applying it for issues found in it (This patch does include a nice set of tests). Michael On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review537
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1126>

          In my version, I have:
          public interface ColumnInterpreter<T> extends Serializable {

          There is no need to extend Writable.

          • Ted

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review537 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1126 > In my version, I have: public interface ColumnInterpreter<T> extends Serializable { There is no need to extend Writable. Ted On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 18:31:17, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99

          > <https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99>

          >

          > In my version, I have:

          > public interface ColumnInterpreter<T> extends Serializable {

          >

          > There is no need to extend Writable.

          Ok. Then we should remove that from the patch (Good one Ted)

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review537
          -----------------------------------------------------------

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-25 18:31:17, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99 > < https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99 > > > In my version, I have: > public interface ColumnInterpreter<T> extends Serializable { > > There is no need to extend Writable. Ok. Then we should remove that from the patch (Good one Ted) Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review537 ----------------------------------------------------------- On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 18:31:17, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99

          > <https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99>

          >

          > In my version, I have:

          > public interface ColumnInterpreter<T> extends Serializable {

          >

          > There is no need to extend Writable.

          Michael Stack wrote:

          Ok. Then we should remove that from the patch (Good one Ted)

          Whoops. Sorry, did you say Serializeable Ted as in java.io.serializable? We don't want j.i.s. Thats what Writable replaces.

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review537
          -----------------------------------------------------------

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-25 18:31:17, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99 > < https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99 > > > In my version, I have: > public interface ColumnInterpreter<T> extends Serializable { > > There is no need to extend Writable. Michael Stack wrote: Ok. Then we should remove that from the patch (Good one Ted) Whoops. Sorry, did you say Serializeable Ted as in java.io.serializable? We don't want j.i.s. Thats what Writable replaces. Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review537 ----------------------------------------------------------- On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review540
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1129>

          Yes, but it seems Writable is the standard way to go in Hadoop for these RPCs. No big issue either way.
          Since it doesn't have any state, there is nothing to serialize here. It seems we can have make this as static util class .

          • Himanshu

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review540 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1129 > Yes, but it seems Writable is the standard way to go in Hadoop for these RPCs. No big issue either way. Since it doesn't have any state, there is nothing to serialize here. It seems we can have make this as static util class . Himanshu On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review538
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1127>

          I exchanged emails with Himanshu about supporting multiple column families.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1131>

          I prefer using Serializable for the interpreter which is stateless.

          It is supported by HbaseObjectWritable.

          • Ted

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review538 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1127 > I exchanged emails with Himanshu about supporting multiple column families. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1131 > I prefer using Serializable for the interpreter which is stateless. It is supported by HbaseObjectWritable. Ted On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review542
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1132>

          ok. now it throws an exception when > 1 families are defined.

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1133>

          removed the javadoc related to private method calls

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
          <https://reviews.apache.org/r/585/#comment1134>

          I didn't want it to propagate to the server just to return an exception. I thought that aggregate functions should work on distinct set of rows, ie, startRow < endRow should always be true (except when they are empty).
          There is a check in HTable-> getStartKeysInRange() that throws an exception when startRow > endRow, but I needed the boundary condition too.
          Please let me know if this condition we should remove this condition.

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
          <https://reviews.apache.org/r/585/#comment1135>

          Yes, its not there in later versions.

          • Himanshu

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review542 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1132 > ok. now it throws an exception when > 1 families are defined. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1133 > removed the javadoc related to private method calls /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java < https://reviews.apache.org/r/585/#comment1134 > I didn't want it to propagate to the server just to return an exception. I thought that aggregate functions should work on distinct set of rows, ie, startRow < endRow should always be true (except when they are empty). There is a check in HTable-> getStartKeysInRange() that throws an exception when startRow > endRow, but I needed the boundary condition too. Please let me know if this condition we should remove this condition. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java < https://reviews.apache.org/r/585/#comment1135 > Yes, its not there in later versions. Himanshu On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 18:58:37, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99

          > <https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99>

          >

          > I prefer using Serializable for the interpreter which is stateless.

          >

          > It is supported by HbaseObjectWritable.

          >

          My personal opinion is it will not go well with others if one uses Serializable. It is supported in HBaseObjectWritable only for legacy reasons I believe.

          • Himanshu

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review538
          -----------------------------------------------------------

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-25 18:58:37, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99 > < https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99 > > > I prefer using Serializable for the interpreter which is stateless. > > It is supported by HbaseObjectWritable. > My personal opinion is it will not go well with others if one uses Serializable. It is supported in HBaseObjectWritable only for legacy reasons I believe. Himanshu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review538 ----------------------------------------------------------- On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          Himanshu Vashishtha added a comment -

          Changes as per Stack's review.
          Major changes include:
          a) LongColumnInterpreter still implements Writable (though with empty read/write methods).
          b) Exception is thrown in case of more than one family is defined.

          Show
          Himanshu Vashishtha added a comment - Changes as per Stack's review. Major changes include: a) LongColumnInterpreter still implements Writable (though with empty read/write methods). b) Exception is thrown in case of more than one family is defined.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review546
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1140>

          There are 10 classes which implement Serializable under src/main/java/org/apache/hadoop/hbase/rest/model/

          And:
          public class Pair<T1, T2> implements Serializable
          ./src/main/java/org/apache/hadoop/hbase/util/Pair.java
          public static class CustomSerializable implements Serializable {
          ./src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java

          • Ted

          On 2011-04-23 16:39:37, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-23 16:39:37)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review546 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1140 > There are 10 classes which implement Serializable under src/main/java/org/apache/hadoop/hbase/rest/model/ And: public class Pair<T1, T2> implements Serializable ./src/main/java/org/apache/hadoop/hbase/util/Pair.java public static class CustomSerializable implements Serializable { ./src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java Ted On 2011-04-23 16:39:37, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-23 16:39:37) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 19:44:07, Ted Yu wrote:

          > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99

          > <https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99>

          >

          > There are 10 classes which implement Serializable under src/main/java/org/apache/hadoop/hbase/rest/model/

          >

          > And:

          > public class Pair<T1, T2> implements Serializable

          > ./src/main/java/org/apache/hadoop/hbase/util/Pair.java

          > public static class CustomSerializable implements Serializable {

          > ./src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java

          My guess is that for REST, its because of the REST engine we use. For Pair, we should probably change it to be Writable. Same for CustomSerializable.

          Otherwise, Ted, if you +1 Himanshu's patch, I'll commit it.

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review546
          -----------------------------------------------------------

          On 2011-04-25 19:53:33, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-25 19:53:33)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/CursorCallable.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/CursorCp.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/HTable.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-04-25 19:44:07, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java, line 99 > < https://reviews.apache.org/r/585/diff/3-6/?file=15641#file15641line99 > > > There are 10 classes which implement Serializable under src/main/java/org/apache/hadoop/hbase/rest/model/ > > And: > public class Pair<T1, T2> implements Serializable > ./src/main/java/org/apache/hadoop/hbase/util/Pair.java > public static class CustomSerializable implements Serializable { > ./src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java My guess is that for REST, its because of the REST engine we use. For Pair, we should probably change it to be Writable. Same for CustomSerializable. Otherwise, Ted, if you +1 Himanshu's patch, I'll commit it. Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review546 ----------------------------------------------------------- On 2011-04-25 19:53:33, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-25 19:53:33) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/CursorCallable.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/CursorCp.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/HTable.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/
          -----------------------------------------------------------

          (Updated 2011-04-25 19:53:33.129920)

          Review request for hbase and Gary Helmling.

          Changes
          -------

          Changes as per Stack's review.
          Major changes include:
          a) LongColumnInterpreter still implements Writable (though with empty read/write methods).
          b) Exception is thrown in case of more than one family is defined.

          Summary
          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.
          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.
          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.
          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/client/CursorCallable.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/CursorCp.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1088894
          /src/main/java/org/apache/hadoop/hbase/client/HTable.java 1088894
          /src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1088894
          /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1088894
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION
          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing
          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-25 19:53:33.129920) Review request for hbase and Gary Helmling. Changes ------- Changes as per Stack's review. Major changes include: a) LongColumnInterpreter still implements Writable (though with empty read/write methods). b) Exception is thrown in case of more than one family is defined. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/client/CursorCallable.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/CursorCp.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/HTable.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/585/#review548
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
          <https://reviews.apache.org/r/585/#comment1145>

          I am fine with using Writable.

          • Ted

          On 2011-04-25 19:53:33, Ted Yu wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/585/

          -----------------------------------------------------------

          (Updated 2011-04-25 19:53:33)

          Review request for hbase and Gary Helmling.

          Summary

          -------

          This patch provides reference implementation for aggregate function support through Coprocessor framework.

          ColumnInterpreter interface allows client to specify how the value's byte array is interpreted.

          Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html

          Himanshu Vashishtha started the work. I provided some review comments and some of the code.

          This addresses bug HBASE-1512.

          https://issues.apache.org/jira/browse/HBASE-1512

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/client/CursorCallable.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/CursorCp.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/HTable.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1088894

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION

          /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION

          /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION

          Diff: https://reviews.apache.org/r/585/diff

          Testing

          -------

          TestAggFunctions passes.

          Thanks,

          Ted

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review548 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java < https://reviews.apache.org/r/585/#comment1145 > I am fine with using Writable. Ted On 2011-04-25 19:53:33, Ted Yu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/ ----------------------------------------------------------- (Updated 2011-04-25 19:53:33) Review request for hbase and Gary Helmling. Summary ------- This patch provides reference implementation for aggregate function support through Coprocessor framework. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html Himanshu Vashishtha started the work. I provided some review comments and some of the code. This addresses bug HBASE-1512 . https://issues.apache.org/jira/browse/HBASE-1512 Diffs ----- /src/main/java/org/apache/hadoop/hbase/client/CursorCallable.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/CursorCp.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/HTable.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1088894 /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION Diff: https://reviews.apache.org/r/585/diff Testing ------- TestAggFunctions passes. Thanks, Ted
          Hide
          stack added a comment -

          @Himanshu I think you uploaded the wrong patch for v8. Its all about CursorCallable...

          Show
          stack added a comment - @Himanshu I think you uploaded the wrong patch for v8. Its all about CursorCallable...
          Hide
          Himanshu Vashishtha added a comment -

          My bad, uploaded a wrong patch. Sorry for the mishap Stack.

          Show
          Himanshu Vashishtha added a comment - My bad, uploaded a wrong patch. Sorry for the mishap Stack.
          Hide
          stack added a comment -

          Assigning Himanshu

          Show
          stack added a comment - Assigning Himanshu
          Hide
          stack added a comment -

          Committed to TRUNK. Thank you for the nice feature Himanshu. Nice counseling and reviewing Ted.

          Show
          stack added a comment - Committed to TRUNK. Thank you for the nice feature Himanshu. Nice counseling and reviewing Ted.
          Hide
          Himanshu Vashishtha added a comment -

          Its an addendum that incorporates Null handling of coprocessor results and AggregationClient validation. Build it on a clean main trunk and it seems to work good for TestAggregateProtocol, TestServerCustomProtocol and TestCoprocessEndpoint.

          Show
          Himanshu Vashishtha added a comment - Its an addendum that incorporates Null handling of coprocessor results and AggregationClient validation. Build it on a clean main trunk and it seems to work good for TestAggregateProtocol, TestServerCustomProtocol and TestCoprocessEndpoint.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #1888 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1888/)

          Show
          Hudson added a comment - Integrated in HBase-TRUNK #1888 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1888/ )

            People

            • Assignee:
              Himanshu Vashishtha
              Reporter:
              stack
            • Votes:
              4 Vote for this issue
              Watchers:
              19 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development