Thanks for your advice, James Taylor.
After reading the material you provided, I have a better understanding of this issue and have already drafted an initial plan.
However, there are still several points that I find confusing.
1. I plan to draft a proposal to implement window functions for Apache Phoenix where the window has the form [ PARTITION BY expression [, expression ]* ]. In other words, it adds support for the PARTITION BY keyword. I would like to know whether this workload is enough for a GSoC term.
2. Regarding issue PHOENIX-2700, I notice that one suitable solution is to implement a sliding window, which can improve performance by reducing unnecessary data translation. However, in my opinion, it only helps when a child query exists, especially when the child query is an OLAP query. For example, if we have a sample query such as:
PARTITION BY country_name) AS country_population,
PARTITION BY state_name) AS state_population,
PARTITION BY county_name ) AS county_population
In this case, I think the sliding window may not improve performance. The sliding window is not the basis of window functions but an optimization on top of them.
Is that right?
3. When a SQL statement contains 'PARTITION BY partition_key', I think we should guarantee that all rows for a given partition_key reside on only one region server; otherwise, the situation could become quite tricky. However, I can't find an appropriate way to guarantee this. A restriction in the DDL is not a universal solution, and simply throwing an exception is not user-friendly. Would you mind giving me some suggestions?
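
To make the concern in point 3 concrete, here is a minimal sketch in plain Python (not Phoenix code) of one possible approach: route rows by a hash of the partition key before aggregating, so that every row for a given key ends up in the same place. The function names `route_by_partition_key` and `sum_over_partition`, the `num_workers` parameter, and the sample rows are all hypothetical and only illustrate the idea; I do not know whether Phoenix could do something like this efficiently.

```python
import zlib
from collections import defaultdict


def route_by_partition_key(rows, key, num_workers):
    """Assign each row to a worker bucket chosen by hashing its partition key."""
    buckets = defaultdict(list)
    for row in rows:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        worker = zlib.crc32(str(row[key]).encode("utf-8")) % num_workers
        buckets[worker].append(row)
    return buckets


def sum_over_partition(rows, key, value):
    """Emulate SUM(value) OVER (PARTITION BY key) for the rows held by one worker."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    # Every row keeps its own columns and gains the total for its partition.
    return [dict(row, partition_sum=totals[row[key]]) for row in rows]


# Hypothetical sample data: two rows for country "A", one for country "B".
rows = [
    {"country_name": "A", "population": 10},
    {"country_name": "B", "population": 5},
    {"country_name": "A", "population": 7},
]

buckets = route_by_partition_key(rows, "country_name", num_workers=4)
result = [
    out_row
    for bucket in buckets.values()
    for out_row in sum_over_partition(bucket, "country_name", "population")
]
```

Because the bucket depends only on the key, each worker can compute its partitions without seeing rows held elsewhere. The cost is the extra data movement needed to regroup the rows, which is exactly what I am unsure how to avoid.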