Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Hourly and daily rolling are currently done with a MapReduce (M/R) job, but all the spill files are already sorted, so rolling amounts to a merge of sorted files.
Doing that merge from a standalone application would be more efficient than running an M/R job.
Another option is to take advantage of the multi-query optimization in the latest version of Pig and do the rolling once a day, at the same time as we compute the daily metrics (since the data has already been loaded by Pig).
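To make the standalone option concrete, here is a minimal sketch of a k-way merge over inputs that are already sorted. It is not Chukwa code: the `(key, payload)` record shape and the function name `merge_sorted_spills` are assumptions made for illustration, and a real implementation would read the actual spill files rather than in-memory lists.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (sort key, payload). Real spill files would
# carry their own key and value types.
Record = Tuple[int, str]


def merge_sorted_spills(spills: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Lazily merge inputs that are each already sorted by key.

    heapq.merge holds only one pending record per input at a time, so no
    re-sort of the data is needed.
    """
    return heapq.merge(*spills, key=lambda record: record[0])


if __name__ == "__main__":
    # Three stand-in "spill files", each sorted by key; one is empty.
    spill_a = [(1, "a"), (4, "b")]
    spill_b = [(2, "c"), (3, "d")]
    spill_c = []
    for record in merge_sorted_spills([spill_a, spill_b, spill_c]):
        print(record)
```

Because the merge is lazy, the output can be streamed straight to the rolled file, which is the main reason a single-process merge can avoid the overhead of a full M/R job when the inputs are pre-sorted.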
Issue Links
- relates to: CHUKWA-317 cleaner support for archiving chunks (Resolved)