[CHUKWA-146] All hadoop logs should use a different RecordType - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

All Hadoop logs use the same RecordType, so only one reducer is used to process all log files (other than DN, NN, Audit).
This causes a skew issue at the M/R level.
So the Hadoop logs should each use a different RecordType.

Note:

Using the cluster information in the ChukwaRecordPartitioner will also help.
Using a predefined list of RecordType/reducer associations will also help, by preventing two log RecordTypes from going to the same reducer.
The dynamic assignment ((hashCode() & Integer.MAX_VALUE) % numReduceTasks) could be used as a fallback mechanism.
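
A minimal sketch of the last two points, assuming a plain Java class rather than the real ChukwaRecordPartitioner (the class name, the pin() helper, and the RecordType names used with it are hypothetical and only for illustration). A RecordType listed in the predefined table goes to its assigned reducer; anything else falls back to the hash-based assignment quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual ChukwaRecordPartitioner.
final class RecordTypePartitionSketch {
    // Assumed predefined RecordType -> reducer index associations.
    private final Map<String, Integer> pinned = new HashMap<>();

    // Registers a fixed reducer for a RecordType (illustrative helper).
    RecordTypePartitionSketch pin(String recordType, int reducer) {
        pinned.put(recordType, reducer);
        return this;
    }

    // Returns the reducer index for a RecordType.
    int partitionFor(String recordType, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            throw new IllegalArgumentException("numReduceTasks must be positive");
        }
        Integer fixed = pinned.get(recordType);
        // Use the predefined association only if it is valid for this job.
        if (fixed != null && fixed >= 0 && fixed < numReduceTasks) {
            return fixed;
        }
        // Fallback: the dynamic assignment from the note above.
        return (recordType.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

In this sketch the fallback can still land on a reducer that already has a pinned RecordType, so the predefined list only guarantees separation among the RecordTypes it names.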

People

Assignee: Unassigned

Reporter: Jerome Boulon

Votes: 0

Watchers: 0

Dates

Created: 17/Apr/09 00:21

Updated: 17/Apr/09 00:34