  1. Chukwa (retired)
  2. CHUKWA-146

All hadoop logs should use a different RecordType


Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved

    Description

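      All Hadoop logs use the same RecordType, so a single reducer processes every log file (other than the DataNode, NameNode, and Audit logs).
      This causes a skew problem at the MapReduce level: one reducer receives nearly all of the data while the others stay largely idle.
      Each kind of Hadoop log should therefore use its own RecordType.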
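
To make the skew concrete, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes the reducer is chosen from the record type alone, using the hash formula quoted in the note further down this issue. The class name and the record type names ("HadoopLog", "JobTrackerLog", and so on) are hypothetical and are not taken from the Chukwa code base.

```java
import java.util.Set;
import java.util.TreeSet;

class RecordTypeSkewDemo {
    // Hash-based assignment, same formula as the one quoted in this issue.
    // Assumption: the partition depends only on the record type string.
    static int partition(String recordType, int numReduceTasks) {
        return (recordType.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Collects the distinct reducer indexes that would receive data
    // for the given sequence of record types.
    static Set<Integer> reducersUsed(String[] recordTypes, int numReduceTasks) {
        Set<Integer> used = new TreeSet<>();
        for (String type : recordTypes) {
            used.add(partition(type, numReduceTasks));
        }
        return used;
    }

    public static void main(String[] args) {
        int reducers = 8;
        // Hypothetical: every log shares one record type.
        String[] shared = {"HadoopLog", "HadoopLog", "HadoopLog"};
        // Hypothetical: each kind of log has its own record type.
        String[] split = {"JobTrackerLog", "TaskTrackerLog", "SecondaryNameNodeLog"};
        System.out.println("shared type -> reducers " + reducersUsed(shared, reducers));
        System.out.println("split types -> reducers " + reducersUsed(split, reducers));
    }
}
```

With a shared record type the set always contains exactly one reducer, whatever the number of reduce tasks. With distinct record types the load can spread, although two types may still hash to the same reducer, which is what the second note below addresses.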

      Note:

      • Using the cluster information in the ChukwaRecordPartitioner would also help.
      • Using a predefined list of RecordType/reducer associations would also help, by preventing two log RecordTypes from going to the same reducer.
        The dynamic assignment ((hashCode() & Integer.MAX_VALUE) % numReduceTasks) could be used as a fallback mechanism.
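
The two notes above could be combined roughly as follows. This is only a sketch under stated assumptions, not the actual ChukwaRecordPartitioner: the class name, the method signature, and the idea of hashing "cluster/recordType" in the fallback are all hypothetical, and the code is plain Java rather than an implementation of Hadoop's Partitioner interface.

```java
import java.util.HashMap;
import java.util.Map;

class PinnedRecordTypePartitioner {
    // Hypothetical predefined RecordType -> reducer index associations.
    private final Map<String, Integer> pinned = new HashMap<>();

    PinnedRecordTypePartitioner(Map<String, Integer> pinned) {
        this.pinned.putAll(pinned);
    }

    int getPartition(String cluster, String recordType, int numReduceTasks) {
        Integer slot = pinned.get(recordType);
        // Use the predefined reducer only if it exists for this job size.
        if (slot != null && slot >= 0 && slot < numReduceTasks) {
            return slot;
        }
        // Fallback: the dynamic hash-based assignment from the issue.
        // Assumption: the cluster name is mixed into the hashed key so that
        // the same record type from different clusters can spread out.
        String key = cluster + "/" + recordType;
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("JobTrackerLog", 0);   // hypothetical record type names
        table.put("TaskTrackerLog", 1);
        PinnedRecordTypePartitioner p = new PinnedRecordTypePartitioner(table);
        System.out.println(p.getPartition("clusterA", "JobTrackerLog", 4));
        System.out.println(p.getPartition("clusterA", "TaskTrackerLog", 4));
        System.out.println(p.getPartition("clusterA", "SomeOtherLog", 4));
    }
}
```

Pinned record types never share a reducer as long as the table assigns them distinct indexes. Anything not in the table, or pinned to an index that does not exist for the current number of reduce tasks, goes through the hash fallback and can still collide with a pinned type.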


          People

            Assignee: Unassigned
            Reporter: Jerome Boulon (jboulon)