  1. Chukwa
  2. CHUKWA-744

Refactor ETL process for HBaseWriter

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Data Processors
    • Labels:
      None

      Description

      The current ETL classes are based on the Demux MapProcessor and ReduceProcessor. These processors were designed to receive an archive key embedded in the processor, along with a ChunkSaver that preserves chunks that cannot be parsed. This works well for a MapReduce-based Demux job: the short-lived task spills the ChunkSaver contents into a separate file for later examination. In the long-running Chukwa agent, however, the processors leak memory over time because chunks accumulate in the ChunkSaver and are never cleaned up.

      It would be better to redesign the parser classes with well-defined behavior. If a chunk cannot be parsed, the parser should throw a ParseException to the upper layer, which can then retry or write the failure to the agent log.
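
      A minimal sketch of the proposed behavior follows. It is an illustration under stated assumptions, not the actual Chukwa code: the names ChunkParser, KeyValueParser, ParseSketch and parseWithRetry are hypothetical, chunks are simplified to raw bytes holding newline-separated key=value records, and the exception used is the standard java.text.ParseException. The point is that the parser keeps no reference to a failed chunk; the caller decides whether to retry or log.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser contract (not the real Chukwa interface).
interface ChunkParser {
    // Returns the parsed records, or throws if the chunk cannot be parsed.
    List<String> parse(byte[] chunkData) throws ParseException;
}

// Illustrative parser: expects newline-separated "key=value" records.
final class KeyValueParser implements ChunkParser {
    @Override
    public List<String> parse(byte[] chunkData) throws ParseException {
        String text = new String(chunkData, StandardCharsets.UTF_8);
        List<String> records = new ArrayList<>();
        int offset = 0; // character offset of the current line
        for (String line : text.split("\n")) {
            if (!line.isEmpty()) {
                int eq = line.indexOf('=');
                if (eq <= 0) {
                    // Fail fast; the bad chunk is not stored anywhere.
                    throw new ParseException("Malformed record: " + line, offset);
                }
                records.add(line.substring(0, eq).trim() + "="
                        + line.substring(eq + 1).trim());
            }
            offset += line.length() + 1;
        }
        return records;
    }
}

final class ParseSketch {
    // Upper layer: retry up to maxAttempts, log each failure, then give up.
    // Returns an empty list when every attempt fails.
    static List<String> parseWithRetry(ChunkParser parser, byte[] data, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return parser.parse(data);
            } catch (ParseException e) {
                // Stand-in for the agent log.
                System.err.println("Parse attempt " + attempt + " failed at offset "
                        + e.getErrorOffset() + ": " + e.getMessage());
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        ChunkParser parser = new KeyValueParser();
        byte[] good = "host=node1\ncpu=42".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "host=node1\ngarbage".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseWithRetry(parser, good, 2));
        System.out.println(parseWithRetry(parser, bad, 2));
    }
}
```

      Because nothing is buffered on failure, a long-running agent does not accumulate unparseable chunks. Retrying only helps when the failure is transient; for deterministic parse errors a single attempt followed by a log entry may be the more sensible policy.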

              People

              • Assignee:
                eyang Eric Yang
                Reporter:
                eyang Eric Yang
