Hadoop Common / HADOOP-4710

Chukwa - Add duplicate detection, and implement virtual offset of the log file to checkpoint file

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Environment: Red Hat EL 4.5, Java 1.6, Hadoop Trunk

    Description

      Each data stream is sent to Chukwa with a sequence ID, and Chukwa uses this sequence ID as the guideline for tracking duplicate chunk data. However, the checkpoint file does not include the virtual offset, so when the collector crashes, the sequence ID is reset to zero. The Chukwa Agent needs to keep track of the sequence ID in the checkpoint file in order to recover from a crash.
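
The idea described above can be illustrated with a minimal Java sketch. It is not taken from the Chukwa code base: the class `SequenceCheckpointSketch`, its methods, and the tab-separated checkpoint layout are all hypothetical, and it assumes that a chunk's sequence ID is the offset of its first byte within the stream. Under those assumptions, persisting the next expected offset per stream lets duplicate detection continue after a restart instead of starting again from zero.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Chukwa itself.
final class SequenceCheckpointSketch {
    // Next expected sequence ID (virtual offset) for each stream name.
    private final Map<String, Long> nextSeqId = new HashMap<>();

    // Accepts a chunk only if it starts at or beyond the recorded offset.
    // Returns false for a duplicate, i.e. a chunk that starts before it.
    // Simplification: a partially overlapping chunk is rejected as a whole.
    boolean accept(String stream, long seqId, int length) {
        long expected = nextSeqId.getOrDefault(stream, 0L);
        if (seqId < expected) {
            return false; // already accounted for
        }
        nextSeqId.put(stream, seqId + length);
        return true;
    }

    // Returns the next expected offset, or 0 for an unknown stream.
    long offsetOf(String stream) {
        return nextSeqId.getOrDefault(stream, 0L);
    }

    // Writes one "stream<TAB>offset" line per stream (assumed layout).
    void save(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : nextSeqId.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    // Rebuilds the per-stream offsets from a file written by save().
    static SequenceCheckpointSketch load(Path file) throws IOException {
        SequenceCheckpointSketch cp = new SequenceCheckpointSketch();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip empty or malformed lines
            }
            cp.nextSeqId.put(line.substring(0, tab),
                             Long.parseLong(line.substring(tab + 1)));
        }
        return cp;
    }
}
```

In this sketch, a caller would invoke `save` after accepting chunks and `load` on startup, so that a chunk resent after a crash is still recognized as a duplicate. How often to write the checkpoint is a separate trade-off between I/O cost and the amount of data replayed after a crash.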

            People

              Assignee: Unassigned
              Reporter: Eric Yang (eyang)
              Votes: 0
              Watchers: 2
