  1. Chukwa (retired)
  2. CHUKWA-533

Improve fault-tolerance of collectors.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: Data Collection
    • Labels: None
    • Release Note: Chukwa collector is more fault-tolerant of partial HDFS outages.

    Description

      A collector can currently die in a number of ways, typically because of errors on a DataNode (DN) or a NameNode (NN) that is being restarted. A collector should instead use some combination of retry logic followed by failing back to the agent; in no case should the collector process itself die.
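
      The retry-then-fail-back behaviour requested above could look roughly like the sketch below. It is not taken from the attached patches: the class `RetryingWriter`, the `Write` interface, the `Result` enum, and the linear backoff are all hypothetical names and choices made for illustration. The idea is only that an `IOException` from HDFS is retried a bounded number of times and then reported to the caller as a failure, rather than being allowed to end the process.

```java
import java.io.IOException;

/**
 * Hypothetical helper (not part of Chukwa): retries a write a bounded number
 * of times and reports failure to the caller instead of letting the
 * exception terminate the collector process.
 */
final class RetryingWriter {

    /** Outcome the collector would use when answering the agent. */
    enum Result { COMMITTED, FAILED_BACK_TO_AGENT }

    /** A single write attempt, e.g. appending a batch of chunks to HDFS. */
    interface Write {
        void run() throws IOException;
    }

    private RetryingWriter() { }

    static Result writeWithRetry(Write write, int maxAttempts, long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                write.run();
                return Result.COMMITTED;
            } catch (IOException e) {
                // Assumed to be transient, e.g. a DataNode error or a
                // NameNode restart. Swallow it and retry after a pause.
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis * attempt); // linear backoff
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        return Result.FAILED_BACK_TO_AGENT;
                    }
                }
            }
        }
        // Retries exhausted: tell the agent the data was not committed so it
        // can resend, and keep the collector running.
        return Result.FAILED_BACK_TO_AGENT;
    }
}
```

      A caller would map `FAILED_BACK_TO_AGENT` to an error response so that the agent resends the data later or tries another collector; how the real collector signals this to the agent is not described in this issue.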

      Attachments

        1. CHUKWA-533-2.patch
          8 kB
          William W. Graham Jr
        2. CHUKWA-533-1.patch
          5 kB
          William W. Graham Jr

          People

            Assignee: billgraham William W. Graham Jr
            Reporter: billgraham William W. Graham Jr
            Votes: 0
            Watchers: 0