  1. Hadoop Common
  2. HADOOP-556

Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None

      Description

      Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records.
      A first version of this was already implemented but was not satisfactory.

      We now check whether 10 seconds have passed:

      - after reading a line of the Application's stderr
        (so please use cerr << ".\n" as your stderr heartbeat, not cerr << ".";
        the check runs per line read, so a bare "." without a newline does not trigger it)

      - and after outputting a record in the mapper / combiner / reducer.

      If 10 seconds have passed, we set the Reporter status.
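
The time-based check described above can be sketched as follows. This is a minimal illustration, not the code in hadoop-timedreporter.patch: the class name `TimedStatusReporter`, the method `maybe_report`, and the `set_status` callback (standing in for the framework's Reporter status call) are all hypothetical. The clock value is passed in explicitly so the behavior can be exercised without waiting.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper for illustration only; not the class from the patch.
class TimedStatusReporter {
public:
    using Clock = std::chrono::steady_clock;
    using SetStatus = std::function<void(const std::string&)>;

    // set_status is an assumed stand-in for the call that sets Reporter status.
    // interval defaults to the 10 seconds named in the issue.
    TimedStatusReporter(SetStatus set_status,
                        std::chrono::seconds interval = std::chrono::seconds(10),
                        Clock::time_point start = Clock::now())
        : set_status_(std::move(set_status)),
          interval_(interval),
          last_report_(start) {}

    // Call after each stderr line is read and after each record is output.
    // Sends the status only if at least `interval` has elapsed since the
    // last report; returns true when a status was actually sent.
    bool maybe_report(const std::string& status,
                      Clock::time_point now = Clock::now()) {
        if (now - last_report_ < interval_) {
            return false;  // too soon, skip this one
        }
        set_status_(status);
        last_report_ = now;
        return true;
    }

private:
    SetStatus set_status_;
    std::chrono::seconds interval_;
    Clock::time_point last_report_;
};
```

Because the check only runs when a line or record passes through, a silent application never triggers it, which matches the "no artificial heartbeat" point of the proposal.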

      Effects:

      - The reporter status changes more often and provides useful feedback in the Web UI or in another client.
      - A Task will not time out after 10 minutes just because it outputs records slowly.

      No artificial heartbeat is introduced in this proposal.
      The streaming Application still has to show activity (either on stdout or on stderr).

        Attachments

        1. hadoop-timedreporter.patch
          6 kB
          Michel Tourn

            People

            • Assignee:
              Unassigned
            • Reporter:
              Michel Tourn (michel_tourn)
