  Hive / HIVE-410

Heartbeating for streaming jobs should not depend on stdout

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Streaming jobs that require iterative processing may take longer than 10 minutes to produce rows. This should not be cause to kill the job. Producing keepalive dummy rows on stdout is a bad workaround when the output has to go into a Hive table or feed other Hive steps.

      If we adopt the solution of using stderr to indicate heartbeats, can that be combined with streaming counters (http://hadoop.apache.org/core/docs/current/streaming.html#How+do+I+update+counters+in+streaming+applications%3F)? Also, will limits on the size of stderr break this?
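
      To make the stderr idea concrete, here is a rough sketch of what a streaming script could do. It relies on the `reporter:status:` and `reporter:counter:` line formats that Hadoop Streaming documents for stderr; whether Hive treats such lines as a heartbeat is an assumption here, not something this issue confirms. The names `Heartbeat`, `run`, `HeartbeatSketch` and `beats` are made up for illustration.

```python
import sys
import time


def status_line(message):
    """Format a status update in Hadoop Streaming's stderr convention."""
    return "reporter:status:%s\n" % message


def counter_line(group, counter, amount):
    """Format a counter increment in Hadoop Streaming's stderr convention."""
    return "reporter:counter:%s,%s,%d\n" % (group, counter, amount)


class Heartbeat:
    """Hypothetical helper: writes to stderr at most once per `interval` seconds."""

    def __init__(self, interval=60.0, err=None, clock=time.monotonic):
        self.interval = interval
        self.err = err if err is not None else sys.stderr
        self.clock = clock
        self.last = clock()

    def tick(self, message="alive"):
        """Emit a heartbeat if the interval has elapsed; return True if emitted."""
        now = self.clock()
        if now - self.last < self.interval:
            return False
        self.err.write(status_line(message))
        # Assumed group/counter names; the amount is added to the counter.
        self.err.write(counter_line("HeartbeatSketch", "beats", 1))
        self.err.flush()
        self.last = now
        return True


def run(rows, out, heartbeat, passes=3):
    """Write one output row per input row; only real rows ever reach `out`."""
    for row in rows:
        value = row.rstrip("\n")
        for i in range(passes):
            # Stand-in for slow iterative work that produces no rows yet.
            heartbeat.tick("pass %d" % (i + 1))
        out.write(value + "\n")
```

      A script would call `run(sys.stdin, sys.stdout, Heartbeat())`. Rate-limiting the heartbeat keeps the volume written to stderr small, which may help with the stderr size concern, though that depends on what limit actually applies.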

      Attachments

        1. patch-410.txt
          3 kB
          Ashish Thusoo
        2. patch-410-2.txt
          3 kB
          Ashish Thusoo


            People

              Assignee: Ashish Thusoo (athusoo)
              Reporter: Venky Iyer (indigoviolet)
              Votes: 0
              Watchers: 2
