Accumulo / ACCUMULO-488

InputFormats' RecordReaders should call Context.progress

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: 1.3.5, 1.4.0
    • Fix Version/s: 1.4.1
    • Component/s: client
    • Labels:

      Description

      The RecordReaders for both input formats never call Context.progress(). This can cause long-running tasks to time out even though they are still making progress.

        Activity

        Scott Kuehn added a comment -

        The default timeout is 10 minutes, so the iterators behind the InputFormat's RecordReaders would have to take longer than that to return a single record (progress is reported implicitly whenever a mapper receives a record). I suppose AccumuloRowInputFormat is more of a risk if a row has lots of cells, but 10 minutes still seems like plenty of time.

        Since waiting more than 10 minutes for a single record is somewhat abnormal, users who anticipate this behavior could raise the timeout threshold by adjusting the 'mapreduce.task.timeout' variable in their Job. Set it to 0 and the task won't time out.
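        A minimal sketch of that adjustment, assuming Hadoop's Configuration class is on the classpath. The property name is the one quoted above and its value is in milliseconds; the class and method names (TaskTimeoutConfig, withTaskTimeout) are hypothetical and exist only for illustration. Older Hadoop releases may spell the property 'mapred.task.timeout', so check the name against the version in use.

```java
import org.apache.hadoop.conf.Configuration;

/** Hypothetical helper, for illustration only: sets the task timeout on a job configuration. */
public class TaskTimeoutConfig {

    /** Sets the task timeout in milliseconds and returns the same Configuration. */
    public static Configuration withTaskTimeout(Configuration conf, long timeoutMillis) {
        // Property name as quoted in the comment above; the value is in milliseconds.
        // A value of 0 disables the timeout entirely.
        conf.setLong("mapreduce.task.timeout", timeoutMillis);
        return conf;
    }

    public static void main(String[] args) {
        // Raise the timeout from the 10-minute default to 30 minutes.
        Configuration conf = withTaskTimeout(new Configuration(), 30L * 60L * 1000L);
        System.out.println(conf.getLong("mapreduce.task.timeout", -1L));
    }
}
```

        Raising the value is usually a safer choice than 0, since a disabled timeout also stops the framework from killing tasks that are genuinely hung.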

        Is there a scenario that I'm missing? If not, I think this ticket could be closed out – perhaps just mentioning the timeout var in the javadocs.

        Scott Kuehn added a comment -

        Some sample code to experiment with task timeouts: A sleeping iterator and MR driver. Not intended for inclusion.

        jv added a comment -

        This bug was originally brought to my attention as an issue with the entire mapper taking longer than the timeout. If something else will appropriately note progress while the InputFormat is spitting back results, then perhaps this isn't an issue for us. Can you verify this behavior, Scott?

        Scott Kuehn added a comment -

        John, I took another look and confirmed the implicit progress reporting by the task/tasktracker. I don't think this is an issue for Accumulo. Given your description above (a single invocation of the user's map() takes longer than the timeout), best practice is for the user to update custom counters or invoke context.progress(), so the tasktracker knows their job is progressing.
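        One way to follow that practice inside a long-running map() call is a small throttled heartbeat. The sketch below is not from the ticket or its attachment: ProgressHeartbeat and tick() are hypothetical names, and the helper depends only on the JDK, so the progress call is passed in as a Runnable. It assumes Java 8 or later for the method reference.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: invokes a progress callback at most once per interval. */
public class ProgressHeartbeat {
    private final Runnable reportProgress; // e.g. context::progress inside a Hadoop Mapper
    private final long intervalMillis;     // minimum time between two reports
    private final LongSupplier clock;      // injectable time source, so the helper is testable
    private long lastReport;

    public ProgressHeartbeat(Runnable reportProgress, long intervalMillis, LongSupplier clock) {
        this.reportProgress = reportProgress;
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.lastReport = clock.getAsLong();
    }

    /** Convenience constructor that uses the system wall clock. */
    public ProgressHeartbeat(Runnable reportProgress, long intervalMillis) {
        this(reportProgress, intervalMillis, System::currentTimeMillis);
    }

    /** Call from inside slow loops; it reports only once the interval has elapsed. */
    public void tick() {
        long now = clock.getAsLong();
        if (now - lastReport >= intervalMillis) {
            reportProgress.run();
            lastReport = now;
        }
    }
}
```

        Inside a mapper this could be created as new ProgressHeartbeat(context::progress, 60000) at the start of map(), with tick() called once per unit of work. The interval only needs to be comfortably shorter than the configured task timeout.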

        The sample sleepiterator code and MR job in the attached file can demonstrate the timeout and reset. You can adjust the params in the job to force a timeout (or lack thereof).

        jv added a comment -

        Thanks for doing the legwork, Scott.


          People

          • Assignee: Scott Kuehn
          • Reporter: John Vines
          • Votes: 0
          • Watchers: 1