Apache Nemo / NEMO-54

Handle remote data fetch failures due to executor removal

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1

    Description

      When an executor is removed, tasks on other executors that try to fetch data from the lost executor fail with INPUT_READ_FAILURE. The parent task(s) that produced the blocks lost along with the executor are already guaranteed to be retried via the EXECUTOR_REMOVED event, so it is enough to retry only the tasks that ran into INPUT_READ_FAILURE.
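
      The retry rule described above can be sketched roughly as follows. This is a minimal illustration, not Nemo's actual scheduler code: the class RetryOnFetchFailureSketch, its methods, and the TaskState and FailureCause enums are all hypothetical names invented for this example. Only the EXECUTOR_REMOVED and INPUT_READ_FAILURE names come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: tracks task states and decides which tasks to retry.
// None of these names are taken from the Nemo code base.
final class RetryOnFetchFailureSketch {
  enum TaskState { EXECUTING, COMPLETE, SHOULD_RETRY }
  enum FailureCause { INPUT_READ_FAILURE, OTHER }

  private final Map<String, TaskState> taskStates = new HashMap<>();
  private final Map<String, String> taskToExecutor = new HashMap<>();

  void onTaskScheduled(final String taskId, final String executorId) {
    taskStates.put(taskId, TaskState.EXECUTING);
    taskToExecutor.put(taskId, executorId);
  }

  void onTaskComplete(final String taskId) {
    taskStates.put(taskId, TaskState.COMPLETE);
  }

  // Models the EXECUTOR_REMOVED event: every task placed on the lost executor,
  // including completed ones whose output blocks are now gone, is retried.
  void onExecutorRemoved(final String executorId) {
    for (final Map.Entry<String, String> entry : taskToExecutor.entrySet()) {
      if (entry.getValue().equals(executorId)) {
        taskStates.put(entry.getKey(), TaskState.SHOULD_RETRY);
      }
    }
  }

  // Models a task failure report. For INPUT_READ_FAILURE only the failed task
  // itself is marked for retry; its parents are left to onExecutorRemoved.
  void onTaskFailed(final String taskId, final FailureCause cause) {
    if (cause == FailureCause.INPUT_READ_FAILURE) {
      taskStates.put(taskId, TaskState.SHOULD_RETRY);
    } else {
      throw new IllegalStateException("Unhandled failure cause: " + cause);
    }
  }

  TaskState stateOf(final String taskId) {
    return taskStates.get(taskId);
  }

  // Returns the ids of all tasks currently marked for retry, in sorted order.
  Set<String> tasksToRetry() {
    final Set<String> result = new TreeSet<>();
    for (final Map.Entry<String, TaskState> entry : taskStates.entrySet()) {
      if (entry.getValue() == TaskState.SHOULD_RETRY) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
```

      In this sketch the two events are handled independently, so the order in which the executor removal and the read failure arrive does not change the final set of tasks to retry.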

            People

              Assignee: johnyangk John Yang
              Reporter: johnyangk John Yang