Apache Tez / TEZ-3917

Speculative task attempt's DMEs can cause downstream fetcher to NPE or duplicate fetch

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      STA0 , STA1
           |
           |
      DTA0 , DTA1

      Take the above example, in which DTA0 initially fetches from an upstream source task that has two attempts, one of them speculative (say STA1).

      There is a race in which the DME (DataMovementEvent) from STA1 arrives at DTA0 and is fetched first, after which the fetch from STA0 (the successful attempt) is marked as a duplicate. This is possible because the DME from STA1 is sent before the AM marks STA1 as killed.

      This additional event can also lead to an NPE: a fetcher thread is assigned the extra output to fetch while ShuffleScheduler believes it has already fetched all the map outputs, because it is not prepared to handle the extra events coming in from speculative attempts.

      There are cases where DTA0 hits an NPE and DTA1 shows duplicate fetches.
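
To make the bookkeeping problem concrete, here is a minimal Java sketch of a tracker that tolerates a second event for the same source task. It is not Tez code: `SourceOutputTracker`, `offer`, `remaining`, and `winner` are hypothetical names, and the "first attempt seen wins" policy is an assumption chosen only for illustration. It does not decide whether the speculative or the successful attempt should be preferred, which is the open question in this issue.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical model of per-source-task fetch bookkeeping.
 * This is NOT Tez's ShuffleScheduler; names and policy are assumptions.
 */
final class SourceOutputTracker {
    // source task index -> attempt number whose output was accepted
    private final Map<Integer, Integer> winnerByTask = new HashMap<>();
    private final int numSourceTasks;

    SourceOutputTracker(int numSourceTasks) {
        if (numSourceTasks < 0) {
            throw new IllegalArgumentException("numSourceTasks must be >= 0");
        }
        this.numSourceTasks = numSourceTasks;
    }

    /**
     * Records an output event for (taskIndex, attemptNumber).
     * Returns true if this is the first output seen for the task and
     * should be fetched; false if another attempt was already accepted,
     * so the caller can drop the event instead of scheduling a fetch.
     */
    synchronized boolean offer(int taskIndex, int attemptNumber) {
        if (taskIndex < 0 || taskIndex >= numSourceTasks) {
            throw new IllegalArgumentException("unknown source task " + taskIndex);
        }
        return winnerByTask.putIfAbsent(taskIndex, attemptNumber) == null;
    }

    /** Number of source tasks with no accepted output yet; never negative. */
    synchronized int remaining() {
        return numSourceTasks - winnerByTask.size();
    }

    /** Accepted attempt number for the task, or null if none yet. */
    synchronized Integer winner(int taskIndex) {
        return winnerByTask.get(taskIndex);
    }
}
```

With one source task, calling `offer(0, 1)` for STA1 and then `offer(0, 0)` for STA0 accepts the first event and rejects the second, and `remaining()` stays at 0 rather than going negative. Keying the state by source task index, rather than counting events, is what keeps an unexpected extra event from being handed to a fetcher after the tracker considers itself done.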

          People

            kshukla Kuhu Shukla