Apache Tez / TEZ-1731

OnDiskMerger can end up clobbering files across tasks with LocalDiskFetch


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.2
    • Component/s: None
    • Labels: None

    Description

      When an on-disk fetch starts with LOCAL files (optimize.local.fetch), the filename used by the merger is based on the source file name. This name can be the same for every task reading the same input on the node. As a result, one task can overwrite another task's file, depending on the order in which events are processed and on the directory allocated by the local dir-allocator.

      This leads to ChecksumExceptions and FileNotFoundExceptions during the merge.
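
      The collision described above can be illustrated with a minimal sketch. This is not the code from the attached patches; the class `MergeOutputNames`, its methods, the `.merged` suffix, and the `uniqueTaskId` parameter are all hypothetical names chosen for illustration. It only shows why a name derived solely from the source file collides across tasks, and how mixing in an identifier unique to the consuming task would, under that assumption, avoid the clash.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only: none of these names come from the Tez codebase.
final class MergeOutputNames {
    private MergeOutputNames() {}

    // Problematic scheme: the merge output name depends only on the source
    // file name, so two tasks reading the same local input on the same node
    // compute the same path whenever they are given the same local directory.
    static Path fromSourceOnly(Path localDir, Path sourceFile) {
        String base = sourceFile.getFileName().toString();
        return localDir.resolve(base + ".merged");
    }

    // One possible remedy (an assumption, not the committed fix): prefix the
    // name with an identifier that is unique to the consuming task, so paths
    // differ even when the source file and local directory are identical.
    static Path withTaskId(Path localDir, Path sourceFile, String uniqueTaskId) {
        String base = sourceFile.getFileName().toString();
        return localDir.resolve(uniqueTaskId + "_" + base + ".merged");
    }
}
```

      With the first scheme, two tasks that read the same source file and are handed the same local directory resolve to the same path, so whichever one writes last wins. With the second, the paths differ as long as the task identifiers differ.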

      Attachments

        1. TEZ-1731.1.txt
          25 kB
          Siddharth Seth
        2. TEZ-1731.2.txt
          24 kB
          Siddharth Seth

            People

              Assignee: Siddharth Seth (sseth)
              Reporter: Siddharth Seth (sseth)
              Votes: 0
              Watchers: 4
