Apache Tez / TEZ-2950

Poor performance of UnorderedPartitionedKVWriter

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved

    Description

      Came across a job that spent a long time in UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data from the spill files (8500 spills) and then writing the final compressed merged file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not just buffer the data and keep writing directly to the final file, which would save a lot of time?
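
      To make the cost described above concrete, here is a minimal Java sketch of a spill-then-merge pass. It is not Tez's actual code: the class SpillMergeSketch, its methods, the in-memory lists standing in for spill files, and the use of GZIP as the codec are all assumptions made for illustration. It only shows that each spilled segment is decompressed and recompressed once more during the final merge, so the extra work grows with (number of spills x number of partitions).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; names and structure are not taken from Tez.
public class SpillMergeSketch {

    // Compress one partition segment, as a spill would before it is written out.
    public static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        }
        return bytes.toByteArray();
    }

    // Decompress one segment; the merge pays this cost for every spill and partition.
    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gz.read(buffer)) != -1) {
                bytes.write(buffer, 0, n);
            }
        }
        return bytes.toByteArray();
    }

    // spills.get(s).get(p) is the compressed segment of partition p in spill s.
    // Returns one compressed segment per partition, partitions in order.
    public static List<byte[]> mergeAll(List<List<byte[]>> spills, int numPartitions)
            throws IOException {
        List<byte[]> finalSegments = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
                for (List<byte[]> spill : spills) {
                    // Read back and decompress, then recompress into the final segment.
                    out.write(decompress(spill.get(p)));
                }
            }
            finalSegments.add(bytes.toByteArray());
        }
        return finalSegments;
    }

    public static void main(String[] args) throws IOException {
        int numPartitions = 2;
        List<List<byte[]>> spills = new ArrayList<>();
        for (int s = 0; s < 3; s++) {
            List<byte[]> spill = new ArrayList<>();
            for (int p = 0; p < numPartitions; p++) {
                String records = "s" + s + "p" + p + ";";
                spill.add(compress(records.getBytes(StandardCharsets.UTF_8)));
            }
            spills.add(spill);
        }
        List<byte[]> merged = mergeAll(spills, numPartitions);
        // Partition 0 of the final output holds partition 0 of every spill, in spill order.
        System.out.println(new String(decompress(merged.get(0)), StandardCharsets.UTF_8));
    }
}
```

      In this sketch, writing directly to the final file would avoid the inner decompress/recompress loop, but the writer would then need another way to keep each partition's records contiguous in the output; the sketch does not attempt to show that.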

      Attachments

        1. TEZ-2950.001_prelim.patch
          17 kB
          Kuhu Shukla

          People

            kshukla Kuhu Shukla
            rohini Rohini Palaniswamy
            Votes: 0
            Watchers: 10

            Dates

              Created:
              Updated: