Apache Tez / TEZ-3680

Optimizations to UnorderedPartitionedKVWriter

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

    Description

      1. Consider increasing the number of threads in the spill executor. TEZ_RUNTIME_UNORDERED_OUTPUT_MAX_PER_BUFFER_SIZE_BYTES configures the buffer size. With smaller buffer sizes, spills can become frequent, and the spill executor currently runs in single-threaded mode.

      2. During profiling, incrementing the counters and notifying progress showed up as hotspots. This may not be common in regular Tez jobs, but it can happen in processes like LLAP (Hive-based). I will attach the profiler snapshot showing this. It would be good to update counters and notify progress less frequently.

      3. Optimize mergeAll().
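
To make items 1 and 2 concrete, here is a minimal Java sketch of the two ideas: running spills on a pool with more than one thread, and publishing counter updates in batches instead of once per record. It is not taken from any of the attached patches. The class SpillSketch, the helper BatchedCounter, the method runSpills, the batch size, and the thread count are all hypothetical names and values chosen for illustration, and the "spill" is a placeholder task rather than real disk I/O.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the Tez code base. */
public class SpillSketch {

  /** Accumulates increments locally and publishes them in batches. */
  static final class BatchedCounter {
    // Stands in for a shared, comparatively expensive counter.
    private final AtomicLong published = new AtomicLong();
    private final int batchSize;
    private long pending;      // assumed to be touched by a single writer thread
    private int publishCalls;  // number of times the shared counter was updated

    BatchedCounter(int batchSize) {
      this.batchSize = batchSize;
    }

    /** Records one event; publishes only when a full batch has accumulated. */
    void increment() {
      if (++pending >= batchSize) {
        flush();
      }
    }

    /** Publishes whatever is pending; call once more when the writer closes. */
    void flush() {
      if (pending > 0) {
        published.addAndGet(pending);
        publishCalls++;
        pending = 0;
      }
    }

    long published() {
      return published.get();
    }

    int publishCalls() {
      return publishCalls;
    }
  }

  /** Runs the given number of simulated spills on a pool of spillThreads workers. */
  static int runSpills(int spillThreads, int spills) throws InterruptedException {
    ExecutorService spillExecutor = Executors.newFixedThreadPool(spillThreads);
    AtomicInteger completed = new AtomicInteger();
    for (int i = 0; i < spills; i++) {
      // Placeholder for writing one full buffer to disk.
      spillExecutor.submit(() -> {
        completed.incrementAndGet();
      });
    }
    spillExecutor.shutdown();
    if (!spillExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
      throw new IllegalStateException("spills did not finish in time");
    }
    return completed.get();
  }

  public static void main(String[] args) throws InterruptedException {
    BatchedCounter counter = new BatchedCounter(1000);
    for (int i = 0; i < 2500; i++) {
      counter.increment();
    }
    counter.flush();
    System.out.println(counter.published() + " records in "
        + counter.publishCalls() + " updates");
    System.out.println(runSpills(2, 8) + " spills completed");
  }
}
```

In this sketch the shared counter is touched once per batch rather than once per record, and the final flush() keeps the total exact. How many spill threads are worthwhile would depend on the buffer size and the disk, so the value 2 above is only a placeholder.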

      Attachments

        1. TEZ-3680.7.patch
          13 kB
          Rajesh Balamohan
        2. TEZ-3680.6.patch
          10 kB
          Rajesh Balamohan
        3. TEZ-3680.5.patch
          11 kB
          Rajesh Balamohan
        4. TEZ-3680.4.patch
          10 kB
          Rajesh Balamohan
        5. TEZ-3680.3.patch
          8 kB
          Rajesh Balamohan
        6. TEZ-3680.2.patch
          9 kB
          Rajesh Balamohan
        7. TEZ-3680.1.patch
          7 kB
          Rajesh Balamohan
        8. profiler.png
          251 kB
          Rajesh Balamohan

          People

            Assignee: Rajesh Balamohan
            Reporter: Rajesh Balamohan