  1. Apache Tez
  2. TEZ-2605

[Umbrella] CPU optimizations for hotspots


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Once I/O is highly optimized, some CPU hotspots remain when processing large data sets:

      1. Sort: the memory accesses needed to compare two values can be a bottleneck.
      2. Aggregation: hash construction for UnorderedPartitionedKVWriter can be a bottleneck.
      3. Filter: the memory accesses needed to compare key values against a given condition can be a bottleneck.
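
To make the sort hotspot (item 1) concrete, here is a minimal Java sketch of the key-prefix idea described in the AlphaSort paper listed under related work: the first bytes of each key are packed into a primitive `long`, so most comparisons are settled without dereferencing the full key. The class and method names (`PrefixSortSketch`, `prefixOf`, `sortedOrder`) are hypothetical and are not part of Tez. The comparator form is chosen for brevity; a production version would likely keep prefix and pointer together in a primitive array, since boxed indices cost cache misses of their own.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not Tez code.
final class PrefixSortSketch {

    // Pack the first 8 bytes of a key into a long (big-endian, zero-padded),
    // so that unsigned long order agrees with unsigned byte-wise key order.
    static long prefixOf(byte[] key) {
        long p = 0;
        for (int i = 0; i < 8; i++) {
            int b = i < key.length ? (key[i] & 0xFF) : 0;
            p = (p << 8) | b;
        }
        return p;
    }

    // Return the indices of keys in ascending unsigned lexicographic order.
    static int[] sortedOrder(byte[][] keys) {
        int n = keys.length;
        long[] prefixes = new long[n];
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) {
            prefixes[i] = prefixOf(keys[i]);
            idx[i] = i;
        }
        Arrays.sort(idx, (a, b) -> {
            // Cheap path: compare the cached prefixes only.
            int c = Long.compareUnsigned(prefixes[a], prefixes[b]);
            // Slow path: touch the full keys only when the prefixes tie.
            return c != 0 ? c : Arrays.compareUnsigned(keys[a], keys[b]);
        });
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = idx[i];
        }
        return order;
    }

    public static void main(String[] args) {
        String[] words = {"tez", "sort", "aggregate", "aggregation", "filter"};
        byte[][] keys = new byte[words.length][];
        for (int i = 0; i < words.length; i++) {
            keys[i] = words[i].getBytes(StandardCharsets.UTF_8);
        }
        for (int i : sortedOrder(keys)) {
            System.out.println(words[i]);
        }
    }
}
```

In this example only "aggregate" and "aggregation" share an 8-byte prefix, so they are the only pair that falls through to the full-key comparison. `Arrays.compareUnsigned(byte[], byte[])` requires Java 9 or later.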

      This issue is an umbrella JIRA for CPU optimizations on the Tez side.

      Related work:
      Alphasort: http://dl.acm.org/citation.cfm?id=615237
      Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited: https://cs.uwaterloo.ca/~tozsu/publications/other/p168-balkesen.pdf

            People

              Assignee: Unassigned
              Reporter: Tsuyoshi Ozawa (ozawa)
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated: