  1. Spark
  2. SPARK-20953

Add hash map metrics to aggregate and join


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      It would be useful to identify hash map collision issues early on.

      We should add an average hash map probe metric to the aggregate operator and the hash join operator, and report it. If the average number of probes is greater than a specific (configurable) threshold, we should log an error at runtime.
      The primary classes to look at are UnsafeFixedWidthAggregationMap, HashAggregateExec, HashedRelation, and HashJoin.
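
A minimal sketch of the idea, not Spark's implementation: a toy open-addressing map that counts how many slots each lookup inspects, exposes the average, and logs an error when the average exceeds a threshold. The class name ProbeCountingMap, its methods, the linear probing scheme, and the threshold parameter are all hypothetical and chosen only for illustration.

```java
import java.util.logging.Logger;

// Illustrative only: an open-addressing map (long keys, linear probing) that
// counts how many slots each lookup inspects. This is NOT Spark's code.
final class ProbeCountingMap {
    private static final Logger LOG = Logger.getLogger(ProbeCountingMap.class.getName());

    private final long[] keys;
    private final long[] values;
    private final boolean[] used;
    private final int mask;
    private final double avgProbeThreshold; // hypothetical configurable threshold

    private long numLookups = 0; // key lookups performed (puts and gets)
    private long numProbes = 0;  // slots inspected across all lookups

    ProbeCountingMap(int capacity, double avgProbeThreshold) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.keys = new long[capacity];
        this.values = new long[capacity];
        this.used = new boolean[capacity];
        this.mask = capacity - 1;
        this.avgProbeThreshold = avgProbeThreshold;
    }

    // Returns the slot holding `key` or the first free slot on its probe path;
    // returns -1 if the table is full and the key is absent.
    private int findSlot(long key) {
        numLookups++;
        int pos = Long.hashCode(key) & mask;
        for (int step = 0; step <= mask; step++) {
            numProbes++; // every inspected slot counts as one probe
            if (!used[pos] || keys[pos] == key) {
                return pos;
            }
            pos = (pos + 1) & mask;
        }
        return -1;
    }

    void put(long key, long value) {
        int pos = findSlot(key);
        if (pos < 0) {
            throw new IllegalStateException("map is full");
        }
        used[pos] = true;
        keys[pos] = key;
        values[pos] = value;
    }

    Long get(long key) {
        int pos = findSlot(key);
        return (pos >= 0 && used[pos]) ? values[pos] : null;
    }

    // The metric proposed in this issue: average probes per key lookup.
    double avgProbesPerLookup() {
        return numLookups == 0 ? 0.0 : (double) numProbes / numLookups;
    }

    // Logs an error if the average exceeds the threshold; returns whether it did.
    boolean reportIfAboveThreshold() {
        double avg = avgProbesPerLookup();
        boolean exceeded = avg > avgProbeThreshold;
        if (exceeded) {
            LOG.severe("Average hash map probes per lookup is " + avg
                + ", above the threshold " + avgProbeThreshold);
        }
        return exceeded;
    }

    public static void main(String[] args) {
        ProbeCountingMap map = new ProbeCountingMap(8, 2.0);
        // Keys 0, 8, 16, 24 all hash to slot 0 in a table of 8 slots.
        for (long k = 0; k < 32; k += 8) {
            map.put(k, k * 10);
        }
        System.out.println("avg probes per lookup: " + map.avgProbesPerLookup());
        map.reportIfAboveThreshold();
    }
}
```

An average close to 1 means most lookups land on the right slot immediately. A higher average suggests that keys are clustering, which is the collision problem this issue aims to surface. In an operator, the same two counters could be read once the map is fully built and published as a metric.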


          People

            Assignee: Unassigned
            Reporter: Reynold Xin (rxin)
            Votes: 0
            Watchers: 5
