SPARK-32872: BytesToBytesMap at MAX_CAPACITY exceeds growth threshold

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.3, 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.7, 3.0.1
    • Fix Version/s: 2.4.8, 3.0.2, 3.1.0
    • Component/s: Spark Core
    • Labels: None

Description

When BytesToBytesMap is at MAX_CAPACITY and reaches its growth threshold, numKeys >= growthThreshold is true but longArray.size() / 2 < MAX_CAPACITY is false. This correctly prevents the map from growing, but canGrowArray incorrectly remains true, so the map keeps accepting new keys and exceeds its growth threshold (see the sketch after the stack trace below). If we attempt to spill the map in this state, UnsafeKVExternalSorter cannot reuse the long array for sorting and must allocate a new one, causing grouping aggregations to fail with the following error:

{code}
2020-09-13 18:33:48,765 ERROR Executor - Exception in task 0.0 in stage 7.0 (TID 69)
org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 12982025696 bytes of memory, got 0
      at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:160)
      at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:100)
      at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:118)
      at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:253)
      at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doConsume_0$(Unknown Source)
      at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithKeys_0$(Unknown Source)
      at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithoutKey_0$(Unknown Source)
      at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
      at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
      at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:733)
      at org.apache.spark.sql.execution.collect.UnsafeRowBatchUtils$.encodeUnsafeRows(UnsafeRowBatchUtils.scala:80)
      at org.apache.spark.sql.execution.collect.Collector.$anonfun$processFunc$1(Collector.scala:187)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
      at org.apache.spark.scheduler.Task.doRunTask(Task.scala:144)
      at org.apache.spark.scheduler.Task.run(Task.scala:117)
      at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$9(Executor.scala:660)
      at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1581)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:663)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
{code}
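
For reference, here is a minimal sketch of the growth check at the end of BytesToBytesMap.append(), simplified from the Spark source; the names (numKeys, growthThreshold, longArray, canGrowArray, MAX_CAPACITY, growAndRehash) match the description above, but the exact surrounding code varies across the affected versions:

{code:java}
// Buggy form: when longArray is already at MAX_CAPACITY, the second conjunct
// is false, so growAndRehash() is skipped -- but canGrowArray stays true and
// the map keeps accepting inserts past growthThreshold.
if (numKeys >= growthThreshold && longArray.size() / 2 < MAX_CAPACITY) {
  try {
    growAndRehash();
  } catch (SparkOutOfMemoryError oom) {
    canGrowArray = false;  // out of memory: stop accepting keys, force a spill
  }
}

// Fixed form: split the two conditions so that hitting MAX_CAPACITY also
// clears canGrowArray. The map then stops accepting new keys once it reaches
// its growth threshold, even when it can no longer grow.
if (numKeys >= growthThreshold) {
  if (longArray.size() / 2 < MAX_CAPACITY) {
    try {
      growAndRehash();
    } catch (SparkOutOfMemoryError oom) {
      canGrowArray = false;
    }
  } else {
    // Already at MAX_CAPACITY: refuse further inserts so the load factor is
    // never exceeded and a later spill can reuse longArray for sorting.
    canGrowArray = false;
  }
}
{code}

Keeping numKeys at or below the growth threshold matters because UnsafeKVExternalSorter can only reuse the map's long array for in-place sorting while the key count fits; once the map overshoots, the sorter must allocate a fresh array (the MemoryConsumer.allocateArray frame in the trace above), which is what fails here when no memory can be acquired.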

People

    • Assignee: Ankur Dave (ankurd)
    • Reporter: Ankur Dave (ankurd)
    • Votes: 0
    • Watchers: 3
