Spark / SPARK-41193

Ignore `collect data with single partition larger than 2GB bytes array limit` in `DatasetLargeResultCollectingSuite` by default


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Tests
    • Labels: None

    Description

      Run this suite with *Java 8/11/17* on Linux and on macOS (Apple Silicon) using the following commands:

      • Maven:

      ```
      build/mvn clean install -DskipTests -pl sql/core -am
      build/mvn clean test -pl sql/core -Dtest=none -DwildcardSuites=org.apache.spark.sql.DatasetLargeResultCollectingSuite
      ```

      and with Scala 2.13:

      ```
      dev/change-scala-version.sh 2.13 
      build/mvn clean install -DskipTests -pl sql/core -am -Pscala-2.13
      build/mvn clean test -pl sql/core -Pscala-2.13 -Dtest=none -DwildcardSuites=org.apache.spark.sql.DatasetLargeResultCollectingSuite
      ```

      • SBT:

      ```
      build/sbt clean "sql/testOnly org.apache.spark.sql.DatasetLargeResultCollectingSuite"
      ```

      and with Scala 2.13:

      ```
      dev/change-scala-version.sh 2.13 
      build/sbt clean "sql/testOnly org.apache.spark.sql.DatasetLargeResultCollectingSuite" -Pscala-2.13
      ```
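
      The 2GB figure in the test name comes from the JVM itself: a single `byte[]` cannot exceed roughly `Integer.MAX_VALUE` elements, so a partition whose serialized form is larger than that cannot be held in one contiguous array. A minimal, Spark-independent Java sketch of this hard limit (my illustration, not part of the original report):

      ```java
      // On HotSpot, new byte[Integer.MAX_VALUE] fails with OutOfMemoryError
      // regardless of -Xmx, because the VM's hard array-length limit is
      // slightly below Integer.MAX_VALUE; with a small heap it fails even
      // earlier with "Java heap space", as in the test log below.
      public class ArrayLimitDemo {
          public static boolean canAllocateMax() {
              try {
                  byte[] big = new byte[Integer.MAX_VALUE];
                  return big.length > 0;
              } catch (OutOfMemoryError e) {
                  // Either "Requested array size exceeds VM limit"
                  // or "Java heap space", depending on heap settings.
                  return false;
              }
          }

          public static void main(String[] args) {
              System.out.println("allocated: " + canAllocateMax());
          }
      }
      ```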

      All runs fail with `java.lang.OutOfMemoryError: Java heap space`, as follows:

      ```
      10:19:56.910 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 0.0 (TID 0)
      java.lang.OutOfMemoryError: Java heap space
              at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
              at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
              at org.apache.spark.serializer.SerializerHelper$.$anonfun$serializeToChunkedBuffer$1(SerializerHelper.scala:40)
              at org.apache.spark.serializer.SerializerHelper$.$anonfun$serializeToChunkedBuffer$1$adapted(SerializerHelper.scala:40)
              at org.apache.spark.serializer.SerializerHelper$$$Lambda$2321/1995130077.apply(Unknown Source)
              at org.apache.spark.util.io.ChunkedByteBufferOutputStream.allocateNewChunkIfNeeded(ChunkedByteBufferOutputStream.scala:87)
              at org.apache.spark.util.io.ChunkedByteBufferOutputStream.write(ChunkedByteBufferOutputStream.scala:75)
              at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1853)
              at java.io.ObjectOutputStream.write(ObjectOutputStream.java:709)
              at org.apache.spark.util.Utils$.$anonfun$writeByteBuffer$1(Utils.scala:271)
              at org.apache.spark.util.Utils$.$anonfun$writeByteBuffer$1$adapted(Utils.scala:271)
              at org.apache.spark.util.Utils$$$Lambda$2324/69671223.apply(Unknown Source)
              at org.apache.spark.util.Utils$.writeByteBufferImpl(Utils.scala:249)
              at org.apache.spark.util.Utils$.writeByteBuffer(Utils.scala:271)
              at org.apache.spark.util.io.ChunkedByteBuffer.$anonfun$writeExternal$2(ChunkedByteBuffer.scala:103)
              at org.apache.spark.util.io.ChunkedByteBuffer.$anonfun$writeExternal$2$adapted(ChunkedByteBuffer.scala:103)
              at org.apache.spark.util.io.ChunkedByteBuffer$$Lambda$2323/1073743200.apply(Unknown Source)
              at scala.collection.ArrayOps$.foreach$extension(ArrayOps.scala:1328)
              at org.apache.spark.util.io.ChunkedByteBuffer.writeExternal(ChunkedByteBuffer.scala:103)
              at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
              at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
              at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
              at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
              at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
              at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
              at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
              at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
              at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
              at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
              at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
              at org.apache.spark.serializer.SerializerHelper$.serializeToChunkedBuffer(SerializerHelper.scala:42)
              at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:599)
      ```
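
      The top frames show `ChunkedByteBufferOutputStream.allocateNewChunkIfNeeded` allocating a fresh `ByteBuffer` for each chunk as the serialized result is written out; for a single partition larger than 2GB, those chunks alone exceed the default test heap. A simplified, hypothetical sketch of that chunked-allocation strategy (the class and `CHUNK_SIZE` are my own, not Spark's API; Spark's actual chunks are much larger):

      ```java
      import java.util.ArrayList;
      import java.util.List;

      // Instead of one contiguous byte[] (capped near 2GB by the JVM),
      // output is spread over fixed-size chunks, each allocated lazily
      // on the first write past the current chunk's end.
      public class ChunkedSketch {
          static final int CHUNK_SIZE = 4; // tiny, for illustration only

          final List<byte[]> chunks = new ArrayList<>();
          int posInChunk = CHUNK_SIZE; // forces allocation on first write

          void write(byte b) {
              if (posInChunk == CHUNK_SIZE) { // mirrors allocateNewChunkIfNeeded
                  chunks.add(new byte[CHUNK_SIZE]);
                  posInChunk = 0;
              }
              chunks.get(chunks.size() - 1)[posInChunk++] = b;
          }

          // Writes nBytes bytes and reports how many chunks were allocated.
          public static int chunksFor(int nBytes) {
              ChunkedSketch out = new ChunkedSketch();
              for (int i = 0; i < nBytes; i++) {
                  out.write((byte) i);
              }
              return out.chunks.size();
          }

          public static void main(String[] args) {
              // 10 bytes across 4-byte chunks -> 3 chunks
              System.out.println(chunksFor(10));
          }
      }
      ```

      The total heap needed is still proportional to the serialized size; chunking only removes the single-array cap, which is why the suite OOMs unless the test JVM is given a much larger heap.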

          People

            Assignee: Yang Jie (LuciferYang)
            Reporter: Yang Jie (LuciferYang)
