  1. Apache Hudi
  2. HUDI-1690

Fix StackOverflowError while running clustering with large number of partitions

Details

    Description

      We are testing clustering on a Hudi table with about 3000 partitions. The Spark driver throws a StackOverflowError before all the partitions have been sorted:

      21/03/11 19:51:20 ERROR [main] UtilHelpers: Cluster failed
      java.lang.StackOverflowError
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at java.io.ObjectStreamClass.invokeWriteReplace(ObjectStreamClass.java:1118)
      at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1136)
      at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
      at org.apache.spark.RangePartitioner.$anonfun$writeObject$1(Partitioner.scala:261)
      at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
      at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1343)
      at org.apache.spark.RangePartitioner.writeObject(Partitioner.scala:254)
      at sun.reflect.GeneratedMethodAccessor201.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
      at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
      at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
      at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
      at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
      at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
      at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
      at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
      at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
      at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
      at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
      at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
      at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
      at scala.collection.immutable.List$SerializationProxy.writeObject(List.scala:477)
      at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
      at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
      at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
      at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
      at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

      ...

       

      A similar issue is described here:

      https://stackoverflow.com/questions/30522564/spark-when-union-a-lot-of-rdd-throws-stack-overflow-error

      Setting the driver's stack size to 100M does not fix the error. The probable cause is that rdd.union is called too many times, so the resulting RDD lineage is nested too deeply to be serialized recursively, as the repeating ObjectOutputStream frames in the trace suggest. I think we should use JavaSparkContext.union instead of RDD.union here: https://github.com/apache/hudi/blob/e93c6a569310ce55c5a0fc0655328e7fd32a9da2/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/cluster/SparkExecuteClusteringCommitActionExecutor.java#L96
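
      To illustrate why the shape of the union matters, here is a minimal sketch that uses only the JDK and no Spark classes. Leaf, PairUnion, FlatUnion and UnionLineageSketch are hypothetical stand-ins invented for this example: PairUnion models the nesting produced by calling union pairwise in a loop, and FlatUnion models a single n-ary union over all inputs. It assumes a default-sized JVM thread stack and is not the actual Hudi or Spark code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

class UnionLineageSketch {
    /** Hypothetical stand-in for an RDD lineage node; NOT a Spark class. */
    interface Node extends Serializable {}

    /** One input dataset, e.g. the output of one clustering group. */
    static final class Leaf implements Node {
        final int id;
        Leaf(int id) { this.id = id; }
    }

    /** Models a.union(b): a binary node that keeps a reference to both parents. */
    static final class PairUnion implements Node {
        final Node left;
        final Node right;
        PairUnion(Node left, Node right) { this.left = left; this.right = right; }
    }

    /** Models one n-ary union: a single node holding all parents in a list. */
    static final class FlatUnion implements Node {
        final ArrayList<Node> parents;
        FlatUnion(ArrayList<Node> parents) { this.parents = parents; }
    }

    /** Unions n leaves pairwise in a loop; nesting depth grows linearly with n. */
    static Node chained(int n) {
        Node acc = new Leaf(0);
        for (int i = 1; i < n; i++) {
            acc = new PairUnion(acc, new Leaf(i));
        }
        return acc;
    }

    /** Unions n leaves in one step; nesting depth stays constant. */
    static Node flat(int n) {
        ArrayList<Node> leaves = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            leaves.add(new Leaf(i));
        }
        return new FlatUnion(leaves);
    }

    /** Returns true if Java serialization completes, false if it overflows the stack. */
    static boolean serializes(Node root) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(root);
            return true;
        } catch (StackOverflowError e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int n = 100_000; // exaggerated on purpose so the effect does not depend on tuning
        System.out.println("flat union serializes:    " + serializes(flat(n)));
        System.out.println("chained union serializes: " + serializes(chained(n)));
    }
}
```

      The chained shape is serialized by recursing once per nesting level, so the required stack depth grows with the number of unions, whereas the flat shape walks its list of parents iteratively. This is the same reason raising the stack size only postpones the failure rather than removing it.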

            People

              satishkotha satish
              rongma Rong Ma