Uploaded image for project: 'Apache Hudi'
  1. Apache Hudi
  2. HUDI-2230

"Task not serializable" exception due to non-serializable Codahale Timers

    XMLWordPrintableJSON

Details

    Description

      Steps to reproduce:

      • Enable graphite metrics via props file. Example:
      hoodie.metrics.on=true
      hoodie.metrics.reporter.type=GRAPHITE
      hoodie.metrics.graphite.host=<host>
      hoodie.metrics.graphite.port=<port>
      hoodie.metrics.graphite.metric.prefix=<metrics_prefix>
      
      • Run the Deltastreamer
      • Note the following exception:
      Exception in thread "main" org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: Task not serializable
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:165)
      	at org.apache.hudi.common.util.Option.ifPresent(Option.java:96)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:160)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:501)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
      	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
      	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
      	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
      	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
      	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1038)
      	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1047)
      	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Task not serializable
      	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
      	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
      	at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:90)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:163)
      	... 15 more
      Caused by: org.apache.hudi.exception.HoodieException: Task not serializable
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:649)
      	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.spark.SparkException: Task not serializable
      	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:416)
      	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:406)
      	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
      	at org.apache.spark.SparkContext.clean(SparkContext.scala:2502)
      	at org.apache.spark.rdd.RDD.$anonfun$map$1(RDD.scala:422)
      	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
      	at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
      	at org.apache.spark.rdd.RDD.map(RDD.scala:421)
      	at org.apache.spark.api.java.JavaRDDLike.map(JavaRDDLike.scala:93)
      	at org.apache.spark.api.java.JavaRDDLike.map$(JavaRDDLike.scala:92)
      	at org.apache.spark.api.java.AbstractJavaRDDLike.map(JavaRDDLike.scala:45)
      	...
      	at org.apache.hudi.utilities.sources.RowSource.fetchNewData(RowSource.java:43)
      	at org.apache.hudi.utilities.sources.Source.fetchNext(Source.java:76)
      	at org.apache.hudi.utilities.deltastreamer.SourceFormatAdapter.fetchNewDataInAvroFormat(SourceFormatAdapter.java:69)
      	at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:399)
      	at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:271)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:625)
      	... 4 more
      Caused by: java.io.NotSerializableException: org.apache.hudi.com.codahale.metrics.Timer
      Serialization stack:
      	- object not serializable (class: org.apache.hudi.com.codahale.metrics.Timer, value: org.apache.hudi.com.codahale.metrics.Timer@e7bcb5c)
      	- field (class: org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics, name: overallTimer, type: class org.apache.hudi.com.codahale.metrics.Timer)
      	- object (class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics, org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics@6704c405)
      	...
      	- element of array (index: 0)
      	- array (class [Ljava.lang.Object;, size 1)
      	- field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, type: class [Ljava.lang.Object;)
      	- object (class java.lang.invoke.SerializedLambda, SerializedLambda[capturingClass=..., functionalInterfaceMethod=org/apache/spark/api/java/function/Function.call:
      

      The important line being:

      - field (class: org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics, name: overallTimer, type: class org.apache.hudi.com.codahale.metrics.Timer)
      

      FIX:

      Make the `Timer` member variables in the `HoodieDeltaStreamerMetrics` class transient so that they are not serialized over the wire.
       

      Attachments

        Issue Links

          Activity

            People

              dave_hagman Dave Hagman
              dave_hagman Dave Hagman
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: