Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Invalid
- Affects Version/s: 2.4.0
- Fix Version/s: None
- Component/s: None
Description
The Spark program reports an error when writing a DataFrame to a MySQL table; the error from the log is shown below.
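For reference, a minimal sketch of the kind of write involved, assuming the job uses Spark's JDBC data source (the URL, table name, and credentials here are placeholders, not the actual job code):
------------------------------------------------------------------------------------------------------------------
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

public class MysqlWriteSketch {
    // Hypothetical write path; the URL, table, and credentials are placeholders.
    static void writeToMysql(Dataset<Row> df) {
        df.write()
          .format("jdbc")
          .option("url", "jdbc:mysql://mysql-host:3306/mydb")
          .option("dbtable", "target_table")
          .option("user", "user")
          .option("password", "password")
          .mode(SaveMode.Append)
          .save();
    }
}
------------------------------------------------------------------------------------------------------------------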
------------------------------------------------------------------------------------------------------------------
Caused by: java.io.UTFDataFormatException: encoded string too long: 87824 bytes
at java.io.DataOutputStream.writeUTF(DataOutputStream.java:364)
at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:314)
at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:328)
at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:328)
at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:328)
at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
at com.typesafe.config.impl.SerializedConfigValue.writeExternal(SerializedConfigValue.java:454)
at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:400)
--------------------------------------------------------------------------------------------------------------------
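For context on the exception itself: java.io.DataOutputStream.writeUTF prefixes the string with a 2-byte length, so it cannot encode any string whose modified-UTF-8 form exceeds 65535 bytes, and the trace shows Spark's ClosureCleaner hitting this limit while serializing a com.typesafe.config value. A minimal sketch that reproduces the same exception message, independent of Spark:
------------------------------------------------------------------------------------------------------------------
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteUtfLimitDemo {
    public static void main(String[] args) throws IOException {
        // Build an 87824-character ASCII string; each character encodes to
        // one byte in modified UTF-8, so the encoded length is 87824 bytes.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 87824; i++) {
            sb.append('a');
        }
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        // writeUTF stores the encoded length in a 2-byte prefix (max 65535),
        // so this throws:
        // java.io.UTFDataFormatException: encoded string too long: 87824 bytes
        out.writeUTF(sb.toString());
    }
}
------------------------------------------------------------------------------------------------------------------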
In testing, I found that if "field A" is removed from the DataFrame, the write to MySQL succeeds; if "field A" is included, the same error is reported.
The value of field A is only approximately 1000 bytes long.
If the program's run mode in app-env.sh is changed to local or client mode, the write to MySQL succeeds; in yarn cluster mode, the error in the attached log is reported.
Is this error related to the cluster version and configuration?