Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
Description
Now we use KryoSerializer to serialize the jobConf in SparkLauncher. then
deserialize it in ForEachConverter, StreamConverter. We deserialize and serialize the jobConf in order to make jobConf available in spark executor thread.
We can refactor it in following ways:
1. Let spark to broadcast the jobConf in sparkContext.newAPIHadoopRDD. Here not create a new jobConf and load properties from PigContext but directly use jobConf from SparkLauncher.
2. get jobConf in org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark#createRecordReader
Attachments
Attachments
Issue Links
- links to