[PIG-4970] Remove the deserialize and serialization of JobConf in code for spark mode - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: spark-branch
Component/s: spark
Labels: None

Description

Currently we use KryoSerializer to serialize the jobConf in SparkLauncher and then deserialize it in ForEachConverter and StreamConverter. We serialize and deserialize the jobConf in order to make it available in the Spark executor thread.

We can refactor it in the following ways:
1. Let Spark broadcast the jobConf in sparkContext.newAPIHadoopRDD. Here, do not create a new jobConf and load properties from PigContext; instead, directly use the jobConf from SparkLauncher.
2. Get the jobConf in org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark#createRecordReader.
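
The difference between the two approaches can be sketched roughly as follows. This is only an illustration and not the code from the patch: java.util.Properties stands in for JobConf, plain Java serialization of the properties stands in for KryoSerializer, and the class JobConfSharingSketch, the EXECUTOR_CONF thread-local and the method names are hypothetical. It also assumes that the conf handed to createRecordReader is the one Spark already shipped to the executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

public class JobConfSharingSketch {

    // Current approach (modeled): the launcher turns the conf into bytes.
    // Properties.store is a stand-in for the Kryo serialization step.
    static byte[] serialize(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Current approach (modeled): every converter rebuilds the conf from bytes.
    static Properties deserialize(byte[] bytes) {
        try {
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(bytes));
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Proposed approach (modeled): a hypothetical per-thread holder for the conf.
    static final ThreadLocal<Properties> EXECUTOR_CONF = new ThreadLocal<>();

    // Stand-in for the createRecordReader step: the framework passes in the
    // conf it already distributed, and we publish it for the current thread.
    static void createRecordReader(Properties confFromFramework) {
        EXECUTOR_CONF.set(confFromFramework);
    }

    // Stand-in for a converter reading a setting without any deserialization.
    static String readInConverter(String key) {
        Properties conf = EXECUTOR_CONF.get();
        if (conf == null) {
            throw new IllegalStateException("conf was not set for this thread");
        }
        return conf.getProperty(key);
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();
        jobConf.setProperty("example.key", "example-value");

        // Current: explicit round trip through bytes in our own code.
        Properties copy = deserialize(serialize(jobConf));
        System.out.println(copy.getProperty("example.key"));

        // Proposed: reuse the conf the framework delivered to the reader.
        createRecordReader(jobConf);
        System.out.println(readInConverter("example.key"));
    }
}
```

In the sketch, the second path removes the hand-written serialize and deserialize calls from the converters, which is the point of the refactoring; how the conf is actually held on the executor side is an assumption here.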

Attachments

PIG-4970.patch, 18/Aug/16 08:06, 33 kB, liyunzhang
PIG-4970_4.patch, 24/Aug/16 03:41, 41 kB, liyunzhang
PIG-4970_3.patch, 24/Aug/16 03:22, 40 kB, liyunzhang
PIG-4970_2.patch, 23/Aug/16 09:00, 37 kB, liyunzhang

Issue Links

links to: review board

People

Assignee: liyunzhang

Reporter: liyunzhang

Votes: 0

Watchers: 3

Dates

Created: 12/Aug/16 05:37

Updated: 21/Jun/17 09:18

Resolved: 24/Aug/16 16:04