[HIVE-23175] Skip serializing hadoop and tez config on HS side - ASF JIRA

Details

Type: Improvement
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Tez
Labels:
- pull-request-available

Description

HiveServer2 (HS2) spends a lot of time serializing configuration objects. We can skip putting the Hadoop and Tez config XML files in the payload, assuming that the configs are the same on both the HS2 side and the task side. This depends on Tez loading local XML configs when it creates config objects: https://issues.apache.org/jira/browse/TEZ-4137
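A minimal sketch of the idea, not the code from the patch: it models a config as a map whose entries remember which resource set them, drops entries that came from files assumed to be identical on every host, and lets the task side overlay the payload on its locally loaded config. The class name `PayloadFilterSketch`, the `Entry` type, and the specific list of skipped file names are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; this is not Hive or Tez code.
final class PayloadFilterSketch {

    // A config value together with the resource that set it.
    static final class Entry {
        final String value;
        final String source;

        Entry(String value, String source) {
            this.value = value;
            this.source = source;
        }
    }

    // Assumption: these files are identical on the HS2 host and every task host.
    static final Set<String> SKIPPED_SOURCES = new HashSet<>(
            Arrays.asList("core-site.xml", "hdfs-site.xml", "yarn-site.xml", "tez-site.xml"));

    // HS2 side: keep only entries that did not come from a skipped file.
    static Map<String, String> buildPayload(Map<String, Entry> conf) {
        Map<String, String> payload = new LinkedHashMap<>();
        for (Map.Entry<String, Entry> e : conf.entrySet()) {
            if (!SKIPPED_SOURCES.contains(e.getValue().source)) {
                payload.put(e.getKey(), e.getValue().value);
            }
        }
        return payload;
    }

    // Task side: start from the locally loaded config, then overlay the payload.
    static Map<String, String> mergeOnTask(Map<String, String> localConf,
                                           Map<String, String> payload) {
        Map<String, String> merged = new LinkedHashMap<>(localConf);
        merged.putAll(payload);
        return merged;
    }
}
```

The overlay order matters in this sketch: payload entries win over local ones, so anything HS2 set explicitly still reaches the task even when the same key also exists in a local file.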

Ideally we should be able to skip hive-site.xml too. However, if we skip hive-site.xml at that stage, we make wrong choices at the Tez DAG build stage because the configs it needs are missing.

In the ideal version of this, the DAG and vertex build phases would not both read configs from and write new configs to the same config object. Instead, we would look up values from HS2's HiveConf object and write to a brand-new JobConf for each vertex. That way no vertex would carry unnecessary items in its JobConf. However, the DAG and vertex build stages (TezTask#build), and many other components called from there, treat a single config object as both the source of HS2-side config and the target JobConf into which they put vertex-level options. It is very hard to separate these concerns now.
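The separation described above can be sketched with plain maps standing in for HiveConf and JobConf. This is only an illustration of the read-from-one, write-to-another pattern and not how TezTask#build works today; the class name `VertexConfSketch` and the `sketch.vertex.*` keys are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: plain maps stand in for HiveConf and JobConf.
final class VertexConfSketch {

    // Reads only from sessionConf and writes only to a fresh per-vertex map.
    static Map<String, String> buildVertexConf(Map<String, String> sessionConf,
                                               String vertexName) {
        // A read-only view makes accidental writes to the source fail fast.
        Map<String, String> source = Collections.unmodifiableMap(sessionConf);
        Map<String, String> vertexConf = new HashMap<>();

        // Lookups go to the source; the default here is an assumption of the sketch.
        String engine = source.getOrDefault("hive.execution.engine", "tez");

        // Only what the vertex needs is written to the target (hypothetical keys).
        vertexConf.put("sketch.vertex.name", vertexName);
        vertexConf.put("sketch.vertex.engine", engine);
        return vertexConf;
    }
}
```

With this shape, the per-vertex config contains only the entries the build step chose to write, and the session-level config is never mutated by the build.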

With this patch, we reduce the size of the JobConf (per vertex) by ~65%. This should improve transmit latency. However, the most significant gains are in the CPU time spent compressing job configs, since the config objects are now much smaller.
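To make the size argument concrete, the following sketch serializes a config map, compresses it, and computes a percentage reduction between two sizes. It uses `java.util.zip.DeflaterOutputStream` and a `key=value` line format purely as stand-ins; the actual serialization and compression used for the payload are not shown in this issue, and the class name `ConfSizeSketch` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper for measuring config size before and after filtering.
final class ConfSizeSketch {

    // Serializes the config as sorted key=value lines (stand-in format).
    static byte[] serialize(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(conf).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compresses the bytes with Deflate (stand-in for the real payload codec).
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Percentage by which 'after' is smaller than 'before'.
    static double reductionPercent(int before, int after) {
        return 100.0 * (before - after) / before;
    }
}
```

Calling `reductionPercent(serialize(full).length, serialize(filtered).length)` on a full and a filtered config would give a number comparable to the ~65% figure above, under the assumptions of this sketch.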

Attachments

- HIVE-23175.1.patch, 10/Apr/20 04:29, 3 kB, Mustafa İman
- HIVE-23175.2.patch, 30/Apr/20 06:56, 4 kB, Mustafa İman
- HIVE-23175.3.patch, 09/Jul/20 17:25, 4 kB, Mustafa İman

Issue Links

- depends upon: TEZ-4137 Input/Output/Processor should merge payload to local conf (Resolved)
- links to: GitHub Pull Request #1668

People

Assignee: Mustafa İman

Reporter: Mustafa İman

Votes: 0

Watchers: 3

Dates

Created: 10/Apr/20 04:24

Updated: 20/Jan/21 01:34

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 50m