Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version/s: 2.0.0
Hadoop Flags: Reviewed
Description
For each fetch request to HiveServer2, we pay the penalty of deserializing the row objects and translating them into a different representation suitable for the RPC transfer. In moderate to high concurrency scenarios, this can result in significant CPU and memory waste. By having each task write the appropriate Thrift objects to the output files, HiveServer2 can simply stream a batch of rows on the wire without incurring any of the additional cost of deserialization and translation.
This can be implemented by writing a new SerDe, which the FileSinkOperator can use to write Thrift-formatted row batches to the output file. Since hive.query.result.fileformat is pluggable, we can set it to SequenceFile and write each batch of Thrift-formatted rows as a value blob. The FetchTask can then simply read the blob and send it over the wire. On the client side, the JDBC/ODBC driver can read the blob and, since it is already in the format it expects, continue building the ResultSet the way it does in the current implementation.
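A minimal sketch of the serialization side of this idea is below. It is illustrative only, not the committed patch: the class name ThriftRowBatchWriter, the single string column, and the batch-size knob are assumptions made for brevity, and the Thrift package names follow Hive 2.x (org.apache.hive.service.rpc.thrift; earlier releases generate them under org.apache.hive.service.cli.thrift). The real SerDe would implement the Hive SerDe interface so the FileSinkOperator can drive it, but the core step is the same: buffer rows, build a column-oriented TRowSet, serialize it once with Thrift, and hand the bytes to the SequenceFile writer as a single value.

{code:java}
// Hypothetical sketch, not the actual Hive SerDe: batch rows into a Thrift
// TRowSet, serialize it once, and emit the bytes as a SequenceFile value blob.
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.io.BytesWritable;
import org.apache.thrift.TException;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TCompactProtocol;

import org.apache.hive.service.rpc.thrift.TColumn;
import org.apache.hive.service.rpc.thrift.TRow;
import org.apache.hive.service.rpc.thrift.TRowSet;
import org.apache.hive.service.rpc.thrift.TStringColumn;

public class ThriftRowBatchWriter {   // hypothetical class name

  private final List<String> pendingRows = new ArrayList<>();
  private final int batchSize;        // rows packed per blob; an assumed, configurable knob
  private final TSerializer serializer;

  public ThriftRowBatchWriter(int batchSize) throws TException {
    this.batchSize = batchSize;
    // The client driver must deserialize with the same protocol used here.
    this.serializer = new TSerializer(new TCompactProtocol.Factory());
  }

  /** Buffer one row (a single string column in this sketch); return a serialized batch when full, else null. */
  public BytesWritable addRow(String value) throws TException {
    pendingRows.add(value);
    return pendingRows.size() >= batchSize ? flush() : null;
  }

  /** Serialize the buffered rows as one TRowSet blob; this is what FetchTask could stream as-is. */
  public BytesWritable flush() throws TException {
    if (pendingRows.isEmpty()) {
      return null;
    }
    // Column-oriented TRowSet, the same shape the JDBC/ODBC drivers already consume.
    TStringColumn col = new TStringColumn(new ArrayList<>(pendingRows),
        ByteBuffer.wrap(new byte[(pendingRows.size() + 7) / 8]));   // all-zero bitmask: no NULLs
    TRowSet rowSet = new TRowSet(0L, Collections.<TRow>emptyList());
    rowSet.setColumns(Collections.singletonList(TColumn.stringVal(col)));
    pendingRows.clear();
    return new BytesWritable(serializer.serialize(rowSet));         // value blob for the SequenceFile
  }
}
{code}

With hive.query.result.fileformat set to SequenceFile, each flushed blob would land in the result file as one value, and FetchTask could forward it to the client without re-encoding.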
Attachments
Issue Links
- blocks
  - HIVE-12428 HiveServer2: Provide an option for HiveServer2 to stream serialized thrift results when they are available (Resolved)
- is related to
  - HIVE-10438 HiveServer2: Enable Type specific ResultSet compression (Open)
  - HIVE-14876 make the number of rows to fetch from various HS2 clients/servers configurable (Resolved)
- links to