[HIVE-860] Persistent distributed cache - ASF JIRA

Details

Type: Improvement
Status: Patch Available
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.12.0
Fix Version/s: None
Component/s: None
Labels: None

Description

DistributedCache is shared across multiple jobs if the HDFS file name is the same.

We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same.

We can aim for one of two results:
A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in the distributed cache.
A2. Files added with the same name, timestamp, and md5 will have a single copy in the distributed cache, regardless of session.

A2 gives more sharing but raises the question of when Hive should clean the files up in HDFS.
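
As a rough illustration of the idea (not the approach taken in the attached patches), the sketch below derives a deterministic cache location from the name, timestamp, and md5, and skips the upload when a copy is already there. Everything in it is an assumption made for illustration: the class name `CachePathSketch`, the `shared` and `session-<id>` directory layout, and the use of a local `java.nio.file` directory as a stand-in for HDFS.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: a local directory stands in for the HDFS cache root.
public class CachePathSketch {

    /** Hex-encoded MD5 of the file content, zero-padded to 32 characters. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return String.format("%032x", new BigInteger(1, md.digest(content)));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds the cache location from (name, timestamp, md5).
     * sessionId != null models A1 (one copy per session);
     * sessionId == null models A2 (one copy shared by all sessions).
     */
    static Path cachePath(Path cacheRoot, String sessionId,
                          String name, long timestamp, String md5) {
        Path base = (sessionId == null)
                ? cacheRoot.resolve("shared")
                : cacheRoot.resolve("session-" + sessionId);
        return base.resolve(md5 + "-" + timestamp).resolve(name);
    }

    /** Writes the file only if nothing exists at the target; never overwrites. */
    static boolean putIfAbsent(Path target, byte[] content) throws IOException {
        Files.createDirectories(target.getParent());
        try {
            // CREATE_NEW fails if the file exists, so an existing copy is kept.
            Files.write(target, content, StandardOpenOption.CREATE_NEW);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hive-cache-sketch");
        byte[] jar = "udf bytes".getBytes(StandardCharsets.UTF_8);
        String md5 = md5Hex(jar);
        long timestamp = 1254296760000L; // illustrative modification time

        Path shared = cachePath(root, null, "udf.jar", timestamp, md5);
        System.out.println(putIfAbsent(shared, jar)); // true: first copy written
        System.out.println(putIfAbsent(shared, jar)); // false: existing copy reused
    }
}
```

Under A1 the session directory gives a natural cleanup point (delete it when the session ends). Under A2 the shared directory has no single owner, which is the cleanup question raised above.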

Attachments

HIVE-860.1.patch | 25/Sep/14 03:15 | 20 kB | Ferdinand Xu
HIVE-860.2.patch | 24/Nov/14 04:22 | 20 kB | Ferdinand Xu
HIVE-860.2.patch | 21/Nov/14 16:01 | 20 kB | Ferdinand Xu
HIVE-860.3.patch | 28/Nov/14 06:36 | 20 kB | Ferdinand Xu
HIVE-860.4.patch | 22/Dec/14 20:24 | 21 kB | Brock Noland
HIVE-860.4.patch | 12/Dec/14 06:38 | 21 kB | Dong Chen
HIVE-860.4.patch | 11/Dec/14 01:21 | 21 kB | Ferdinand Xu
HIVE-860.4.patch | 08/Dec/14 08:27 | 21 kB | Dong Chen
HIVE-860.4.patch | 02/Dec/14 00:35 | 20 kB | Ferdinand Xu
HIVE-860.4.patch | 01/Dec/14 11:14 | 20 kB | Ferdinand Xu
HIVE-860.5.patch | 23/Dec/14 17:12 | 21 kB | Brock Noland
HIVE-860.patch | 02/Jul/14 02:34 | 22 kB | Brock Noland
HIVE-860.patch | 01/Jul/14 05:50 | 22 kB | Brock Noland
HIVE-860.patch | 01/Jul/14 02:13 | 21 kB | Brock Noland
HIVE-860.patch | 20/Feb/14 22:06 | 21 kB | Brock Noland
HIVE-860.patch | 19/Feb/14 20:34 | 21 kB | Brock Noland
HIVE-860.patch | 18/Feb/14 21:54 | 21 kB | Brock Noland
HIVE-860.patch | 18/Feb/14 19:07 | 19 kB | Brock Noland
HIVE-860.patch | 18/Feb/14 03:36 | 19 kB | Brock Noland
HIVE-860.patch | 18/Feb/14 03:17 | 18 kB | Brock Noland
HIVE-860.patch | 18/Feb/14 00:45 | 18 kB | Brock Noland
HIVE-860.patch | 18/Feb/14 00:26 | 17 kB | Brock Noland
HIVE-860-debug.4.patch | 10/Dec/14 06:52 | 21 kB | Dong Chen

Issue Links

is related to: HIVE-5725 Separate out ql code from exec jar (Resolved)
relates to: PIG-2672 Optimize the use of DistributedCache (Closed)

People

Assignee: Dong Chen
Reporter: Zheng Shao
Votes: 2
Watchers: 17

Dates

Created: 30/Sep/09 07:46
Updated: 09/May/15 00:11