[TEZ-3240] Improvements to tez.lib.uris to allow for multiple tarballs and mixing tarballs and jars. - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.7.2, 0.9.0, 0.8.4
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

Currently, tez.lib.uris supports either a single archive or a list of paths to jars. You cannot mix the two, and you cannot specify more than one archive, so you cannot specify both the tez and mapreduce archives. If a mapreduce archive is already in the distributed cache, you cannot use it when running tez. Instead, you have to include the mapreduce jars in the single archive that you give to tez.lib.uris, or use the mapreduce jars that are on the cluster node itself. Relying on the node-local jars makes it very easy for the mapreduce versions to get out of sync with each other.

With the current implementation, during a rolling upgrade it is very easy to have jobs that do not get the same mapreduce jars across all of their containers, since some containers will start after the node's jars have been upgraded and some will start before.

If, instead, the job uses an archive that packages both tez and mapreduce together, then you will have 2 copies of the mapreduce jars in the distributed cache, and you will have to update both copies whenever you make a single upgrade to mapreduce.

I propose 2 improvements:
1) Allow tez.lib.uris to take an arbitrary number of archives and jars, while not being limited to only one or the other
2) Allow tez.lib.uris to specify a fragment following the '#' symbol (as is done in mapreduce) that will create a symlink with the name of the fragment.
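
To make the two proposals concrete, here is a minimal Python sketch of how a mixed tez.lib.uris value could be interpreted. It is not the code in the attached patches. The function name describe_lib_uris, the example HDFS paths, and the list of archive suffixes are all hypothetical, and the sketch assumes the value is a comma-separated list. Each entry is classified as an archive or a plain file, and the symlink name is taken from the '#' fragment when one is present, otherwise from the file name.

```python
from urllib.parse import urlsplit
import posixpath

# Assumption for illustration only: suffixes treated as archives.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar", ".zip")


def describe_lib_uris(value):
    """Split a comma-separated URI list into (kind, path, link_name) tuples.

    Hypothetical helper: it only illustrates the proposal and does not
    reproduce how Tez itself localizes resources.
    """
    entries = []
    for raw in value.split(","):
        raw = raw.strip()
        if not raw:
            continue  # ignore empty items such as a trailing comma
        parts = urlsplit(raw)
        base = posixpath.basename(parts.path)
        kind = "archive" if base.endswith(ARCHIVE_SUFFIXES) else "file"
        # Proposal 2: a fragment after '#' names the symlink.
        link_name = parts.fragment or base
        entries.append((kind, parts.path, link_name))
    return entries


# Hypothetical value mixing two archives and one jar (proposal 1).
value = (
    "hdfs:///apps/tez/tez.tar.gz#tez,"
    "hdfs:///apps/mapreduce/mapreduce.tar.gz#mr-framework,"
    "hdfs:///apps/extra/custom-udfs.jar"
)

for kind, path, link_name in describe_lib_uris(value):
    print(kind, path, "->", link_name)
```

With a value like this, the tez and mapreduce archives stay separate in the distributed cache, so an existing mapreduce archive can be reused rather than repackaged into the tez archive.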

Attachments

TEZ-3240.001.patch | 04/May/16 16:03 | 7 kB | Eric Badger
TEZ-3240.002.patch | 09/May/16 16:17 | 18 kB | Eric Badger
TEZ-3240.003.patch | 09/May/16 18:02 | 19 kB | Eric Badger
TEZ-3240.004.patch | 09/May/16 21:03 | 23 kB | Eric Badger
TEZ-3240.005.patch | 10/May/16 13:40 | 23 kB | Eric Badger
TEZ-3240.006.patch | 12/May/16 19:57 | 23 kB | Eric Badger
TEZ-3240.007.patch | 13/May/16 14:35 | 23 kB | Eric Badger
TEZ-3240.008.patch | 16/May/16 14:17 | 24 kB | Eric Badger
TEZ-3240.009.patch | 16/May/16 14:50 | 23 kB | Eric Badger
TEZ-3240.009.modified.patch | 17/May/16 22:00 | 28 kB | Hitesh Shah
TEZ-3240.009.modified.2.patch | 18/May/16 01:26 | 28 kB | Hitesh Shah
TEZ-3240.010.patch | 18/May/16 18:22 | 30 kB | Hitesh Shah
TEZ-3240-b0.8.010.patch | 19/May/16 22:03 | 31 kB | Eric Badger
TEZ-3240-b0.7.010.patch | 19/May/16 22:03 | 30 kB | Eric Badger

Issue Links

breaks: TEZ-3559 TEZ_LIB_URIS doesn't work with schemes different than the defaultFS (Closed)

People

Assignee: Eric Badger
Reporter: Eric Badger
Votes: 0
Watchers: 6

Dates

Created: 04/May/16 15:04
Updated: 15/Dec/16 16:01
Resolved: 18/May/16 22:26