PIG-3288: Kill jobs if the number of output files is over a configurable limit

    Details

    • Type: Wish
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      I ran into a situation where a Pig job tried to create too many files on HDFS and overloaded the NameNode. To prevent such events, it would be nice if we could set an upper limit on the number of files that a Pig job can create.

      In fact, Hive has a property called "hive.exec.max.created.files". The idea is that each mapper/reducer increments a counter every time it creates a file. MRLauncher then periodically checks whether the number of files created so far has exceeded the upper limit; if so, it kills the running jobs and exits.
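      A minimal client-side sketch of that pattern, using only the plain Hadoop Job API (the counter group and name here are illustrative placeholders, not the ones Hive or the patch actually use):

```java
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;

// Illustrative client-side monitor: periodically poll an aggregated job
// counter and kill the job once it crosses a configured limit.
public final class CreatedFilesMonitor {
    public static void monitorAndKill(Job job, long maxFiles) throws Exception {
        while (!job.isComplete()) {
            // Group/name are hypothetical placeholders.
            Counter created = job.getCounters().findCounter("pig", "CREATED_FILES");
            if (created != null && created.getValue() > maxFiles) {
                job.killJob(); // stop before the NameNode is flooded with files
                throw new RuntimeException(
                        "Killed job: created more than " + maxFiles + " files");
            }
            Thread.sleep(10_000); // poll every 10 seconds
        }
    }
}
```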

      Attachments

      1. PIG-3288-5.patch
        13 kB
        Cheolsoo Park
      2. PIG-3288-4.patch
        11 kB
        Cheolsoo Park
      3. PIG-3288-3.patch
        11 kB
        Cheolsoo Park
      4. PIG-3288-2.patch
        11 kB
        Cheolsoo Park
      5. PIG-3288.patch
        9 kB
        Cheolsoo Park

        Activity

        Cheolsoo Park added a comment -

        Attached is a patch that implements the following:

        • I added a new property called pig.exec.hdfs.files.max.limit.
        • When this property is enabled, MRLauncher periodically monitors a counter (CREATED_FILES_COUNTER).
        • Since the number of files created by a mapper/reducer is RecordWriter-specific, each storage is responsible for incrementing the counter properly. As a reference example, I added code that increments the counter in PigLineRecordWriter for PigStorage (a rough sketch of the idea follows below).
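        For illustration only, the per-writer increment could look roughly like this, assuming a Hadoop 2-style TaskAttemptContext (which exposes getCounter(group, name)); the counter group/name are placeholders, not the patch's exact identifiers:

```java
import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Sketch: count one output file per RecordWriter instance, since MapReduce
// creates one writer per file a task writes.
public class CountingRecordWriter extends RecordWriter<NullWritable, Text> {
    private final DataOutputStream out;

    public CountingRecordWriter(DataOutputStream out, TaskAttemptContext ctx) {
        this.out = out;
        // Placeholder group/name; the patch's counter is CREATED_FILES_COUNTER.
        ctx.getCounter("pig", "CREATED_FILES").increment(1);
    }

    @Override
    public void write(NullWritable key, Text value) throws IOException {
        out.write(value.getBytes(), 0, value.getLength());
        out.write('\n');
    }

    @Override
    public void close(TaskAttemptContext ctx) throws IOException {
        out.close();
    }
}
```
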
        Cheolsoo Park added a comment -

        Added an e2e test case and updated the docs.

        Regarding the e2e test case, what it does is as follows (a rough sketch appears after the list):

        • Set pig.exec.created.files.max.limit to 1.
        • Make Pig create output files.
        • Verify that the Pig job is killed.
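        A rough Java sketch of the scenario via PigServer (the real test lives in Pig's e2e harness; the paths and the failure check here are illustrative):

```java
import java.util.Properties;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

// Sketch of the e2e scenario: cap created files at 1, store output, and
// expect the job to be killed. Paths are illustrative.
public class MaxCreatedFilesExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("pig.exec.created.files.max.limit", "1");

        PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
        pig.registerQuery("a = LOAD '/tmp/input' USING PigStorage();");
        try {
            pig.store("a", "/tmp/output"); // should trip the limit
            System.err.println("Expected the job to be killed");
        } catch (Exception expected) {
            System.out.println("Job killed as expected: " + expected.getMessage());
        }
    }
}
```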

        Ready for review. Thanks!

        Daniel Dai added a comment -

        How about putting the counter logic in the StoreFunc? It seems your patch only deals with the PigTextOutputFormat-based StoreFunc. If we can push this into StoreFunc, other StoreFuncs can implement the same thing.

        Cheolsoo Park added a comment -

        That's actually a good idea. I will move the default implementation to StoreFunc and let other storers override it. Thank you very much for the suggestion!

        Cheolsoo Park added a comment -

        I updated my patch as follows:

        • I removed the counter logic from PigTextOutputFormat, but I didn't push it into StoreFunc. The reason is that this logic is storage-specific, so I wanted to leave the implementation to storages. Even though we could provide a default implementation in the StoreFunc class, it wouldn't be useful unless other storages subclassed it.
        • However, I still needed a storage that increments the counter for the test, so I wrote one by wrapping PigStorage with StoreFuncWrapper (a rough sketch follows below). I updated my e2e test to use this storage.
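        A minimal sketch of such a wrapper, assuming PigStatusReporter.getInstance().getCounter(group, name) is available to the task; the class and counter names are placeholders rather than the patch's exact ones:

```java
import java.io.IOException;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.pig.StoreFuncWrapper;
import org.apache.pig.builtin.PigStorage;
import org.apache.pig.tools.pigstats.PigStatusReporter;

// Sketch of a counting storage in the spirit of PigStorageWithFileCount:
// delegate everything to PigStorage and bump a counter once per
// RecordWriter, i.e. roughly once per output file.
public class CountingPigStorage extends StoreFuncWrapper {

    public CountingPigStorage() {
        setStoreFunc(new PigStorage());
    }

    @Override
    public void prepareToWrite(RecordWriter writer) throws IOException {
        // Placeholder group/name, not the patch's exact identifiers.
        Counter files = PigStatusReporter.getInstance().getCounter("pig", "CREATED_FILES");
        if (files != null) {
            files.increment(1);
        }
        super.prepareToWrite(writer);
    }
}
```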

        Tests done:

        • Ran the new e2e test case TooManyFilesCreatedErrors_1 on a cluster.
        • Ran test-commit.

        Thanks!

        Cheolsoo Park added a comment -

        ReviewBoard: https://reviews.apache.org/r/11719/
        Aniket Mokashi added a comment -

        The implementation is generic enough that the counter need not count the number of files; it could really count arbitrary metrics and kill the job once they are exceeded. Should we rename the counter from "pig.exec.created.files.max.limit" to something else?
        Also, in the StoreFunc, you are relying on the fact that for each new file, the StoreFunc is reinitialized as a new object. Is that guaranteed behavior?

        Cheolsoo Park added a comment -

        Aniket Mokashi, thank you very much for your feedback!

        1. I like your suggestion regarding the name of the property/counter. I'll probably change it to "pig.exec.termination.counter.limit". Let me know if you have a better suggestion.
        2. The storefunc (PigStorageWithFileCount) that I wrote is just for the e2e test, and with it, a new storefunc is indeed initialized for each new file. Again, how to increment the counter depends entirely on the storage implementation. For example, if you're using CombinedOutputFormat, it's your responsibility to increment the counter properly in your storage. I have documented this clearly.
        Cheolsoo Park added a comment -

        Changed the property/counter name in a new patch.

        Aniket Mokashi added a comment -

        Cheolsoo Park, how about taking an approach similar to MonitoredUDF? That way, instead of a common property for all sorts of errors, you could configure your own property inside your EvalFunc/LoadFunc with annotations, and Pig would kill the job if the UDF misbehaves (with respect to the contract of the UDF rather than the contract of the Pig installation, i.e., pig.properties).
        I have another use case that could utilize this framework (if we build one): assertions. I could annotate the assert UDF and kill the job immediately if an assertion fails.
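        As a rough illustration of that annotation-driven idea (entirely hypothetical; neither this annotation nor its enforcement exists in Pig):

```java
import java.io.IOException;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.tools.pigstats.PigStatusReporter;

// Hypothetical annotation in the spirit of MonitoredUDF: the UDF declares
// its own termination condition instead of one global Pig property.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface KillJobOnCounter {
    String group();
    String counter();
    long limit();
}

// Hypothetical usage: a launcher-side monitor (not shown) would read the
// annotation and kill the job once BAD_ROWS exceeds the declared limit.
@KillJobOnCounter(group = "myudf", counter = "BAD_ROWS", limit = 1000)
class StrictParse extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        Object v = (input == null || input.size() == 0) ? null : input.get(0);
        if (v == null) {
            Counter bad = PigStatusReporter.getInstance().getCounter("myudf", "BAD_ROWS");
            if (bad != null) {
                bad.increment(1);
            }
            return null;
        }
        return v.toString();
    }
}
```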

        Cheolsoo Park added a comment -

        Aniket Mokashi, your suggestion makes a lot of sense, and I like it. Let me think about this more. Canceling the patch for now.

        Aniket Mokashi added a comment -

        Cheolsoo Park, I am now attempting to solve this with an UnrecoverableException kind of approach (annotations only allow compile-time configuration). With that, UDFs can just throw PigBadDataException (sketched below), and you can configure how to handle it via configuration parameters, using counters.
        Do you have work in progress on this?
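        A sketch of the proposal from the UDF author's side (PigBadDataException is the class proposed above, not an existing Pig API):

```java
import java.io.IOException;

// Hypothetical unrecoverable-error exception, as proposed above. The idea:
// a UDF throws it on bad data, Pig counts the occurrences via a counter,
// and the launcher kills the job once a configured threshold is crossed.
public class PigBadDataException extends IOException {
    public PigBadDataException(String message) {
        super(message);
    }
}

// Hypothetical use inside a UDF's exec():
//   if (!isValid(row)) throw new PigBadDataException("corrupt row: " + row);
```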

        Cheolsoo Park added a comment -

        Aniket Mokashi, no, I am not working on this. Please feel free to take it over.


          People

          • Assignee: Cheolsoo Park
          • Reporter: Cheolsoo Park
          • Votes: 0
          • Watchers: 3