Description
When the output sequence files from the partitioning job are large (larger than two HDFS block sizes), the second round of the job (which uses these sequence files as input) starts more map tasks than the client intended. Sometimes this uncertainty makes the job exceed the cluster's slot capacity.
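To make the task-count blow-up concrete, here is a minimal sketch (plain Java, with hypothetical sizes) of the default FileInputFormat-style behavior of cutting one split per HDFS block, which is why a file spanning several blocks produces several map tasks; the real Hadoop computation also honors min/max split size settings, which this sketch omits:

```java
// Sketch of per-block split counting: a file larger than one HDFS block
// is divided into one input split (and hence one map task) per block.
public class SplitCount {
    // Number of splits for a file when splitting at every block boundary
    // (ceiling division of file size by block size).
    static long splits(long fileSize, long blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long block = 128L * 1024 * 1024;           // 128 MB HDFS block (assumed)
        long file  = 300L * 1024 * 1024;           // a 300 MB sequence file
        System.out.println(splits(file, block));   // 3 map tasks instead of 1
    }
}
```

So a partitioning job that writes, say, ten such files will fan out into roughly thirty map tasks in the second round, not the ten the client planned for.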
In a real project, I implemented a new InputFormat marked as non-splittable to solve the problem. Is there a better way?
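For reference, the workaround I used looks roughly like the following sketch: a SequenceFileInputFormat subclass that overrides `isSplitable` to return false, so each input file yields exactly one map task regardless of how many blocks it spans (the class name here is my own; it is not part of Hadoop):

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

// A SequenceFileInputFormat that never splits its input files, so the
// number of map tasks equals the number of input files.
public class NonSplittableSequenceFileInputFormat<K, V>
        extends SequenceFileInputFormat<K, V> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // Returning false forces one InputSplit per file, regardless of
        // how many HDFS blocks the file occupies.
        return false;
    }
}
```

The trade-off is lost data locality: a single map task may have to read blocks stored on remote nodes. An alternative that avoids a custom class is raising the minimum split size (e.g. `mapreduce.input.fileinputformat.split.minsize`) above the largest expected file size.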