Hadoop Map/Reduce
MAPREDUCE-646

distcp should place the file distcp_src_files in distributed cache

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Patch increases the replication factor of _distcp_src_files to sqrt(min(maxMapsOnCluster, totalMapsInThisJob)) so that many maps won't access the same replica of the file _distcp_src_files at the same time.
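      As a rough sketch of that formula (illustrative only, not the committed patch; the variable names maxMapsOnCluster and totalMapsInThisJob are assumed here):

          import java.io.IOException;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          class SrcFileListReplication {
            // Formula from the release note: sqrt(min(maxMapsOnCluster, totalMapsInThisJob)).
            // (A real implementation would presumably also keep a sensible lower bound.)
            static void bump(FileSystem fs, Path srcFileList,
                             int maxMapsOnCluster, int totalMapsInThisJob) throws IOException {
              int maps = Math.min(maxMapsOnCluster, totalMapsInThisJob);
              short replication = (short) Math.ceil(Math.sqrt(maps));
              fs.setReplication(srcFileList, replication);  // namenode re-replicates in the background
            }
          }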

      Description

      When a large number of files is being copied by distcp, accessing distcp_src_files seems to be an issue, as all map tasks would be accessing this file. The error message seen is:

      09/06/16 10:13:16 INFO mapred.JobClient: Task Id : attempt_200906040559_0110_m_003348_0, Status : FAILED
      java.io.IOException: Could not obtain block: blk_-4229860619941366534_1500174
      file=/mapredsystem/hadoop/mapredsystem/distcp_7fiyvq/_distcp_src_files
      at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1757)
      at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1585)
      at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1712)
      at java.io.DataInputStream.readFully(DataInputStream.java:178)
      at java.io.DataInputStream.readFully(DataInputStream.java:152)
      at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
      at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
      at org.apache.hadoop.tools.DistCp$CopyInputFormat.getRecordReader(DistCp.java:299)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:336)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
      at org.apache.hadoop.mapred.Child.main(Child.java:170)

      This could be because of HADOOP-6038 and/or HADOOP-4681.

      If distcp places this special file distcp_src_files in distributed cache, that could solve the problem.

      1. d_replica_srcfilelist.patch
        2 kB
        Ravi Gummadi
      2. d_replica_srcfilelist_v1.patch
        2 kB
        Ravi Gummadi
      3. d_replica_srcfilelist_v2.patch
        3 kB
        Ravi Gummadi

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #15 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/15/)

        Tsz Wo Nicholas Sze added a comment -

        I have committed this. Thanks, Ravi!

        Ravi Gummadi added a comment -

        ant test-patch gave

        [exec] -1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
        [exec] Please justify why no tests are needed for this patch.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        Unit tests passed on my linux machine.

        Tsz Wo Nicholas Sze added a comment -

        I also think that it is not easy to create new unit tests for this. Manual tests are good enough. Could you run test-patch and the unit tests, and then post the results?

        Ravi Gummadi added a comment -

        It looks difficult to check the replication of this file from a test case, since we need to check the replication while the distcp job is running.
        Manually tested, with a log message that displays the replication factor of this file after setReplication() is done, for different values, by changing the code of testMapCount() in TestCopyFiles.

        Tsz Wo Nicholas Sze added a comment -

        +1 patch looks good. Not sure if it is easy to add a unit test.

        Ravi Gummadi added a comment -

        Thanks Nicholas for pointing it out.

        Attached new patch with the suggested change.

        Tsz Wo Nicholas Sze added a comment -

        The replication number should also depend on the number of maps (see DistCp.setMapCount(..)). It does not make sense to set replication to 89 if there are only 10 maps.

        Ravi Gummadi added a comment -

        Attaching patch with suggested changes.

        Doug Cutting added a comment -

        Some comments on the patch:

        • FSShell should not be used to increase replication, rather just call FileSystem#setReplication().
        • The replication should be set immediately after the file is closed, so that it has a chance to get replicated while duplicates are checked before the job is submitted.
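        A minimal sketch of the change suggested above (illustrative names, not the committed patch): close the source file list and immediately raise its replication through the FileSystem API rather than via FsShell.

            import java.io.IOException;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.Path;
            import org.apache.hadoop.io.SequenceFile;

            class SrcFileListSetup {
              // Illustrative sketch: write the source file list with default replication,
              // close it, then raise the replication right away so HDFS can create the
              // extra replicas while duplicate checking and job submission continue.
              static void finish(FileSystem fs, Path srcFileList, SequenceFile.Writer writer,
                                 short targetReplication) throws IOException {
                writer.close();                                     // file now exists with default replication
                fs.setReplication(srcFileList, targetReplication);  // asynchronous re-replication
              }
            }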
        Doug Cutting added a comment -

        > Would it be better if distcp sets dfs.client.max.block.acquire.failures to sqrt(maxMapsOnCluster)

        The hope is that by increasing the replication there won't be many failures: it shouldn't have to try every replica. So 3 should probably still be fine, I think.

        Ravi Gummadi added a comment -

        Doug, would it be better if distcp sets dfs.client.max.block.acquire.failures to sqrt(maxMapsOnCluster), since DFSClient.DFSInputStream.chooseDataNode() compares the number of failures with this config property (default value of 3)? But we need that only for this file, _distcp_src_files.

        Ravi Gummadi added a comment -

        Attaching patch for increasing the replication of _distcp_src_files to sqrt(maxMapsOnCluster).

        Please review and provide your comments.

        Doug Cutting added a comment -

        > Is that still OK for namenode's perf?

        This should not be a problem for the namenode. It would be best to write the file first with normal replication, then increase its replication, to avoid an overly-long HDFS write pipeline.

        The rationale for sqrt is that a two-stage fanout is done: first from the original to the replicas, then from the replicas to the maps. Sqrt(maps) uses approximately the same fanout factor at each stage, minimizing the number of datanode clients (the presumed bottleneck here).
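        To make the fanout arithmetic concrete, using the 8000-slot example from the discussion: sqrt(8000) ≈ 89, so one writer pushes the file to roughly 89 replicas, and each replica then serves roughly 8000 / 89 ≈ 90 map tasks; both stages end up with a fanout of about 90.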

        Ravi Gummadi added a comment -

        In general, I think the size of this file distcp_src_files would not consume much HDFS block space.

        With thousands of nodes in the cluster (say 4000), even the sqrt of getMaxMapTasks() would be 89 (i.e. sqrt(8000)), which is a big number for replication. Is that still OK for the namenode's perf with many distcp jobs running in parallel, each creating this file with this many replicas?

        Doug Cutting added a comment -

        Distributed cache would give every mapper a full copy of the file, but each only reads a portion of the file, so I think increased replication is more appropriate than the distributed cache.

        We can find the number of map slots with JobClient#getClusterStatus().getMaxMapTasks(). We might set the replication to the square root of that. This should not overload the namenode worse than any other job.
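        A sketch of that approach (illustrative only, not the committed patch): query the cluster's map-slot count through JobClient and take its square root as the replication factor.

            import java.io.IOException;
            import org.apache.hadoop.mapred.JobClient;
            import org.apache.hadoop.mapred.JobConf;

            class ReplicationFromClusterSlots {
              // Derive the replication factor from the cluster's total map slots.
              static short compute(JobConf conf) throws IOException {
                int maxMapSlots = new JobClient(conf).getClusterStatus().getMaxMapTasks();
                return (short) Math.ceil(Math.sqrt(maxMapSlots));
              }
            }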

        Ravi Gummadi added a comment -

        Yes. Increasing replication is another solution. But since we don't know the number of parallel maps that can run, setting replication to a fixed value like 10 may not be enough in some cases and can also become an overhead on the namenode. So could distributed cache be better?

        Doug Cutting added a comment -

        Another option might be to increase its replication, no?


          People

          • Assignee:
            Ravi Gummadi
          • Reporter:
            Ravi Gummadi
          • Votes:
            0
          • Watchers:
            2
