Hadoop Map/Reduce / MAPREDUCE-646

distcp should place the file distcp_src_files in distributed cache


    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels:
    • Hadoop Flags: Reviewed
    • Release Note:
      Patch increases the replication factor of _distcp_src_files to sqrt(min(maxMapsOnCluster, totalMapsInThisJob)) so that many maps won't access the same replica of the file _distcp_src_files at the same time.
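
      As a rough illustration of that formula (a minimal sketch, not the attached patch; maxMapsOnCluster and totalMapsInThisJob are hypothetical inputs supplied by the caller):

        import java.io.IOException;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class ListingReplication {
          // Hypothetical helper: raise the replication of the source-file
          // listing so map tasks spread their reads across more replicas.
          static void raiseListingReplication(FileSystem fs, Path srcFileList,
              int maxMapsOnCluster, int totalMapsInThisJob) throws IOException {
            long target = Math.round(
                Math.sqrt(Math.min(maxMapsOnCluster, totalMapsInThisJob)));
            // Never drop below the filesystem's default replication.
            short replication =
                (short) Math.max(fs.getDefaultReplication(), target);
            fs.setReplication(srcFileList, replication);
          }
        }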


      When a large number of files is being copied by distcp, access to distcp_src_files becomes a bottleneck, since all map tasks read this file. The error message seen is:

      09/06/16 10:13:16 INFO mapred.JobClient: Task Id : attempt_200906040559_0110_m_003348_0, Status : FAILED
      java.io.IOException: Could not obtain block: blk_-4229860619941366534_1500174
      at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1757)
      at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1585)
      at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1712)
      at java.io.DataInputStream.readFully(DataInputStream.java:178)
      at java.io.DataInputStream.readFully(DataInputStream.java:152)
      at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
      at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
      at org.apache.hadoop.tools.DistCp$CopyInputFormat.getRecordReader(DistCp.java:299)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:336)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
      at org.apache.hadoop.mapred.Child.main(Child.java:170)

      This could be because of HADOOP-6038 and/or HADOOP-4681.

      If distcp placed this special file distcp_src_files in the distributed cache, that could solve the problem.
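
      A minimal sketch of that proposal, using the old org.apache.hadoop.filecache.DistributedCache API of this era; the HDFS path below is a hypothetical placeholder, since distcp generates its real listing path internally per job:

        import java.net.URI;
        import org.apache.hadoop.filecache.DistributedCache;
        import org.apache.hadoop.mapred.JobConf;

        public class CacheSrcFileList {
          public static void main(String[] args) throws Exception {
            JobConf job = new JobConf();
            // Hypothetical location of the listing file.
            URI srcFileList = new URI("hdfs:///tmp/_distcp/_distcp_src_files");
            // Each task tracker then localizes the file once, instead of
            // every map task reading the same HDFS blocks directly.
            DistributedCache.addCacheFile(srcFileList, job);
          }
        }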

      1. d_replica_srcfilelist.patch (2 kB, Ravi Gummadi)
      2. d_replica_srcfilelist_v1.patch (2 kB, Ravi Gummadi)
      3. d_replica_srcfilelist_v2.patch (3 kB, Ravi Gummadi)


        Tom White made changes -
          Status: Resolved → Closed
        Ravi Gummadi made changes -
          Release Note: Patch increases the replication factor of _distcp_src_files to sqrt(min(maxMapsOnCluster, totalMapsInThisJob)) so that many maps won't access the same replica of the file _distcp_src_files at the same time.
        Tsz Wo Nicholas Sze made changes -
          Status: Open → Resolved
          Fix Version/s: 0.21.0
          Resolution: Fixed
        Tsz Wo Nicholas Sze made changes -
          Hadoop Flags: [Reviewed]
        Ravi Gummadi made changes -
          Attachment: d_replica_srcfilelist_v2.patch
        Owen O'Malley made changes -
          Project: Hadoop Common → Hadoop Map/Reduce
          Key: HADOOP-6072 → MAPREDUCE-646
          Affects Version/s: 0.21.0
          Issue Type: Improvement → Bug
          Component/s: tools/distcp → distcp
          Fix Version/s: 0.21.0
        Ravi Gummadi made changes -
          Attachment: d_replica_srcfilelist_v1.patch
        Doug Cutting made changes -
          Comment [ Some comments:
           - 'sleeptime' should be 'getSleeptime()' to be thread-safe, no? Or maybe use an int as the sleep time, since updates to an int are atomic.
           - getNumRunningMaps() is expensive to call from each node at each interval, since reports for all tasks must be retrieved from the JT. It would be better to just fetch the job's counters each time, since they are constant-sized, not proportional to the number of tasks. You'd need to add a maps_completed counter, then use the difference between that and TOTAL_LAUNCHED_MAPS to calculate the number running.
           - The interval at which to contact the JT might be randomized a bit, so that not all tasks hit it at the same time, e.g., by adding a random value that is up to 10% of the specified value.
           - When InterruptedException is caught, a thread should generally exit, not simply log a warning. If things will no longer work correctly without the thread, then it should somehow cause other dependent threads to fail too.
           - getNumRunningMaps() should either return a correct value or throw an exception. If it cannot contact the JT or if the task does not know its id, it should fail, no? ]
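
        For the jitter suggestion in the comment above, a minimal sketch (the name baseIntervalMs and the helper itself are hypothetical, not from the patch):

          import java.util.Random;

          public class PollJitter {
            // Add up to +/-10% of the base interval so that the map tasks
            // do not all contact the JobTracker at the same instant.
            static long jitteredInterval(long baseIntervalMs, Random rand) {
              long jitter =
                  (long) (baseIntervalMs * 0.1 * (2 * rand.nextDouble() - 1));
              return baseIntervalMs + jitter;
            }
          }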
        Ravi Gummadi made changes -
          Attachment: d_replica_srcfilelist.patch
        Ravi Gummadi made changes -
          Assignee: Ravi Gummadi
        Ravi Gummadi created issue -


          • Assignee: Ravi Gummadi
          • Votes: 0
          • Watchers: 2


            • Created: