Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-6321

Map tasks take a lot of time to start up

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • 2.6.0
    • None
    • mrv2

    Description

      I have noticed repeatedly that the map tasks take a lot of time to startup on YARN clusters. This is not the scheduling part, this is after the actual container is launched containing the Map task. Take for example, the sample log from a mapper of a Pi job that I launched. The command I used to launch the Pi job was:

      hadoop jar /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
      

      This is the sample job from one of the mappers which took 14 seconds to complete. If you notice from the logs, most of the time taken by this job is during the start up. I notice that the most mappers take anywhere between 7 to 15 seconds during start up and have seen this behavior consistent across mapreduce jobs. This really affects the performance of short running mappers.

      I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the mapper memory is always specified as 2048m and so on.

      Log:

      2015-04-18 06:48:34,081 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
      2015-04-18 06:48:34,637 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
      2015-04-18 06:48:34,637 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
      2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
      2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1429338752209_0059, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
      2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
      2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
      2015-04-18 06:48:36,706 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
      2015-04-18 06:48:37,387 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
      2015-04-18 06:48:39,388 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
      2015-04-18 06:48:39,448 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
      2015-04-18 06:48:41,060 INFO [main] org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up progress from Task
      2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
      2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
      2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 234881020(939524080)
      2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 896
      2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 657666880
      2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 939524096
      2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 234881020; length = 58720256
      2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
      2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
      2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
      2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 18; bufvoid = 939524096
      2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
      2015-04-18 06:48:44,065 INFO [main] org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector: Returning directory: /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
      2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy]
      2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0
      2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of committing
      2015-04-18 06:48:44,164 INFO [main] org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null comment clearing progress now that task is done
      2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters for attempt_1429338752209_0059_m_000004_0
      2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File System Counters
      2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes read = 0
      2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes written = 116166
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of read operations = 0
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of large read operations = 0
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of write operations = 0
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes read = 297
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes written = 0
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of read operations = 4
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of large read operations = 0
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of write operations = 0
      2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce Framework
      2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input records = 1
      2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output records = 2
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output bytes = 18
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output materialized bytes = 34
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input split bytes = 179
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine input records = 0
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled Records = 2
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed Shuffles = 0
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map outputs = 0
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time elapsed (ms) = 158
      2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time spent (ms) = 2080
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical memory (bytes) snapshot = 1104715776
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual memory (bytes) snapshot = 2900013056
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total committed heap usage (bytes) = 1251475456
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input Format Counters 
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read = 118
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File System Counters
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes read = 0
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes written = 116166
      2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of read operations = 0
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of large read operations = 0
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of write operations = 0
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes read = 297
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes written = 0
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of read operations = 4
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of large read operations = 0
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of write operations = 0
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce Framework
      2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input records = 1
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output records = 2
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output bytes = 18
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output materialized bytes = 34
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input split bytes = 179
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine input records = 0
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled Records = 2
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed Shuffles = 0
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map outputs = 0
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time elapsed (ms) = 158
      2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time spent (ms) = 2080
      2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical memory (bytes) snapshot = 1104715776
      2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual memory (bytes) snapshot = 2900013056
      2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total committed heap usage (bytes) = 1251475456
      2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path: /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
      2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1429338752209_0059_m_000004_0' done.
      2015-04-18 06:48:44,225 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
      2015-04-18 06:48:44,231 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
      2015-04-18 06:48:44,233 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            rjainqb Rajat Jain
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated: