Hadoop Map/Reduce / MAPREDUCE-6778

Provide way to limit MRJob's stdout/stderr size


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.7.0
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels: None

    Description

      A job can produce a huge amount of stdout/stderr output, with undesired consequences such as filling up the disk on a node.

      A possible solution is to redirect stdout and stderr to log4j in the main method of YarnChild.java.
      In that case, the System.out and System.err streams are redirected to a log4j logger whose appender writes the output into the stdout or stderr file with the desired size limit. This lets us limit log size on the fly, keeping one rolling backup file (thanks to ContainerRollingLogAppender).
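A minimal sketch of the redirection idea, using only the JDK so it is self-contained (the actual proposal uses a log4j appender such as ContainerRollingLogAppender; the `LimitedSink` class here is a hypothetical stand-in for that appender's size cap):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class StdoutRedirect {

    // Hypothetical stand-in for a size-capped appender: accepts bytes up to
    // limitBytes and silently drops the rest.
    static class LimitedSink extends OutputStream {
        private final OutputStream delegate;
        private final long limitBytes;
        private long written;

        LimitedSink(OutputStream delegate, long limitBytes) {
            this.delegate = delegate;
            this.limitBytes = limitBytes;
        }

        @Override
        public void write(int b) throws IOException {
            if (written < limitBytes) {
                delegate.write(b);
                written++;
            }
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream original = System.out;

        // Redirect stdout through a sink capped at 16 bytes,
        // the same way YarnChild could call System.setOut/System.setErr
        // with log4j-backed streams.
        System.setOut(new PrintStream(new LimitedSink(captured, 16), true));
        System.out.print("0123456789");  // 10 bytes, all kept
        System.out.print("ABCDEFGHIJ");  // only 6 more bytes fit under the cap

        System.setOut(original);         // restore the real stdout
        System.out.println("captured: " + captured.toString());
        // prints: captured: 0123456789ABCDEF
    }
}
```

The key point is that the cap is enforced as bytes are written, so the file never grows past the limit, rather than being truncated after the job finishes.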

      The existing syslog size limitation works in a similar way.

      So, we can set the limits via new properties in mapred-site.xml:
      mapreduce.task.userlog.stderr.limit.kb
      mapreduce.task.userlog.stdout.limit.kb
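For illustration, the proposed properties might be configured like this (property names from the proposal; the 1024 values are hypothetical examples):

```xml
<!-- mapred-site.xml: cap each task's stdout/stderr at 1 MB (example values) -->
<property>
  <name>mapreduce.task.userlog.stdout.limit.kb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.task.userlog.stderr.limit.kb</name>
  <value>1024</value>
</property>
```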

      Advantages of this solution:

      • it restricts file sizes during job execution, not just afterwards.
      • logs remain visible while the job is running.

      Disadvantages:

      • It works only for MapReduce jobs.

      Is this an appropriate solution to the problem, or is there a better approach?

      Attachments

        1. MAPREDUCE-6778.v1.001.patch
          17 kB
          Aleksandr Balitsky


            People

              Assignee: Unassigned
              Reporter: Aleksandr Balitsky (abalitsky1)
              Votes: 0
              Watchers: 5
