Details

    • Type: Improvement
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Created one umbrella issue so that we can link all the log improvement issues to it.

        Issue Links

          Activity

          Gavin made changes -
          Link This issue depends upon HADOOP-6107 [ HADOOP-6107 ]
          Gavin made changes -
          Link This issue depends on HADOOP-6107 [ HADOOP-6107 ]
          Harsh J added a comment -

          Regarding Suresh Srinivas's comments, it may make more sense to split up log improvements package-wise instead of class-wise.

          I'm available for splitting up some work to refine log statements (much needed for making ops smile someday). Let me know if you need help.

          Steve Loughran added a comment -

          Link to HADOOP-6107; machine readable logs

          Steve Loughran made changes -
          Link This issue depends on HADOOP-6107 [ HADOOP-6107 ]
          Uma Maheswara Rao G added a comment -

          Hi Suresh,

          Thanks a lot for taking a look at this issue.
          Initially I thought to raise a subtask for string concatenation. After I realized that the compiler itself will optimize it, I stopped raising the subtasks.

          Aaron asked for an actual benchmark on that (please see Aaron's earlier comment). After that benchmarking, if I really see some added value, I will raise the changes as a single patch (as you said).

          Steve has already identified some issues under this JIRA, so I am following up and working on them.

          Suresh Srinivas added a comment -

          > Jakob said, Before we get a flood of JIRAs on this (is it really even necessary to have an umbrella JIRA?)
          I concur with Jakob. Why is it necessary to create an umbrella JIRA and a small set of changes per class? Can these be accumulated and committed as one patch?

          Steve Loughran added a comment -

          link to log4j in JAR issue

          Steve Loughran made changes -
          Link This issue incorporates HADOOP-7468 [ HADOOP-7468 ]
          Uma Maheswara Rao G added a comment -

          Thanks, Steve, for taking a look at this issue.

          > should we add other log issues under here

          Yes, we can add those issues here. Do you want me to convert them to subtasks here, or can only the reporters of those issues do that?

          > One more thing I'd like is some common handler for socket setup connections that improve diags

          Is it OK if we raise one subtask for this?

          Steve Loughran added a comment -

          One more thing I'd like is some common handler for socket setup connections that improves diags by
          - always printing source and dest IP addr and port
          - pointing to a bit of the wiki that is relevant (ConnectionRefused, SocketTimeout), etc.

          The goal here is to aid diagnostics and reduce support emails on the list
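
          A minimal sketch of what such a common handler might look like, assuming a single static helper that rewraps the failure. The class name, method signature, and wiki base URL below are all hypothetical placeholders, not an existing Hadoop API; only the two wiki page names come from the comment above.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

/** Hypothetical helper; the names and the wiki base URL are illustrative only. */
final class ConnectionDiagnostics {
    // Placeholder base URL standing in for the project wiki.
    static final String WIKI_BASE = "https://wiki.example.org/hadoop/";

    private ConnectionDiagnostics() {}

    /** Returns an exception of the same broad kind whose message names both endpoints. */
    static IOException wrap(String localHost, int localPort,
                            String remoteHost, int remotePort,
                            IOException cause) {
        String endpoints = "from " + localHost + ":" + localPort
            + " to " + remoteHost + ":" + remotePort;
        IOException wrapped;
        if (cause instanceof ConnectException) {
            wrapped = new ConnectException("Connect failed " + endpoints
                + "; see " + WIKI_BASE + "ConnectionRefused");
        } else if (cause instanceof SocketTimeoutException) {
            wrapped = new SocketTimeoutException("Timed out " + endpoints
                + "; see " + WIKI_BASE + "SocketTimeout");
        } else {
            wrapped = new IOException("I/O failure " + endpoints);
        }
        // Keep the original exception (and its stack trace) as the cause.
        wrapped.initCause(cause);
        return wrapped;
    }
}
```

          A caller would catch the IOException raised during socket setup and rethrow the result of ConnectionDiagnostics.wrap(...), so every connection failure in the logs carries both endpoints and a pointer to the relevant wiki page.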

          Steve Loughran added a comment -

          Should we add other log issues under here? For example:
          - Log4J setup: HADOOP-6294, HADOOP-1947; strip out log4j.properties from the JARs (I thought I'd filed a bug there against 0.2.203 but can't see it)
          - code-level issues: HADOOP-1078, HADOOP-6807, HADOOP-6107

          Arun C Murthy added a comment -

          Let me clarify:

          I'd be more encouraging if you added logs (at DEBUG level, for example) to show flow through major sections of code. Then, if one turns on 'debug' mode, it's much easier to understand the code...
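
          As a rough illustration of that kind of flow logging, the sketch below logs entry, each step, and exit of a method at debug level. It uses java.util.logging (where FINE plays the role of DEBUG) only so that it runs without extra libraries; the class and method are invented for the example and do not come from the Hadoop code base.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example class; the method and its steps are invented for illustration. */
final class FlowLoggingSketch {
    static final Logger LOG = Logger.getLogger(FlowLoggingSketch.class.getName());

    /** Sums the values, logging entry, each step, and exit at FINE (debug) level. */
    static long sum(int[] values) {
        // Guard so the message string is only built when debug output is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("sum: entering with " + values.length + " values");
        }
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
            if (LOG.isLoggable(Level.FINE)) {
                LOG.fine("sum: after index " + i + " total=" + total);
            }
        }
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("sum: returning " + total);
        }
        return total;
    }
}
```

          With the logger left at its default level nothing is emitted; raising it to FINE shows the path taken through the method.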

          Uma Maheswara Rao G added a comment -

          Hi Arun,
          Yes, that is my intention: to improve debuggability. I mentioned the same in my earlier comment.
          For example, exceptions missing from logs:

          catch (IOException e) {
          LOG.error("getEditLogSize: editstream.length failed. removing editlog (" +
          idx + ") " + es.getName());

          Here, it would help to include the exception in the error log message as well, so that problems can be identified quickly.
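
          A sketch of the suggested change, under some stated assumptions: the snippet above appears to use a commons-logging style LOG, whose error method also accepts a Throwable as a second argument. To stay runnable without extra libraries, the version below uses java.util.logging instead, and EditStream and sizeOf are hypothetical stand-ins for the surrounding code that the comment does not show.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the code in the comment; names are illustrative. */
final class EditLogSizeSketch {
    static final Logger LOG = Logger.getLogger(EditLogSizeSketch.class.getName());

    /** Hypothetical stand-in for an edit stream whose length() may fail. */
    interface EditStream {
        String getName();
        long length() throws IOException;
    }

    /** Returns the stream length, or -1 if it cannot be read. */
    static long sizeOf(int idx, EditStream es) {
        try {
            return es.length();
        } catch (IOException e) {
            // Passing the exception as the last argument records its type,
            // message, and stack trace alongside the log message.
            LOG.log(Level.SEVERE,
                "getEditLogSize: editstream.length failed. removing editlog ("
                    + idx + ") " + es.getName(), e);
            return -1;
        }
    }
}
```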

          Arun C Murthy made changes -
          Priority Major [ 3 ] Trivial [ 5 ]
          Arun C Murthy added a comment -

          Please update bug type only if you see major improvements.

          I'm not against log cleanup, but I'm not a big fan either...

          OTOH, if you are interested in actually improving the logging to make the system more debuggable (especially to show flow in DEBUG/TRACE mode) I'd be more encouraging.

          Aaron T. Myers added a comment -

          Hey Uma, please do actual benchmarks to demonstrate that changing the code as you propose actually improves performance. Observing the compiler generating string concatenation code isn't sufficient. Additionally, we should have an actual benchmark so that we can quantify the degree of potential performance improvement, and then make a call regarding whether or not that performance improvement warrants the decrease in code clarity.
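
          For reference, a very rough timing sketch of the comparison being asked for. It is not a rigorous benchmark: a plain nanoTime loop is sensitive to JIT warm-up and dead-code elimination, so a proper harness such as JMH would be needed before drawing conclusions. The class name, message text, and iteration count are assumptions made for illustration.

```java
/** Rough timing sketch; a harness such as JMH would give more trustworthy numbers. */
final class ConcatBenchmarkSketch {
    // Written on every iteration so the JIT cannot discard the string building.
    static volatile long sink;

    /** Builds the message with the '+' operator. */
    static String withPlus(int idx, String name) {
        return "getEditLogSize: removing editlog (" + idx + ") " + name;
    }

    /** Builds the same message with an explicit StringBuilder. */
    static String withBuilder(int idx, String name) {
        return new StringBuilder()
            .append("getEditLogSize: removing editlog (")
            .append(idx).append(") ").append(name).toString();
    }

    /** Returns elapsed nanoseconds for the given number of iterations. */
    static long time(boolean usePlus, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            String s = usePlus ? withPlus(i, "edits") : withBuilder(i, "edits");
            sink += s.length();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        time(true, n);  // warm-up run, result ignored
        time(false, n); // warm-up run, result ignored
        System.out.println("plus:    " + time(true, n) / 1_000_000 + " ms");
        System.out.println("builder: " + time(false, n) / 1_000_000 + " ms");
    }
}
```

          The numbers will vary by JVM and hardware, which is the reason for measuring rather than reasoning from the decompiled output alone.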

          Uma Maheswara Rao G made changes -
          Issue Type Bug [ 1 ] Improvement [ 4 ]
          Uma Maheswara Rao G made changes -
          Project Hadoop HDFS [ 12310942 ] Hadoop Common [ 12310240 ]
          Key HDFS-1777 HADOOP-7466
          Issue Type Improvement [ 4 ] Bug [ 1 ]
          Uma Maheswara Rao G made changes -
          Summary HDFS Log improvement Umbrella Hadoop Log improvement Umbrella
          Description In log messages we use many '+' operators to construct the message, so many String objects are created. Instead we can use StringBuilder and append all the strings to it. That way we can avoid unnecessary String object creation.

          So created one Umbrella issue and we can link the all log improvement issues to it.
          Created one Umbrella issue and we can link the all log improvement issues to it.
          Component/s name-node [ 12312926 ]
          Component/s data-node [ 12312927 ]
          Uma Maheswara Rao G added a comment -

          Thanks to Aaron and Jakob for spending time and giving comments.
          I agree with you. When I checked the compiled code with a decompiler, I observed that the Java compiler itself takes care of appending the strings.

          > is it really even necessary to have an umbrella JIRA?
          My intention is to identify all kinds of log-related issues and raise them as subtasks of this issue (grouping similar occurrences into one subtask).

          For example:
          catch (IOException e) {
          LOG.error("getEditLogSize: editstream.length failed. removing editlog (" +
          idx + ") " + es.getName());

          Here, it would help to include the exception in the error log message as well, so that problems can be identified quickly.

          Jakob Homan added a comment -

          > Before we go whole-hog on this, can we please make sure to do a little benchmarking? Java is pretty good at optimizing for repeated strings, so there may not be much performance to gain here.

          +1. Before we get a flood of JIRAs on this (is it really even necessary to have an umbrella JIRA?), please have some strong numbers to back up the assertion that string concatenation has a significant negative impact on performance. This is particularly the case given our experience with HADOOP-6884.

          Aaron T. Myers added a comment -

          Before we go whole-hog on this, can we please make sure to do a little benchmarking? Java is pretty good at optimizing for repeated strings, so there may not be much performance to gain here.

          Uma Maheswara Rao G made changes -
          Field Original Value New Value
          Hadoop Flags [Reviewed]
          Uma Maheswara Rao G created issue -

            People

            • Assignee:
              Uma Maheswara Rao G
              Reporter:
              Uma Maheswara Rao G
            • Votes:
              0
              Watchers:
              8

                Time Tracking

                Estimated:
                2h
                Remaining:
                2h
                Logged:
                Not Specified
