Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-3197

Confusing log generated by CapacityScheduler

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
      2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
      2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
      2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
      2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...

      1. YARN-3197.001.patch
        1 kB
        Varun Saxena
      2. YARN-3197.002.patch
        1 kB
        Varun Saxena
      3. YARN-3197.003.patch
        2 kB
        Varun Saxena
      4. YARN-3197.004.patch
        2 kB
        Varun Saxena

        Activity

        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/870/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/870/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java hadoop-yarn-project/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java hadoop-yarn-project/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java hadoop-yarn-project/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2067 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2067/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2067 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2067/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #126 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/126/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #126 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/126/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java hadoop-yarn-project/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7347 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7347/)
        YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #7347 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7347/ ) YARN-3197 . Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        Hide
        varun_saxena Varun Saxena added a comment -

        Thanks Devaraj K for the commit and review. Wangda Tan and Vinod Kumar Vavilapalli, thanks for your comments.

        Show
        varun_saxena Varun Saxena added a comment - Thanks Devaraj K for the commit and review. Wangda Tan and Vinod Kumar Vavilapalli , thanks for your comments.
        Hide
        devaraj.k Devaraj K added a comment -

        Committed to trunk and branch-2.

        Thanks Varun Saxena for patch and Wangda Tan for review and discussion.

        Show
        devaraj.k Devaraj K added a comment - Committed to trunk and branch-2. Thanks Varun Saxena for patch and Wangda Tan for review and discussion.
        Hide
        devaraj.k Devaraj K added a comment -

        +1, latest patch looks good to me, will commit it shortly.

        Show
        devaraj.k Devaraj K added a comment - +1, latest patch looks good to me, will commit it shortly.
        Hide
        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12704829/YARN-3197.004.patch
        against trunk revision ed4e72a.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6978//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6978//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704829/YARN-3197.004.patch against trunk revision ed4e72a. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6978//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6978//console This message is automatically generated.
        Hide
        vinodkv Vinod Kumar Vavilapalli added a comment -

        Yeah, I meant no need for printing both.

        Show
        vinodkv Vinod Kumar Vavilapalli added a comment - Yeah, I meant no need for printing both.
        Hide
        leftnoteasy Wangda Tan added a comment -

        Patch looks good to me, I'm OK with both with/without the appId.

        Show
        leftnoteasy Wangda Tan added a comment - Patch looks good to me, I'm OK with both with/without the appId.
        Hide
        varun_saxena Varun Saxena added a comment -

        Will change it back then. AppId was added to aid in quicker debugging

        Show
        varun_saxena Varun Saxena added a comment - Will change it back then. AppId was added to aid in quicker debugging
        Hide
        varun_saxena Varun Saxena added a comment -

        I guess you mean no need for printing both.

        Show
        varun_saxena Varun Saxena added a comment - I guess you mean no need for printing both.
        Hide
        vinodkv Vinod Kumar Vavilapalli added a comment -

        You can infer ApplicationID from the ContainerID, so need for printing both.

        Show
        vinodkv Vinod Kumar Vavilapalli added a comment - You can infer ApplicationID from the ContainerID, so need for printing both.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12700831/YARN-3197.003.patch
        against trunk revision 5731c0e.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6740//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6740//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6740//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700831/YARN-3197.003.patch against trunk revision 5731c0e. +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6740//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6740//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6740//console This message is automatically generated.
        Hide
        leftnoteasy Wangda Tan added a comment -

        There's one difference, when container id and app id is null, it is
        containerId="xx" completed with status=yyy from completed or unknown application id="zzz"

        And when RMContainer is not null, but cannot find app id, it should be:
        containerId="xx" completed with status=yyy from completed application id="zzz".

        The 2nd one's application shouldn't be indicated as "unknown"

        Show
        leftnoteasy Wangda Tan added a comment - There's one difference, when container id and app id is null, it is containerId="xx" completed with status=yyy from completed or unknown application id="zzz" And when RMContainer is not null, but cannot find app id, it should be: containerId="xx" completed with status=yyy from completed application id="zzz" . The 2nd one's application shouldn't be indicated as "unknown"
        Hide
        varun_saxena Varun Saxena added a comment -

        Wangda Tan, if I change the log like this, there will be no difference between logs for container being null and application being null. Is that fine ?
        Devaraj K, your thoughts on this.

        Show
        varun_saxena Varun Saxena added a comment - Wangda Tan , if I change the log like this, there will be no difference between logs for container being null and application being null. Is that fine ? Devaraj K , your thoughts on this.
        Hide
        leftnoteasy Wangda Tan added a comment -

        Hi Varun Saxena,
        Thanks for reply, I still prefer my suggestion .

        Since the "alive containers map" is too detailed, nobody can understand it unless read RM code.

        Yes application ID can be derived, but user has to translate it, I suggest to print it out directly, when doing log debug, we can just copy and search it in logs, that is why I suggest to include app-Id.

        Does this make sense to you?

        Show
        leftnoteasy Wangda Tan added a comment - Hi Varun Saxena , Thanks for reply, I still prefer my suggestion . Since the "alive containers map" is too detailed, nobody can understand it unless read RM code. Yes application ID can be derived, but user has to translate it, I suggest to print it out directly, when doing log debug, we can just copy and search it in logs, that is why I suggest to include app-Id. Does this make sense to you?
        Hide
        varun_saxena Varun Saxena added a comment -

        Wangda Tan,
        Thought so initially but that was making log too verbose. Maybe can do that to avoid confusion.
        Moreover, I think there should be some differentiation between the log where container is null and the one where application is null.
        I think we can say
        "Container xxx completed with state yyy but was not found in alive containers map"
        Thoughts ?

        Any need to mention the application ID ? That can be derived from container ID. Right ?

        Other log I will change as you suggested.

        Show
        varun_saxena Varun Saxena added a comment - Wangda Tan , Thought so initially but that was making log too verbose. Maybe can do that to avoid confusion. Moreover, I think there should be some differentiation between the log where container is null and the one where application is null. I think we can say "Container xxx completed with state yyy but was not found in alive containers map" Thoughts ? Any need to mention the application ID ? That can be derived from container ID. Right ? Other log I will change as you suggested.
        Hide
        leftnoteasy Wangda Tan added a comment -

        If so, you should indicate this container cannot be found in aliveContainers of SchedulerApplicationAttempt to be more clear.

        Show
        leftnoteasy Wangda Tan added a comment - If so, you should indicate this container cannot be found in aliveContainers of SchedulerApplicationAttempt to be more clear.
        Hide
        varun_saxena Varun Saxena added a comment -

        Hmm...Non-alive was to indicate that its not found in aliveContainers in SchedulerApplicationAttempt.

        Show
        varun_saxena Varun Saxena added a comment - Hmm...Non-alive was to indicate that its not found in aliveContainers in SchedulerApplicationAttempt.
        Hide
        leftnoteasy Wangda Tan added a comment -

        I think "non-alive container" is not correct, since all completed containers reported from NM are "non-alive", I suggest better saying containerId="xx" completed with status=yyy from completed or unknown application id="zzz".

        And I suggest to improve following log a little bit as I suggested:
        https://issues.apache.org/jira/browse/YARN-3197?focusedCommentId=14326344&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14326344

        If a RM can get RMContainer, the application will definitely not "unknown", should indicate the application may be completed as well.

        Show
        leftnoteasy Wangda Tan added a comment - I think "non-alive container" is not correct, since all completed containers reported from NM are "non-alive", I suggest better saying containerId="xx" completed with status=yyy from completed or unknown application id="zzz" . And I suggest to improve following log a little bit as I suggested: https://issues.apache.org/jira/browse/YARN-3197?focusedCommentId=14326344&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14326344 If a RM can get RMContainer, the application will definitely not "unknown", should indicate the application may be completed as well.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12699909/YARN-3197.002.patch
        against trunk revision c33ae27.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6684//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6684//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6684//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12699909/YARN-3197.002.patch against trunk revision c33ae27. +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6684//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6684//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6684//console This message is automatically generated.
        Hide
        devaraj.k Devaraj K added a comment -

        I would be ok for anyone, better you can add both of them like below.

        "Unknown or non-alive container " + containerId + " completed with ...."

        Show
        devaraj.k Devaraj K added a comment - I would be ok for anyone, better you can add both of them like below. "Unknown or non-alive container " + containerId + " completed with ...."
        Hide
        varun_saxena Varun Saxena added a comment -

        Devaraj K
        Agree that making only log associated with Null container completion as DEBUG doesnt seem right

        So what do you think ? Should I print "Unknown container" or "Non-alive container" ?

        Show
        varun_saxena Varun Saxena added a comment - Devaraj K Agree that making only log associated with Null container completion as DEBUG doesnt seem right So what do you think ? Should I print "Unknown container" or "Non-alive container" ?
        Hide
        devaraj.k Devaraj K added a comment -

        I am not completely convinced to change the log level to debug, even if there are many logs those would be one log per container. If we change the log level to debug then we would be missing the update of those containers after NM restart in the usual cases where the log level is Info. And also there is a debug log in the caller method would probably serve the same and (rmContainer == null) {} log wouldn't be required if you have decided to make the log level as debug.

        LOG.debug("Container FINISHED: " + containerId);
        

        IMO, we don't need to explicitly derive and print the application id from container id, just logging container id would be enough and user can derive application id from it if they really want.

        Show
        devaraj.k Devaraj K added a comment - I am not completely convinced to change the log level to debug, even if there are many logs those would be one log per container. If we change the log level to debug then we would be missing the update of those containers after NM restart in the usual cases where the log level is Info. And also there is a debug log in the caller method would probably serve the same and (rmContainer == null) {} log wouldn't be required if you have decided to make the log level as debug. LOG.debug( "Container FINISHED: " + containerId); IMO, we don't need to explicitly derive and print the application id from container id, just logging container id would be enough and user can derive application id from it if they really want.
        Hide
        varun_saxena Varun Saxena added a comment -

        Devaraj K and others,
        I meant printing unknown container or unknown application while printing their respective IDs' might be deemed as confusing by some too.
        Cant we say something "Non-alive container <containerid> ...." ?

        AppID can probably be printed from ContainerID.
        Thoughts ?

        Show
        varun_saxena Varun Saxena added a comment - Devaraj K and others, I meant printing unknown container or unknown application while printing their respective IDs' might be deemed as confusing by some too. Cant we say something "Non-alive container <containerid> ...." ? AppID can probably be printed from ContainerID. Thoughts ?
        Hide
        sunilg Sunil G added a comment -

        Yes. Remark from Rohith Sharma K S make sense.
        I also came across scenarios where NM was slightly delayed in reporting its status, and application completed in mean time. Lots of this log will be printed on that time.

        Show
        sunilg Sunil G added a comment - Yes. Remark from Rohith Sharma K S make sense. I also came across scenarios where NM was slightly delayed in reporting its status, and application completed in mean time. Lots of this log will be printed on that time.
        Hide
        rohithsharma Rohith Sharma K S added a comment -

        I think INFO could be fine since it will be at most once for each container.

        I agree this log message is most once for each containers But IIUC, the above log message would not help to analyze any issue in cluster rather it is just only information. This would come because NodeManger may be delayed in identifying container has finished and sending its status.

        Consider NM restart , NM recovers all the containers and sends all the container status(running and completed) while registering. But application already would have completed and scheduler prints above message which is not really required. It just fills log files.
        May be above scenario can be considered for changing log level to DEBUG.

        Show
        rohithsharma Rohith Sharma K S added a comment - I think INFO could be fine since it will be at most once for each container. I agree this log message is most once for each containers But IIUC, the above log message would not help to analyze any issue in cluster rather it is just only information. This would come because NodeManger may be delayed in identifying container has finished and sending its status. Consider NM restart , NM recovers all the containers and sends all the container status(running and completed) while registering. But application already would have completed and scheduler prints above message which is not really required. It just fills log files. May be above scenario can be considered for changing log level to DEBUG.
        Hide
        rohithsharma Rohith Sharma K S added a comment -

        Do you see any other info logs coming for the same container?

        No information about container. Its only above log message will be printed.

        Show
        rohithsharma Rohith Sharma K S added a comment - Do you see any other info logs coming for the same container? No information about container. Its only above log message will be printed.
        Hide
        leftnoteasy Wangda Tan added a comment -

        I think it's better not saying

        1276	      LOG.info("Container [ContainerId: " + containerStatus.getContainerId()
        1277	          + "] of unknown application completed with event " + event);
        

        Since we have containerId within containerStatus, it's better to indicate we cannot get RMContainer since the attempt probably is already completed, I suggest print both containerId and applicationId out.\

        I think INFO could be fine since it will be at most once for each container.

        And a logging below is also confusing:

            if (application == null) {
              LOG.info("Container " + container + " of" + " unknown application "
                  + appId + " completed with event " + event);
              return;
            }
        

        If a RM can get RMContainer, the application will definitely not "unknown", should indicate the application may be completed as well.

        Show
        leftnoteasy Wangda Tan added a comment - I think it's better not saying 1276 LOG.info( "Container [ContainerId: " + containerStatus.getContainerId() 1277 + "] of unknown application completed with event " + event); Since we have containerId within containerStatus, it's better to indicate we cannot get RMContainer since the attempt probably is already completed, I suggest print both containerId and applicationId out.\ I think INFO could be fine since it will be at most once for each container. And a logging below is also confusing: if (application == null ) { LOG.info( "Container " + container + " of" + " unknown application " + appId + " completed with event " + event); return ; } If a RM can get RMContainer, the application will definitely not "unknown", should indicate the application may be completed as well.
        Hide
        devaraj.k Devaraj K added a comment -

        rmContainer could be null when SchedulerApplicationAttempt is null or liveContainers doesn't have the container info. There could be a chance of ApplicationAttempt is running and container has already completed(removed from liveContainers). Here we cannot say unknown application.

        I have mentioned 'Unknown container' because RM has removed this container info and doesn't know about this container any more. Do you see any better message here?

        Show
        devaraj.k Devaraj K added a comment - rmContainer could be null when SchedulerApplicationAttempt is null or liveContainers doesn't have the container info. There could be a chance of ApplicationAttempt is running and container has already completed(removed from liveContainers). Here we cannot say unknown application. I have mentioned 'Unknown container' because RM has removed this container info and doesn't know about this container any more. Do you see any better message here?
        Hide
        varun_saxena Varun Saxena added a comment -

        Hmm...But we do have Container ID. Would it be right to say Unknown container if we are printing ContainerID ?
        We do not know the Application ID however.

        Show
        varun_saxena Varun Saxena added a comment - Hmm...But we do have Container ID. Would it be right to say Unknown container if we are printing ContainerID ? We do not know the Application ID however.
        Hide
        devaraj.k Devaraj K added a comment -

        Thanks Varun Saxena for patch and Rohith Sharma K S for comment.

        +          + "] of unknown application completed with event " + event);
        

        Here 'unknown application' may not be appropriate always. Instead can we think of logging like '"Unknown container " + containerStatus.getContainerId() + " completed with event " + event'.

        It would be better if the log level changed to DEBUG. In NM restart, these messages are very huge

        Do you see any other info logs coming for the same container? IMO, it should have at least one info log about this container status update from NM, after NM restart.

        Show
        devaraj.k Devaraj K added a comment - Thanks Varun Saxena for patch and Rohith Sharma K S for comment. + + "] of unknown application completed with event " + event); Here 'unknown application' may not be appropriate always. Instead can we think of logging like '"Unknown container " + containerStatus.getContainerId() + " completed with event " + event'. It would be better if the log level changed to DEBUG. In NM restart, these messages are very huge Do you see any other info logs coming for the same container? IMO, it should have at least one info log about this container status update from NM, after NM restart.
        Hide
        rohithsharma Rohith Sharma K S added a comment -

        It would be better if the log level changed to DEBUG. In NM restart, these messages are very huge

        Show
        rohithsharma Rohith Sharma K S added a comment - It would be better if the log level changed to DEBUG. In NM restart, these messages are very huge
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12699282/YARN-3197.001.patch
        against trunk revision f0412de.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6651//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6651//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6651//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12699282/YARN-3197.001.patch against trunk revision f0412de. +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6651//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6651//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6651//console This message is automatically generated.
        Hide
        varun_saxena Varun Saxena added a comment -

        Yes, will do that. Thanks Devaraj K for the suggestion.

        Show
        varun_saxena Varun Saxena added a comment - Yes, will do that. Thanks Devaraj K for the suggestion.
        Hide
        devaraj.k Devaraj K added a comment -
          protected synchronized void completedContainer(RMContainer rmContainer,
              ContainerStatus containerStatus, RMContainerEventType event) {
            if (rmContainer == null) {
              LOG.info("Null container completed...");
              return;
            }
        

        Here this log can be updated with containerId from ContainerStatus along with the some meaningful message.

        Show
        devaraj.k Devaraj K added a comment - protected synchronized void completedContainer(RMContainer rmContainer, ContainerStatus containerStatus, RMContainerEventType event) { if (rmContainer == null) { LOG.info( "Null container completed..." ); return; } Here this log can be updated with containerId from ContainerStatus along with the some meaningful message.

          People

          • Assignee:
            varun_saxena Varun Saxena
            Reporter:
            hitesh Hitesh Shah
          • Votes:
            0 Vote for this issue
            Watchers:
            12 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development