Hadoop Common
  1. Hadoop Common
  2. HADOOP-8624

ProtobufRpcEngine should log all RPCs if TRACE logging is enabled

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 2.0.2-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: ipc
    • Labels:
      None

      Description

      Since all RPC requests/responses are now ProtoBufs, it's easy to add a TRACE level logging output for ProtobufRpcEngine that actually shows the full content of all calls. This is very handy especially when writing/debugging unit tests, but might also be useful to enable at runtime for short periods of time to debug certain production issues.

        Activity

        Hide
        Todd Lipcon added a comment -

        This patch implements the above for the client side. I didn't instrument the server side with extra tracing as of yet, but we could do that separately. In the context of unit tests where I"m using this, having one side of the connection is enough to be very useful.

        It doesn't include new tests, but here is some example output:

        2012-07-25 17:38:10,313 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(197)) - 46: Call -> null@/127.0.0.1:48275: getJournalState {jid { identifier: "testQuorumJournalManager" }}
        2012-07-25 17:38:10,313 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(197)) - 45: Call -> null@/127.0.0.1:46462: getJournalState {jid { identifier: "testQuorumJournalManager" }}
        2012-07-25 17:38:10,313 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(197)) - 47: Call -> null@/127.0.0.1:60394: getJournalState {jid { identifier: "testQuorumJournalManager" }}
        2012-07-25 17:38:10,354 INFO  server.JournalNode (JournalNode.java:getOrCreateJournal(73)) - Initializing journal in directory build/test/data/dfs/journalnode-0/testQuorumJournalManager
        2012-07-25 17:38:10,354 INFO  server.JournalNode (JournalNode.java:getOrCreateJournal(73)) - Initializing journal in directory build/test/data/dfs/journalnode-1/testQuorumJournalManager
        2012-07-25 17:38:10,354 INFO  server.JournalNode (JournalNode.java:getOrCreateJournal(73)) - Initializing journal in directory build/test/data/dfs/journalnode-2/testQuorumJournalManager
        2012-07-25 17:38:10,361 INFO  common.Storage (Storage.java:analyzeStorage(429)) - Storage directory /home/todd/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/build/test/data/dfs/journalnode-0/testQuorumJournalManager does not exist.
        2012-07-25 17:38:10,361 INFO  common.Storage (Storage.java:analyzeStorage(429)) - Storage directory /home/todd/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/build/test/data/dfs/journalnode-1/testQuorumJournalManager does not exist.
        2012-07-25 17:38:10,361 INFO  common.Storage (Storage.java:analyzeStorage(429)) - Storage directory /home/todd/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/build/test/data/dfs/journalnode-2/testQuorumJournalManager does not exist.
        2012-07-25 17:38:10,363 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getJournalState took 240ms
        2012-07-25 17:38:10,363 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getJournalState took 240ms
        2012-07-25 17:38:10,363 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getJournalState took 240ms
        2012-07-25 17:38:10,364 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(232)) - 47: Response <- null@/127.0.0.1:60394: getJournalState {lastPromisedEpoch: 0 httpPort: 60071}
        2012-07-25 17:38:10,364 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(232)) - 45: Response <- null@/127.0.0.1:46462: getJournalState {lastPromisedEpoch: 0 httpPort: 48186}
        2012-07-25 17:38:10,364 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(232)) - 46: Response <- null@/127.0.0.1:48275: getJournalState {lastPromisedEpoch: 0 httpPort: 50970}
        

        The logs make it very clear what's going on under the hood, which helped me debug an issue.

        Show
        Todd Lipcon added a comment - This patch implements the above for the client side. I didn't instrument the server side with extra tracing as of yet, but we could do that separately. In the context of unit tests where I"m using this, having one side of the connection is enough to be very useful. It doesn't include new tests, but here is some example output: 2012-07-25 17:38:10,313 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(197)) - 46: Call -> null @/127.0.0.1:48275: getJournalState {jid { identifier: "testQuorumJournalManager" }} 2012-07-25 17:38:10,313 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(197)) - 45: Call -> null @/127.0.0.1:46462: getJournalState {jid { identifier: "testQuorumJournalManager" }} 2012-07-25 17:38:10,313 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(197)) - 47: Call -> null @/127.0.0.1:60394: getJournalState {jid { identifier: "testQuorumJournalManager" }} 2012-07-25 17:38:10,354 INFO server.JournalNode (JournalNode.java:getOrCreateJournal(73)) - Initializing journal in directory build/test/data/dfs/journalnode-0/testQuorumJournalManager 2012-07-25 17:38:10,354 INFO server.JournalNode (JournalNode.java:getOrCreateJournal(73)) - Initializing journal in directory build/test/data/dfs/journalnode-1/testQuorumJournalManager 2012-07-25 17:38:10,354 INFO server.JournalNode (JournalNode.java:getOrCreateJournal(73)) - Initializing journal in directory build/test/data/dfs/journalnode-2/testQuorumJournalManager 2012-07-25 17:38:10,361 INFO common.Storage (Storage.java:analyzeStorage(429)) - Storage directory /home/todd/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/build/test/data/dfs/journalnode-0/testQuorumJournalManager does not exist. 2012-07-25 17:38:10,361 INFO common.Storage (Storage.java:analyzeStorage(429)) - Storage directory /home/todd/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/build/test/data/dfs/journalnode-1/testQuorumJournalManager does not exist. 2012-07-25 17:38:10,361 INFO common.Storage (Storage.java:analyzeStorage(429)) - Storage directory /home/todd/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/build/test/data/dfs/journalnode-2/testQuorumJournalManager does not exist. 2012-07-25 17:38:10,363 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getJournalState took 240ms 2012-07-25 17:38:10,363 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getJournalState took 240ms 2012-07-25 17:38:10,363 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getJournalState took 240ms 2012-07-25 17:38:10,364 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(232)) - 47: Response <- null @/127.0.0.1:60394: getJournalState {lastPromisedEpoch: 0 httpPort: 60071} 2012-07-25 17:38:10,364 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(232)) - 45: Response <- null @/127.0.0.1:46462: getJournalState {lastPromisedEpoch: 0 httpPort: 48186} 2012-07-25 17:38:10,364 TRACE ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(232)) - 46: Response <- null @/127.0.0.1:48275: getJournalState {lastPromisedEpoch: 0 httpPort: 50970} The logs make it very clear what's going on under the hood, which helped me debug an issue.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12537936/hadoop-8624.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated -2 warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:

        org.apache.hadoop.ha.TestZKFailoverController

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1221//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1221//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12537936/hadoop-8624.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated -2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ha.TestZKFailoverController +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1221//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1221//console This message is automatically generated.
        Hide
        Aaron T. Myers added a comment -

        +1, the patch looks good to me. I'm confident that the test failure is unrelated.

        The javadoc warning is a little curious, though. Any explanation for that?

        Show
        Aaron T. Myers added a comment - +1, the patch looks good to me. I'm confident that the test failure is unrelated. The javadoc warning is a little curious, though. Any explanation for that?
        Hide
        Todd Lipcon added a comment -

        The javadoc output must be unrelated - it says I fixed 2 warnings, but I didn't modify any javadoc in this patch at all. I'll commit this momentarily.

        Show
        Todd Lipcon added a comment - The javadoc output must be unrelated - it says I fixed 2 warnings, but I didn't modify any javadoc in this patch at all. I'll commit this momentarily.
        Hide
        Eli Collins added a comment -

        This can be closed right?

        Show
        Eli Collins added a comment - This can be closed right?
        Hide
        Todd Lipcon added a comment -

        Yep, I think this was committed during one of the JIRA outages last month. IT was committed in branch-2@1366128 and trunk@1366127

        Show
        Todd Lipcon added a comment - Yep, I think this was committed during one of the JIRA outages last month. IT was committed in branch-2@1366128 and trunk@1366127

          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development