Uploaded image for project: 'Thrift'
  1. Thrift
  2. THRIFT-4805

Suppress excessive logging of SASL TTransportExceptions in case of END_OF_FILE

    XMLWordPrintableJSON

Details

    Description

      This has to do with the fix checked in for THRIFT-3769 and THRIFT-2268, which was to mute the noise fromĀ TSaslTransportExceptions raised from load-balancer health-checks for Thrift services, such as the Hive metastore. Please consider the code in TThreadPoolServer::run():

      TThreadPoolServer.java
            } catch (TException tx) {
              LOGGER.error("Thrift error occurred during processing of message.", tx);
            } catch (Exception x) {
              // We'll usually receive RuntimeException types here
              // Need to unwrap to ascertain real causing exception before we choose to ignore
              Throwable realCause = x.getCause();
              // Ignore err-logging all transport-level/type exceptions
              if ((realCause != null && realCause instanceof TTransportException)
                  || (x instanceof TTransportException)) {
                LOGGER.debug(
                    "Received TTransportException during processing of message. Ignoring.",
                    x);
              } else {
                // Log the exception at error level and continue
                LOGGER.error("Error occurred during processing of message.", x);
              }
            } finally {...}
      

      The logic here is solid for RuntimeExceptions that wrap TTansportExceptions. But it slips up on handling the condition being checked for at line#323, i.e.:

      TThreadPoolServer.java
                  || (x instanceof TTransportException)) {
      

      The catch (Exception x) comes after the catch (TException tx), so it's a guarantee that !(x instanceof TTransportException). When a TTransportException (or TSaslTransportException) is thrown, it will be caught in the first catch block, and logged. This rather defeats the purpose of the fix. The error manifests with the following stack-trace filling up my logs:

      org.apache.thrift.transport.TTransportException
              at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
              at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
              at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
              at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
              at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
              at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
              at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
              at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
              at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
              at org.apache.hive.service.cli.thrift.ThriftCLIMetricsProcessor.process(ThriftCLIMetricsProcessor.java:63)
              at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$2.run(HadoopThriftAuthBridge.java:777)
              at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$2.run(HadoopThriftAuthBridge.java:773)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
              at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:773)
              at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:748)
      

      The fix is simple. I'll upload it shortly, for review.

      Attachments

        1. THRIFT-4805.1.patch
          2 kB
          Mithun Radhakrishnan
        2. THRIFT-4805.bonus.patch
          3 kB
          Mithun Radhakrishnan

        Issue Links

          Activity

            People

              mithun Mithun Radhakrishnan
              mithun Mithun Radhakrishnan
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 40m
                  40m