Details
Bug
Status: Resolved
Major
Resolution: Fixed
Description
To follow up on my recent patches that improve the "debuggability" of the pre-commit hook and of our integration tests in general, I've looked into what exactly we are logging that makes our logs larger than 1GB per execution.
Here is what I've done:
- I've applied my patch from SQOOP-2832 to get the log for one test only
- I've run a bit of magic that lists the classes responsible for the logging:
cat test/target/surefire-reports/00000_org.apache.sqoop.integration.connector.hdfs.AppendModeTest.test.txt | sed -re "s/^.*\] ([A-Z]+)[ ]+([A-Za-z.]+) .*$/\1 \2/" | sort | uniq -c | sort -r > report
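The top of the resulting report is below (note that plain sort -r compares the counts as text, so the five-digit counts are not listed first):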
6927 DEBUG org.apache.sqoop.repository.JdbcRepositoryTransaction
5783 DEBUG org.apache.hadoop.ipc.Client
5752 DEBUG org.apache.sqoop.repository.common.CommonRepositoryHandler
5750 DEBUG org.apache.hadoop.hdfs.DFSClient
4784 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode
4715 DEBUG org.eclipse.jetty.io.SelectorManager
4660 DEBUG org.eclipse.jetty.server.HttpConnection
4306 DEBUG org.apache.hadoop.security.UserGroupInformation
3489 DEBUG org.eclipse.jetty.io.WriteFlusher
2927 DEBUG org.eclipse.jetty.io.ChannelEndPoint
2846 DEBUG org.apache.hadoop.conf.Configuration
2830 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
2357 DEBUG org.eclipse.jetty.io.AbstractConnection
2350 DEBUG org.eclipse.jetty.io.SelectChannelEndPoint
2343 DEBUG org.eclipse.jetty.server.HttpChannel
2332 DEBUG org.eclipse.jetty.servlet.ServletHandler
2309 INFO org.apache.sqoop.repository.JdbcRepositoryTransaction
16701 DEBUG org.apache.hadoop.security.SaslInputStream
14613 DEBUG org.eclipse.jetty.http.HttpParser
1426
1175 DEBUG org.apache.sqoop.security.authorization.DefaultAuthorizationValidator
1168 DEBUG org.eclipse.jetty.server.handler.ContextHandler
1168 DEBUG org.eclipse.jetty.server.Server
1168 DEBUG org.eclipse.jetty.server.HttpChannelState
1034 DEBUG org.apache.hadoop.yarn.server.security.ApplicationACLsManager
10329 DEBUG org.apache.hadoop.ipc.Server
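For anyone who wants to repeat this on another test log, here is a minimal sketch of the same counting wrapped in a shell function. The function name count_loggers is hypothetical, and it assumes log lines of the form "... [thread] LEVEL logger.Name message". It differs from the one-liner above in two ways: lines that do not match the pattern (blank lines, stack traces) are dropped rather than counted, and the final sort is numeric so the largest counts come first.

```shell
# Hypothetical helper: count log lines per (level, logger) pair, largest first.
# Assumes each log line looks like "... [thread] LEVEL logger.Name message".
count_loggers() {
  # -n plus the trailing p prints only the lines that matched the pattern.
  sed -E -n 's/^.*\] ([A-Z]+) +([A-Za-z.]+) .*$/\1 \2/p' "$1" |
    sort | uniq -c | sort -rn   # -n compares the counts as numbers, not text
}
```

It would be called with the surefire report of a single test as its only argument, with the output redirected to a report file.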
Based on that, I would like to reconfigure certain classes to limit their logging to levels higher than DEBUG. Jetty seems like a no-brainer, and Hadoop IPC might be another good candidate.
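To get a rough idea of how many lines such a change would remove, the DEBUG counts in the report can be summed per package. This is only a sketch: the function name lines_for_prefix is hypothetical, and it assumes the report file has the "count LEVEL logger" layout produced by uniq -c above.

```shell
# Hypothetical helper: sum the DEBUG counts in a "count LEVEL logger" report
# for every logger that lives under the given package prefix.
lines_for_prefix() {
  awk -v prefix="$2" '
    # Only DEBUG lines disappear when a logger is raised above DEBUG.
    NF >= 3 && $2 == "DEBUG" && index($3, prefix ".") == 1 { total += $1 }
    END { print total + 0 }
  ' "$1"
}
```

For example, calling it with the report file and org.eclipse.jetty, and then again with org.apache.hadoop.ipc, gives the number of DEBUG lines each candidate contributes to this one test.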