The attached patch includes the following changes:
1) added system_enable_query_details(bool) to the thrift api
2) added "enable query details" and "disable query details" to the CLI
3) added queryDetails state to the ClientState to store the setting per connection
4) o.a.c.service.QueryContext as a container for the per query state. State is only created if the logging details are turned on for the query, and are stored in thread local storage and accessed through static methods.
5) CassandraServer was modified to call QueryContext.startQuery() and stopQuery() to setup and clear the thread local query context. This is done at the entry point for each thrift method that modifies / reads data.
6) o.a.c.concurrent.DebuggableThreadPoolExecutor was modified to copy the QueryContext if present and set it in the worker thread. The changes to the submit() and execute() methods mimic the implementation of java.util.concurrent.ThreadPool . This allows the context to flow from the connection thread to read / mutate thread pools as well as any other cross thread call a query makes.
7) o.a.c.net.Message and MessageDeliveryTask were modified to include the QueryContext in cross node messages. The context is only sent if present AND if the query started on the local node (this stops the context flowing back to the origin). Serialising the QueryContext leverages the fact that the Message will already have the IP address in it.
8) NOTE: The query context will only survive one hop through the cluster.
9) the slf4j-log412 jar has been removed and the code copied into the main tree. When slf4j initialises it looks for an implementation of org.slf4j.impl.StaticLoggerBinder , this will now be the implementation in the source tree rather than the jar in lib.
10) I modified the org.slf4j.impl.Log4jLoggerAdapter implementation, the Log4jLoggerFactory will wrap all log4j loggers in this adapter. The adapter now checks the thread local QueryContext and elevates all TRACE and DEBUG messages to info if query details have been enabled. As well as prepend each messages with either the "query_id@origin_ip" or "query_id@origin_ip message id".
11) Added a line to ColumnFamilyStore to log how many SSTables were read during the query.
Here's an example of the log output for a get() call starting at node2 and calling node1, the prefix is "query 1@/127.0.0.2 - " and "query 1@/127.0.0.2 message 126 -"
INFO [pool-1-thread-1] 2011-01-13 01:39:03,423 CassandraServer.java (line 329) query 1@/127.0.0.2 - get
INFO [pool-1-thread-1] 2011-01-13 01:39:03,458 StorageService.java (line 1391) query 1@/127.0.0.2 - Sorted endpoints are /127.0.0.1
INFO [pool-1-thread-1] 2011-01-13 01:39:03,460 StorageProxy.java (line 377) query 1@/127.0.0.2 - weakread reading SliceByNamesReadCommand(table='Keyspace1', key=666f6f35, columnParent='QueryPath(columnFamilyName='Standard2', superColumnName='null', columnName='null')', columns=[bar,]) from 126@/127.0.0.1
DEBUG [RequestResponseStage:1] 2011-01-13 01:39:03,489 ResponseVerbHandler.java (line 59) Processing response on an async result from 126@/127.0.0.1
INFO [pool-1-thread-1] 2011-01-13 01:39:03,499 QueryContext.java (line 71) query 1@/127.0.0.2 - returning to client, async processing may continue
INFO [ReadStage:3] 2011-01-13 01:39:03,481 ColumnFamilyStore.java (line 1317) query 1@/127.0.0.2 message 126 - SSTables read 1
INFO [ReadStage:3] 2011-01-13 01:39:03,484 ReadVerbHandler.java (line 90) query 1@/127.0.0.2 message 126 - Read key 666f6f35; sending response to 126@/127.0.0.2
NOTE: The query details did not appear in the message on the RequestResponseStage thread because the context was not sent in the return message from node1.
This is my first major ticket, so please let me know of any issues. I chose to replace the slf4j adapter to remove the need to chase down all the log calls and check the QueryContext.