Details

    • Sub-task
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • fs/s3
    • None

    Description

      Running jstack on a hung process and reading the stack traces is a helpful way to determine exactly what code in the process is stuck. This would be even more helpful if we included more descriptive information about the specific file system method call.

      Attachments

        Issue Links

          Activity

            cnauroth Chris Nauroth added a comment -

            See HDFS-11063 for a similar proposal I've created on the HDFS NameNode side.

            For example, upon entering a public S3A method, we could append useful information to the thread name, such as the user, the start time of the operation and the file path argument. This information would be visible in the thread names when running jstack. That way, we could see that not only is a thread spending a long time in a globStatus call, but also which user is making the call, which path is referenced, and the start time.

            This proposal would get trickier in combination with some of the plans around asynchronous and parallel execution. We'd need to pass along that contextual information to all of the threads that make up the high-level operation.

            It's important that after the S3A operation completes, we restore the prior value of the thread name. S3A is called from user applications that own the lifecycle of the thread. If applications have set meaningful information into the thread name already, then we don't want that to remain changed after the thread exits the S3A code.

            cnauroth Chris Nauroth added a comment - See HDFS-11063 for a similar proposal I've created on the HDFS NameNode side. For example, upon entering a public S3A method, we could append useful information to the thread name, such as the user, the start time of the operation and the file path argument. This information would be visible in the thread names when running jstack . That way, we could see that not only is a thread spending a long time in a globStatus call, but also which user is making the call, which path is referenced, and the start time. This proposal would get trickier in combination with some of the plans around asynchronous and parallel execution. We'd need to pass along that contextual information to all of the threads that make up the high-level operation. It's important that after the S3A operation completes, we restore the prior value of the thread name. S3A is called from user applications that own the lifecycle of the thread. If applications have set meaningful information into the thread name already, then we don't want that to remain changed after the thread exits the S3A code.
            stevel@apache.org Steve Loughran added a comment -

            can you change thread names on the fly? What's the cost?

            username comes with FS instance; we can include both in construction. Hadn't thought about dynamic ops

            stevel@apache.org Steve Loughran added a comment - can you change thread names on the fly? What's the cost? username comes with FS instance; we can include both in construction. Hadn't thought about dynamic ops
            cnauroth Chris Nauroth added a comment -

            Yes, you can change the thread name at any time. The thread name acts somewhat like a mutable thread-local variable, accessed via Thread#getName() and Thread#setName(String). We have some existing precedent for this in the DataNode, where we set the DataXceiver threads with information about the specific data transfer protocol method call and the block ID:

              @Override
              public void readBlock(final ExtendedBlock block,
                  final Token<BlockTokenIdentifier> blockToken,
                  final String clientName,
                  final long blockOffset,
                  final long length,
                  final boolean sendChecksum,
                  final CachingStrategy cachingStrategy) throws IOException {
                previousOpClientName = clientName;
                long read = 0;
                updateCurrentThreadName("Sending block " + block);
                ...
            
              /**
               * Update the current thread's name to contain the current status.
               * Use this only after this receiver has started on its thread, i.e.,
               * outside the constructor.
               */
              private void updateCurrentThreadName(String status) {
                StringBuilder sb = new StringBuilder();
                sb.append("DataXceiver for client ");
                if (previousOpClientName != null) {
                  sb.append(previousOpClientName).append(" at ");
                }
                sb.append(remoteAddress);
                if (status != null) {
                  sb.append(" [").append(status).append("]");
                }
                Thread.currentThread().setName(sb.toString());
              }
            

            I've never observed changing the thread name to cause any significant cost. It would be good to watch out for the same pitfalls as logging, such as avoiding calls to expensive toString implementations with a lot of string concatenation in a tight loop.

            cnauroth Chris Nauroth added a comment - Yes, you can change the thread name at any time. The thread name acts somewhat like a mutable thread-local variable, accessed via Thread#getName() and Thread#setName(String) . We have some existing precedent for this in the DataNode, where we set the DataXceiver threads with information about the specific data transfer protocol method call and the block ID: @Override public void readBlock( final ExtendedBlock block, final Token<BlockTokenIdentifier> blockToken, final String clientName, final long blockOffset, final long length, final boolean sendChecksum, final CachingStrategy cachingStrategy) throws IOException { previousOpClientName = clientName; long read = 0; updateCurrentThreadName( "Sending block " + block); ... /** * Update the current thread's name to contain the current status. * Use this only after this receiver has started on its thread, i.e., * outside the constructor. */ private void updateCurrentThreadName( String status) { StringBuilder sb = new StringBuilder(); sb.append( "DataXceiver for client " ); if (previousOpClientName != null ) { sb.append(previousOpClientName).append( " at " ); } sb.append(remoteAddress); if (status != null ) { sb.append( " [" ).append(status).append( "]" ); } Thread .currentThread().setName(sb.toString()); } I've never observed changing the thread name to cause any significant cost. It would be good to watch out for the same pitfalls as logging, such as avoiding calls to expensive toString implementations with a lot of string concatenation in a tight loop.
            iyonger Yonger added a comment -

            I agree a thread name with meaningful information is a great idea, but I also think logging the information you mentioned(userid, timestamp, and path) is better than to insert into thread name from a comprehensive perspective. You just need to put information into log component and without additional complexity and performance consideration. And with these meaningful information in logs, we also start a trouble shooting for performance problem, also this approach applied in big company with high throughput computer system.

            iyonger Yonger added a comment - I agree a thread name with meaningful information is a great idea, but I also think logging the information you mentioned(userid, timestamp, and path) is better than to insert into thread name from a comprehensive perspective. You just need to put information into log component and without additional complexity and performance consideration. And with these meaningful information in logs, we also start a trouble shooting for performance problem, also this approach applied in big company with high throughput computer system.

            Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a blocker.

            brahmareddy Brahma Reddy Battula added a comment - Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a blocker.
            slfan1989 Shilun Fan added a comment -

            Bulk update: moved all 3.4.0 non-blocker issues, please move back if it is a blocker. Retarget 3.5.0.

            slfan1989 Shilun Fan added a comment - Bulk update: moved all 3.4.0 non-blocker issues, please move back if it is a blocker. Retarget 3.5.0.

            People

              Unassigned Unassigned
              cnauroth Chris Nauroth
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated: