Hadoop HDFS / HDFS-5064

Standby checkpoints should not block concurrent readers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.4.0
    • Component/s: ha, namenode
    • Labels:
      None

      Description

      We've observed that fetches of the NameNode's /jmx page take a long time to load while the standby NameNode is creating a checkpoint.

      Both creating the checkpoint and gathering the statistics for /jmx take only the FSNamesystem (FSNS) read lock, so they should be able to proceed concurrently. However, the FSNS uses a fair read-write lock: once a single writer queues for the lock behind the checkpointing thread, every thread that subsequently requests the read lock queues behind that writer. The writer cannot acquire the lock until the checkpoint releases its read lock, so /jmx, and really any thread attempting to get only the read lock, blocks for the duration of the checkpoint.
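      The blocking behavior described above can be reproduced outside of HDFS with a plain fair java.util.concurrent.locks.ReentrantReadWriteLock. The following is a minimal sketch, not NameNode code: the class name FairLockDemo, the method name, and the thread names are hypothetical, and the "checkpointer" thread only stands in for a long-running read-lock holder.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone demo; not part of the HDFS code base.
class FairLockDemo {

    /**
     * Returns true if a new reader obtains the read lock within the timeout
     * while another reader holds it and a writer is already queued.
     */
    static boolean newReaderAcquiresWhileWriterQueued(long timeoutMillis) {
        // "true" selects the fair ordering policy, as the FSNS lock does per the description.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        CountDownLatch readLockHeld = new CountDownLatch(1);
        CountDownLatch finishCheckpoint = new CountDownLatch(1);

        // Stand-in for the checkpoint: holds the read lock until told to finish.
        Thread checkpointer = new Thread(() -> {
            lock.readLock().lock();
            try {
                readLockHeld.countDown();
                finishCheckpoint.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });

        // A single writer that has to wait for the long-running reader.
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            lock.writeLock().unlock();
        });

        try {
            checkpointer.start();
            readLockHeld.await();
            writer.start();
            // Wait until the writer is actually parked in the lock's wait queue.
            while (!lock.hasQueuedThread(writer)) {
                Thread.sleep(1);
            }
            // Stand-in for a /jmx request: it only wants the read lock.
            boolean acquired = lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.readLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Let the "checkpoint" finish so the writer and the JVM can proceed.
            finishCheckpoint.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println(newReaderAcquiresWhileWriterQueued(200));
    }
}
```

      In this sketch the second reader times out even though only a read lock is held, because the fair policy will not let it pass the queued writer. That mirrors the symptom reported here: readers stall for as long as the checkpoint keeps its read lock.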

        Attachments

        1. HDFS-5064.patch
          13 kB
          Aaron T. Myers
        2. HDFS-5064.patch
          11 kB
          Aaron T. Myers


            People

            • Assignee: Aaron T. Myers (atm)
            • Reporter: Aaron T. Myers (atm)
            • Votes: 0
            • Watchers: 14
