Hadoop HDFS / HDFS-15915

Race condition with async edits logging due to updating txId outside of the namesystem log


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.0, 2.10.2, 3.2.3, 3.3.2
    • Component/s: hdfs, namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      FSEditLogAsync creates an FSEditLogOp and populates its fields inside FSNamesystem.writeLock. However, one essential field, the transaction id of the edits op, remains unset until the operation is scheduled for syncing. At that point beginTransaction() sets FSEditLogOp.txid and increments the global transaction count. On a busy NameNode this can happen outside the write lock.
      This causes problems for Observer reads. It can also potentially reorder transactions, so that the Standby applies them in the wrong order.
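
The ordering problem described above can be illustrated with a small, deterministic sketch. It is not the HDFS code or the attached patch: `TxIdRaceSketch`, `Op`, `logEdit`, and `scheduleForSync` are hypothetical stand-ins, and the "unlucky interleaving" is simulated by simply scheduling the second op first. The sketch assumes that one way to close the window is to assign the txid while the write lock is still held.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-ins for illustration only; not the real HDFS classes.
class TxIdRaceSketch {
  static final class Op {
    final String name;
    long txid = -1; // stays unset until a transaction id is assigned
    Op(String name) { this.name = name; }
  }

  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final AtomicLong lastTxId = new AtomicLong(0);

  // Create the op under the write lock; optionally assign its txid there too.
  Op logEdit(String name, boolean assignUnderLock) {
    fsLock.writeLock().lock();
    try {
      Op op = new Op(name);
      if (assignUnderLock) {
        op.txid = lastTxId.incrementAndGet();
      }
      return op;
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  // Runs after the lock is released; assigns a txid only if none is set yet.
  void scheduleForSync(Op op) {
    if (op.txid < 0) {
      op.txid = lastTxId.incrementAndGet();
    }
  }

  // Returns op names in txid order, i.e. the order a Standby would apply them.
  static List<String> run(boolean assignUnderLock) {
    TxIdRaceSketch log = new TxIdRaceSketch();
    Op mkdir = log.logEdit("mkdir", assignUnderLock);   // first under the lock
    Op rename = log.logEdit("rename", assignUnderLock); // second under the lock
    // Simulated unlucky interleaving: the second op is scheduled first.
    log.scheduleForSync(rename);
    log.scheduleForSync(mkdir);
    List<Op> ops = new ArrayList<>(Arrays.asList(mkdir, rename));
    ops.sort(Comparator.comparingLong(o -> o.txid));
    List<String> names = new ArrayList<>();
    for (Op op : ops) {
      names.add(op.name);
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println("txid assigned at scheduling: " + run(false)); // [rename, mkdir]
    System.out.println("txid assigned under lock:    " + run(true));  // [mkdir, rename]
  }
}
```

With deferred assignment the txid order follows the scheduling order rather than the order in which the namespace was modified, which is the reshuffling the description warns about. Assigning the id under the lock keeps the two orders identical in this sketch.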

      Attachments

        1. HDFS-15915-01.patch (26 kB, Konstantin Shvachko)
        2. HDFS-15915-02.patch (26 kB, Konstantin Shvachko)
        3. HDFS-15915-03.patch (26 kB, Konstantin Shvachko)
        4. HDFS-15915-04.patch (26 kB, Konstantin Shvachko)
        5. HDFS-15915-05.patch (26 kB, Konstantin Shvachko)
        6. testMkdirsRace.patch (9 kB, Konstantin Shvachko)

            People

              Assignee: Konstantin Shvachko (shv)
              Reporter: Konstantin Shvachko (shv)
              Votes: 0
              Watchers: 13