  1. Hadoop HDFS
  2. HDFS-15915

Race condition with async edits logging due to updating txId outside of the namesystem lock

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.0, 2.10.2, 3.2.3, 3.3.2
    • Component/s: hdfs, namenode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      FSEditLogAsync creates an FSEditLogOp and populates its fields inside FSNamesystem.writeLock. However, one essential field, the transaction id of the edit op, remains unset until the operation is scheduled for syncing. At that point beginTransaction() sets FSEditLogOp.txid and increments the global transaction count. On a busy NameNode this can happen outside the write lock.
      This causes problems for Observer reads. It can also reorder transactions, so that the Standby applies them in the wrong order.
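
The ordering problem described above can be illustrated with a small, single-threaded simulation. This is only a sketch and not the code from the attached patches: the class TxIdRaceSketch, its Op type, and the methods logEditLate, logEditEarly, and replayOrder are hypothetical names introduced for illustration, a plain ReentrantLock stands in for the FSNamesystem write lock, and the thread interleaving is replayed by hand so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class TxIdRaceSketch {
    /** Hypothetical stand-in for FSEditLogOp: a name plus a txid that starts unset. */
    static final class Op {
        final String name;
        long txid = -1; // -1 means "not assigned yet"
        Op(String name) { this.name = name; }
    }

    // Stand-in for the namesystem write lock (assumption: a simple mutex is enough here).
    private final ReentrantLock namesystemLock = new ReentrantLock();
    private long lastTxId = 0;

    /** Assigns the next transaction id, loosely modeled on beginTransaction(). */
    private void beginTransaction(Op op) {
        op.txid = ++lastTxId;
    }

    /** Problematic pattern: the op is created under the lock, the txid is set later. */
    Op logEditLate(String name) {
        namesystemLock.lock();
        try {
            return new Op(name);
        } finally {
            namesystemLock.unlock();
        }
    }

    /** Alternative pattern: the txid is assigned while the lock is still held. */
    Op logEditEarly(String name) {
        namesystemLock.lock();
        try {
            Op op = new Op(name);
            beginTransaction(op);
            return op;
        } finally {
            namesystemLock.unlock();
        }
    }

    /** Returns op names in the order a reader applying edits by txid would see them. */
    static List<String> replayOrder(Op... ops) {
        List<Op> sorted = new ArrayList<>(Arrays.asList(ops));
        sorted.sort(Comparator.comparingLong((Op o) -> o.txid));
        List<String> names = new ArrayList<>();
        for (Op o : sorted) {
            names.add(o.name);
        }
        return names;
    }

    static List<String> lateAssignmentOrder() {
        TxIdRaceSketch fs = new TxIdRaceSketch();
        Op mkdir = fs.logEditLate("mkdir /a");   // applied to the namespace first
        Op delete = fs.logEditLate("delete /a"); // applied to the namespace second
        // Simulated interleaving outside the lock: the second op gets its txid first.
        fs.beginTransaction(delete);
        fs.beginTransaction(mkdir);
        return replayOrder(mkdir, delete);
    }

    static List<String> earlyAssignmentOrder() {
        TxIdRaceSketch fs = new TxIdRaceSketch();
        Op mkdir = fs.logEditEarly("mkdir /a");
        Op delete = fs.logEditEarly("delete /a");
        return replayOrder(mkdir, delete);
    }

    public static void main(String[] args) {
        System.out.println("late:  " + lateAssignmentOrder());
        System.out.println("early: " + earlyAssignmentOrder());
    }
}
```

In the late-assignment case the replay order is the reverse of the order in which the namespace was modified, which is the reshuffling a Standby would observe. Assigning the txid while the lock is held keeps the two orders identical in this sketch.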

        Attachments

        1. HDFS-15915-01.patch
          26 kB
          Konstantin Shvachko
        2. HDFS-15915-02.patch
          26 kB
          Konstantin Shvachko
        3. HDFS-15915-03.patch
          26 kB
          Konstantin Shvachko
        4. HDFS-15915-04.patch
          26 kB
          Konstantin Shvachko
        5. HDFS-15915-05.patch
          26 kB
          Konstantin Shvachko
        6. testMkdirsRace.patch
          9 kB
          Konstantin Shvachko

              People

              • Assignee:
                shv Konstantin Shvachko
                Reporter:
                shv Konstantin Shvachko
              • Votes:
                0
                Watchers:
                11
