Hadoop HDFS / HDFS-8964

When validating the edit log, do not read at or beyond the file offset that is being written

Details

    Description

      The NN and JN validate in-progress edit log files in multiple scenarios via EditLogFile#validateLog. The method scans through the edit log file to find the last transaction ID.

      However, an in-progress edit log file may be actively written to while it is being validated. This race condition can cause incorrect data to be read, which we later attempt to interpret as ops. This causes problems for INotify, which reads edit log entries while the edit log is still being written.

      Currently validateLog is used in 3 places:

      1. NN getEditsFromTxid
      2. JN getEditLogManifest
      3. NN/JN recoverUnfinalizedSegments

      In the first two scenarios, the caller should provide a maximum TxId so that validation stops there and does not read further into the in-progress file. The third scenario cannot cause a race condition, because only non-current in-progress edit log files are validated.
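      The bounded scan proposed for the first two scenarios can be illustrated with a minimal, self-contained sketch. This is not the Hadoop implementation or the attached patch: the class name `BoundedEditLogScan`, the methods `writeOp` and `scan`, the parameter `maxTxIdToScan`, and the record layout (8-byte txid, 4-byte payload length, payload) are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Illustrative only: a toy edit log, not the real HDFS on-disk format.
class BoundedEditLogScan {
    static final long INVALID_TXID = -1L;

    // Hypothetical record layout: 8-byte txid, 4-byte payload length, payload.
    static void writeOp(DataOutputStream out, long txId, byte[] payload)
            throws IOException {
        out.writeLong(txId);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Returns the last txid found, reading no record past maxTxIdToScan.
    static long scan(byte[] log, long maxTxIdToScan) throws IOException {
        long lastTxId = INVALID_TXID;
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(log))) {
            // Stop as soon as the caller-supplied bound is reached, so bytes
            // that a writer may still be appending are never interpreted.
            while (lastTxId < maxTxIdToScan) {
                long txId = in.readLong();
                int len = in.readInt();
                if (len < 0 || in.skipBytes(len) != len) {
                    break; // truncated or corrupt record
                }
                lastTxId = txId;
            }
        } catch (EOFException e) {
            // Ran out of bytes in the middle of a record header.
        }
        return lastTxId;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (long txId = 1; txId <= 3; txId++) {
            writeOp(out, txId, new byte[] {1, 2, 3});
        }
        // Simulate a writer caught mid-append: header for txid 4, no payload.
        out.writeLong(4L);
        out.writeInt(100);
        out.flush();
        byte[] log = buf.toByteArray();

        System.out.println(scan(log, 3L)); // prints 3
        System.out.println(scan(log, 2L)); // prints 2
    }
}
```

      In this toy format the half-written record happens to be detectable as truncated. In a real race the partially written bytes may look like a valid op, which is why the sketch bounds the scan by a known-good txid instead of relying on corruption detection.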

      validateLog is actually only used with in-progress files, and could use a better name and Javadoc.

      Attachments

        1. HDFS-8964.00.patch
          13 kB
          Zhe Zhang
        2. HDFS-8964.01.patch
          12 kB
          Zhe Zhang
        3. HDFS-8964.02.patch
          15 kB
          Zhe Zhang
        4. HDFS-8964.03.patch
          22 kB
          Zhe Zhang
        5. HDFS-8964.04.patch
          22 kB
          Zhe Zhang
        6. HDFS-8964.05.patch
          22 kB
          Zhe Zhang
        7. HDFS-8964.06.patch
          22 kB
          Zhe Zhang

          People

            zhz Zhe Zhang
            Votes: 0
            Watchers: 9
