  1. Hadoop HDFS
  2. HDFS-8964

When validating the edit log, do not read at or beyond the file offset that is being written




      The NameNode (NN) and JournalNode (JN) validate in-progress edit log files in several scenarios via EditLogFile#validateLog, which scans the file to find the last transaction ID.

      However, an in-progress edit log file may be actively written to while it is being validated. This creates a race condition: the scan can read incomplete or incorrect data, which is later interpreted as ops. This causes problems for INotify, which reads edit log entries while the edit log is still being written.

      Currently validateLog is used in 3 places:

      1. NN getEditsFromTxid
      2. JN getEditLogManifest
      3. NN/JN recoverUnfinalizedSegments

      In the first two scenarios, the caller should provide a maximum TxId so that validation of the in-progress file stops there. The third scenario cannot trigger the race because it validates only non-current in-progress edit log files.
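
      The bounded scan described above might look roughly like the following. This is only a sketch under stated assumptions: it uses a toy record format (8-byte txid, 4-byte length, payload) rather than the real HDFS edit log layout, and the names BoundedEditLogScan, scanLastTxId, maxTxIdToScan and sampleLog are hypothetical, not taken from the patch. It also assumes the caller already knows the highest TxId that has been completely written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical toy model of an edit log; NOT the real HDFS on-disk format.
// Each record is: 8-byte txid, 4-byte payload length, payload bytes.
class BoundedEditLogScan {
    static final long INVALID_TXID = -1L;

    /**
     * Returns the last txid found that is <= maxTxIdToScan.
     * The loop exits as soon as the bound is reached, so bytes that a
     * concurrent writer may still be producing are never read.
     */
    static long scanLastTxId(byte[] log, long maxTxIdToScan) {
        long last = INVALID_TXID;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            while (last < maxTxIdToScan) {
                long txId = in.readLong();
                if (txId > maxTxIdToScan) {
                    break; // past the caller's bound: do not trust this record
                }
                int len = in.readInt();
                if (len < 0 || len > in.available()) {
                    break; // corrupt or truncated record
                }
                in.readFully(new byte[len]);
                last = txId;
            }
        } catch (EOFException e) {
            // Ran out of bytes inside a header: keep the last complete record.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return last;
    }

    /** Builds three complete records followed by bytes from an unfinished write. */
    static byte[] sampleLog() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (long txId = 1; txId <= 3; txId++) {
                byte[] payload = ("op" + txId).getBytes(StandardCharsets.UTF_8);
                out.writeLong(txId);
                out.writeInt(payload.length);
                out.write(payload);
            }
            // Simulated in-flight data that happens to parse as a record.
            out.writeLong(99L);
            out.writeInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] log = sampleLog();
        System.out.println(scanLastTxId(log, 3L));             // prints 3
        System.out.println(scanLastTxId(log, Long.MAX_VALUE)); // prints 99
    }
}
```

      With the bound set to 3 the scan stops after the third record. With no effective bound it goes on to treat the trailing bytes as transaction 99, which is the kind of misread the race can produce.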

      Because validateLog is only ever used with in-progress files, it could also use a more descriptive name and Javadoc.


        1. HDFS-8964.06.patch
          22 kB
          Zhe Zhang
        2. HDFS-8964.05.patch
          22 kB
          Zhe Zhang
        3. HDFS-8964.04.patch
          22 kB
          Zhe Zhang
        4. HDFS-8964.03.patch
          22 kB
          Zhe Zhang
        5. HDFS-8964.02.patch
          15 kB
          Zhe Zhang
        6. HDFS-8964.01.patch
          12 kB
          Zhe Zhang
        7. HDFS-8964.00.patch
          13 kB
          Zhe Zhang



            • Assignee:
              zhz Zhe Zhang
            • Votes:
              0
            • Watchers:
              10


              • Created: