  1. Hadoop HDFS
  2. HDFS-16984

Directory timestamp lost during the upgrade process

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.10.2, 3.3.6
    • Fix Version/s: None
    • Component/s: hdfs

    Description

      Symptoms

      The access timestamp of a directory is lost after upgrading an HDFS cluster from 2.10.2 to 3.3.6.

      Reproduce

      Start a four-node HDFS cluster running version 2.10.2.

      Execute the following commands. (The client runs on the NameNode; the command sequence has been minimized for reproduction.)

      bin/hdfs dfs -mkdir /GUBIkxOc
      bin/hdfs dfs -put -f -p -d /tmp/upfuzz/hdfs/GUBIkxOc/bQfxf /GUBIkxOc/
      bin/hdfs dfs -mkdir /GUBIkxOc/sKbTRjvS

      Perform a read in the old version:

      bin/hdfs dfs -ls     -t  -r -u /GUBIkxOc/
      
      Found 2 items
      drwxr-xr-x   - root  supergroup          0 1970-01-01 00:00 /GUBIkxOc/sKbTRjvS
      drwxr-xr-x   - 20001 998                 0 2023-04-17 16:15 /GUBIkxOc/bQfxf

      Then perform a full-stop upgrade of the entire cluster to 3.3.6, following the upgrade procedure on the website: (1) enter safe mode, (2) run rolling upgrade prepare, (3) exit safe mode. Once all nodes have started up in the new version, perform the same read:
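
      The three prepare steps listed above can be sketched as a small script. This is only an illustration, not the exact commands used in the report: it assumes the standard dfsadmin subcommands, the HDFS_BIN variable and the DRY_RUN switch are hypothetical helpers added here, and stopping and restarting the daemons in the new version is not shown.

```shell
# Sketch only: HDFS_BIN and DRY_RUN are hypothetical helpers, not part of Hadoop.
# HDFS_BIN defaults to bin/hdfs, the path used by the commands in this report.
HDFS_BIN="${HDFS_BIN:-bin/hdfs}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

prepare_upgrade() {
  run "$HDFS_BIN" dfsadmin -safemode enter          # (1) enter safe mode
  run "$HDFS_BIN" dfsadmin -rollingUpgrade prepare  # (2) rolling upgrade prepare
  run "$HDFS_BIN" dfsadmin -safemode leave          # (3) exit safe mode
}
```

      Setting DRY_RUN=1 before calling prepare_upgrade prints the three commands without contacting the cluster, which is useful for checking the sequence first.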

      Found 2 items
      drwxr-xr-x   - 20001 998                 0 1970-01-01 00:00 /GUBIkxOc/bQfxf
      drwxr-xr-x   - root  supergroup          0 1970-01-01 00:00 /GUBIkxOc/sKbTRjvS 

      The access timestamp of directory /GUBIkxOc/bQfxf is lost: it changes from 2023-04-17 16:15 to 1970-01-01 00:00.
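
      To spot this kind of change without reading the listings by eye, the two outputs can be compared mechanically. The helper below is a minimal sketch, not part of the report or of Hadoop: changed_atimes is a hypothetical function name, it assumes both listings were saved to files, and it assumes the eight-column layout shown above with paths that contain no spaces.

```shell
# Sketch only: changed_atimes is a hypothetical helper.
# Usage: changed_atimes BEFORE_FILE AFTER_FILE
# Prints every path whose date/time columns differ between the two listings.
changed_atimes() {
  awk '
    $1 == "Found" { next }                          # skip the "Found N items" header
    NF < 8        { next }                          # skip blank or malformed lines
    FNR == NR     { before[$8] = $6 " " $7; next }  # first file: remember timestamp per path
    ($8 in before) && before[$8] != $6 " " $7 {     # second file: report differences
      print $8 ": " before[$8] " -> " $6 " " $7
    }
  ' "$1" "$2"
}
```

      For example, redirect the output of bin/hdfs dfs -ls -t -r -u /GUBIkxOc/ to before.txt on the old version and to after.txt on the new version, then run changed_atimes before.txt after.txt.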

      Note: the rolling upgrade prepare step must be run after the commands above have been executed.

      I have also attached the required file: /tmp/upfuzz/hdfs/GUBIkxOc/bQfxf

      Root Cause

      When the FSImage is created, the access time field of a directory is not persisted.

      If users upgrade without creating an FSImage, the bug does not occur, because the access time is recorded in the edit log. However, once an FSImage is created, all edit logs preceding it are invalidated. When the new version starts up, it reconstructs the in-memory file system from the FSImage alone and ignores those edit logs, so the directory's access time is lost.

      We should make sure the access time of a directory is properly persisted, just as it is for files. I have submitted a PR with a fix.

      Attachments

        1. GUBIkxOc.tar.gz
          107 kB
          Ke Han

            People

              Assignee: Unassigned
              Reporter: Ke Han (kehan5800)
              Votes: 0
              Watchers: 3
