  1. Hadoop HDFS
  2. HDFS-17078

Federated Trash Proposal Review



    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs
    • Labels: None


      At LinkedIn, we manage a federated HDFS cluster with multiple namenodes, using View Filesystem (ViewFs), and Router-Based Federation in the future, for mount point management. As we continue to scale, we need to move data across volumes to rebalance it. We aim to move data without manual effort and as seamlessly as possible from a client's perspective. 

      One problem we have come across is the rename/delete operation. In the past, /jobs and /user always stayed on the same volume, so delete worked fine. Now that we are starting to move directories to different volumes, either the request fails because /user/headless_account does not exist on the destination volume, or we have to create it manually. We therefore want an organized way to handle trash properly. 

      How trash is handled now:

      When a new headless account is created, /user/<headless> and /jobs/<headless> are created automatically. All trash is then placed at /user/<headless>/.Trash/Current/pathToFile

      Localized trash, where each namenode keeps its own trash, has worked for us so far.

      Each namenode will clean up its own trash periodically. 


      The /user/<headless> and /jobs/<headless> directories are created automatically on the default namenode as part of the automation for new user and headless account creation. However, on the new namenode that we move the data to, no such /user/<headless> directory is pre-created. 
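
      The current layout and the resulting failure on a new namenode can be sketched as follows. This is a minimal Python illustration, not Hadoop code: the function names, the set-based model of a namenode's directories, and the example account svc_etl are all hypothetical.

```python
import posixpath

def current_trash_path(user: str, path: str) -> str:
    """Trash location under today's layout: /user/<user>/.Trash/Current/<path>."""
    return posixpath.join("/user", user, ".Trash", "Current", path.lstrip("/"))

def home_exists(namenode_dirs: set, user: str) -> bool:
    """True if /user/<user> was pre-created on this (modeled) namenode."""
    return posixpath.join("/user", user) in namenode_dirs

# Hypothetical state: automation created the directories on the default namenode only.
nn01_dirs = {"/user/svc_etl", "/jobs/svc_etl"}
# /jobs/svc_etl has been moved to a second namenode, which has no /user/svc_etl.
nn02_dirs = {"/jobs/svc_etl"}

trash = current_trash_path("svc_etl", "/jobs/svc_etl/run1/part-0")
can_trash_on_nn01 = home_exists(nn01_dirs, "svc_etl")
can_trash_on_nn02 = home_exists(nn02_dirs, "svc_etl")
```

      In this model the delete on the second namenode has no /user/svc_etl to rename into, which corresponds to the failing request described above.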

      What we propose:

      1. Each namenode will have one trash directory (i.e. nn0x:/trash), created manually as part of cluster buildout. Whenever moveToTrash() is called, it will create the corresponding user path underneath it (i.e. nn0x:/trash/headless/pathToFile), if that path does not already exist, to store the trash.
        • An alternative is nn0x:/user/headless/.Trash/Current/pathToFile, which we currently use on nn01. In that case moveToTrash() would create the user directory and the .Trash directory.
          This is not preferred as we think this existing file path with /user/headless prefixed is confusing.
      2. The cleaning mechanism will be extended at the namenode level so that each namenode cleans up its local nn0x:/trash directory.
      3. We plan to use the SAME_FILESYSTEM_ACROSS_MOUNTPOINT strategy. Technically this renames across mount points, but moveToTrash currently breaks for most directories that are mount points, so we believe this is the correct extension of this feature/configuration.
        • Alternatively, a new rename strategy could be created that clients set to opt in, such as fs.viewfs.rename.strategy=FEDERATED_TRASH
      4. Trash that already resides on the old namenode before the move will be left on the old NN. Clients or operators of HDFS can go to the old NN to find it. 
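
      Items 1 and 3 above can be sketched together. This is a simplified Python model under stated assumptions, not Hadoop's implementation: the helper names, the set-based namenode model, the svc_etl account, and the hdfs://nn01 and hdfs://nn02 URIs are hypothetical, and the same-filesystem check only approximates the intent of SAME_FILESYSTEM_ACROSS_MOUNTPOINT by comparing scheme and authority.

```python
import posixpath
from urllib.parse import urlparse

TRASH_ROOT = "/trash"  # per-namenode trash directory from item 1, created at buildout

def proposed_trash_path(user: str, path: str) -> str:
    """Trash location under the proposal: /trash/<user>/<path>."""
    return posixpath.join(TRASH_ROOT, user, path.lstrip("/"))

def move_to_trash(namenode_dirs: set, user: str, path: str) -> str:
    """Create any missing parent directories under /trash, then return the destination."""
    dest = proposed_trash_path(user, path)
    parent = posixpath.dirname(dest)
    while parent != "/" and parent not in namenode_dirs:
        namenode_dirs.add(parent)
        parent = posixpath.dirname(parent)
    return dest

def same_filesystem(src_uri: str, dst_uri: str) -> bool:
    """Approximate check: source and destination share scheme and authority."""
    s, d = urlparse(src_uri), urlparse(dst_uri)
    return (s.scheme, s.netloc) == (d.scheme, d.netloc)

# Hypothetical second namenode: /trash exists, but no per-user directory yet.
nn02_dirs = {"/trash", "/jobs/svc_etl"}
src = "/jobs/svc_etl/run1/part-0"
dest = move_to_trash(nn02_dirs, "svc_etl", src)

local_ok = same_filesystem("hdfs://nn02" + src, "hdfs://nn02" + dest)
cross_nn = same_filesystem("hdfs://nn02" + src,
                           "hdfs://nn01/user/svc_etl/.Trash/Current" + src)
```

      Because the trash directory lives on the same namenode as the data, the rename stays within one filesystem even though it crosses mount points, whereas a rename into the trash on nn01 would not.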




            myou Melissa You