Hadoop HDFS / HDFS-15289

Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels: None

    Description

      ViewFS provides the flexibility to mount different filesystem types through a mount-point configuration table. This approach addresses the scalability problems, but users must reconfigure their filesystem to ViewFS and switch to its scheme (viewfs://). That is problematic when paths are persisted in metastores such as Hive, which stores URIs in its metastore. Changing the filesystem scheme therefore creates the burden of upgrading or recreating metastores. In our experience, many users are not ready to make that change.

      Router-based federation is another implementation that provides coordinated mount points for HDFS federation clusters. Although it makes mount points easy to manage, it does not allow other (non-HDFS) filesystems to be mounted, so it does not help users who want to mount external (non-HDFS) filesystems.

      So the problem is this: although many users want to adopt the scalable filesystem options available, the technical challenge of changing schemes in existing deployments (e.g., in metastores) is holding them back.

      We therefore propose to allow the hdfs scheme in a ViewFS-like client-side mount system, and to let users create mount links without changing URI paths.
      I will upload a detailed design doc shortly.
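
      To make the proposal concrete, here is a minimal sketch of client-side mount resolution in which the caller keeps using an hdfs:// URI and only the resolved target changes. It is an illustration under stated assumptions, not the ViewFileSystemOverloadScheme implementation: the class MountTableSketch, its methods, and the cluster, namespace, and bucket names are all hypothetical, and only the JDK is used.

```java
import java.net.URI;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a client-side mount table; not Hadoop code. */
final class MountTableSketch {
    // Mount point (path prefix) -> target filesystem URI.
    private final Map<String, URI> links = new TreeMap<>();

    /** Registers a mount link, e.g. "/backup" -> "s3a://bucket1/backup". */
    void addLink(String mountPoint, String target) {
        links.put(mountPoint, URI.create(target));
    }

    /**
     * Resolves a user-facing URI to its target using the longest matching
     * mount-point prefix. Paths that match no mount link pass through unchanged.
     */
    URI resolve(URI userUri) {
        String path = userUri.getPath();
        String best = null;
        for (String mount : links.keySet()) {
            // Match whole path components only: "/warehouse" must not match "/warehouse2".
            boolean matches = path.equals(mount) || path.startsWith(mount + "/");
            if (matches && (best == null || mount.length() > best.length())) {
                best = mount;
            }
        }
        if (best == null) {
            return userUri;
        }
        String remainder = path.substring(best.length());
        return URI.create(links.get(best).toString() + remainder);
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.addLink("/warehouse", "hdfs://ns1/warehouse");
        table.addLink("/backup", "s3a://bucket1/backup");

        // The URI stored in a metastore keeps its hdfs scheme; only the target differs.
        System.out.println(table.resolve(URI.create("hdfs://cluster/warehouse/db/t1")));
        // prints hdfs://ns1/warehouse/db/t1
        System.out.println(table.resolve(URI.create("hdfs://cluster/backup/2020")));
        // prints s3a://bucket1/backup/2020
    }
}
```

      The point of the sketch is that the persisted path (hdfs://cluster/...) never changes: the mount table alone decides whether a path is served by an HDFS namespace or by a non-HDFS filesystem.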

      Attachments

        1. ViewFSOverloadScheme - V1.0.pdf (166 kB, Uma Maheswara Rao G)
        2. ViewFSOverloadScheme.png (186 kB, Uma Maheswara Rao G)

        Issue Links

          1. Extend ViewFS and provide ViewFSOverloadScheme implementation with scheme configurable. (Sub-task, Resolved, Uma Maheswara Rao G)
          2. Make mount-table to read from central place ( Let's say from HDFS) (Sub-task, Resolved, Uma Maheswara Rao G)
          3. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same. (Sub-task, Resolved, Uma Maheswara Rao G)
          4. Make DFSAdmin tool to work with ViewFSOverloadScheme (Sub-task, Resolved, Uma Maheswara Rao G)
          5. Document the ViewFSOverloadScheme details in ViewFS guide (Sub-task, Resolved, Uma Maheswara Rao G)
          6. DFSAdmin should close filesystem and dfsadmin -setBalancerBandwidth should work with ViewFSOverloadScheme (Sub-task, Resolved, Ayush Saxena)
          7. Add all available fs.viewfs.overload.scheme.target.<scheme>.impl classes in core-default.xml bydefault. (Sub-task, Resolved, Uma Maheswara Rao G)
          8. Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml (Sub-task, Resolved, Siyao Meng)
          9. FSUsage$DF should consider ViewFSOverloadScheme in processPath (Sub-task, Resolved, Uma Maheswara Rao G)
          10. ViewFileSystemOverloadScheme should represent mount links as non symlinks (Sub-task, Resolved, Uma Maheswara Rao G)
          11. Fix ContentSummary for mount links in ViewFileSystemOverloadScheme (Sub-task, Resolved, Unassigned)
          12. Merged ListStatus with Fallback target filesystem and InternalDirViewFS. (Sub-task, Resolved, Uma Maheswara Rao G)
          13. mkdirs should work when parent dir is internalDir and fallback configured. (Sub-task, Resolved, Uma Maheswara Rao G)
          14. Default mount table name used by ViewFileSystem should be configurable (Sub-task, Resolved, Virajith Jalaparti)
          15. create should work when parent dir is internalDir and fallback configured. (Sub-task, Resolved, Uma Maheswara Rao G)
          16. Fix NN trash emptier to work if ViewFSOveroadScheme enabled (Sub-task, Resolved, Uma Maheswara Rao G)
          17. Optionally ignore port number in mount-table name when picking from initialized uri (Sub-task, Resolved, Uma Maheswara Rao G)
          18. ViewFsOverloadScheme should work when -fs option pointing to remote cluster without mount links (Sub-task, Resolved, Uma Maheswara Rao G)
          19. ViewFsOverloadScheme should not display error message with "viewfs://" even when it's initialized with other fs. (Sub-task, Resolved, Uma Maheswara Rao G)
          20. When Empty mount points, we are assigning fallback link to self. But it should not use full URI for target fs. (Sub-task, Resolved, Uma Maheswara Rao G)
          21. mkdirs on fallback should throw IOE out instead of suppressing and returning false (Sub-task, Resolved, Uma Maheswara Rao G)
          22. Provide DFS API compatible class(ViewDistributedFileSystem), but use ViewFileSystemOverloadScheme inside (Sub-task, Resolved, Uma Maheswara Rao G)
          23. getChildFilesystems should include fallback fs as well (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 40m)
          24. ViewDistributedFileSystem#recoverLease should call super.recoverLease when there are no mounts configured (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 0.5h)
          25. listFiles on root/InternalDir will fail if fallback root has file (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 2h 40m)
          26. Fix the rename issues with fallback fs enabled (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 2h 20m)
          27. ViewDFS#getDelegationToken should not throw UnsupportedOperationException. (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 1h 10m)
          28. ViewHDFS#create(f, permission, cflags, bufferSize, replication, blockSize, progress, checksumOpt) should not be restricted to DFS only. (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 1h)
          29. ViewHDFS#canonicalizeUri should not be restricted to DFS only API. (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 0.5h)
          30. Namenode trashEmptier should not init ViewFs on startup (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 1h)
          31. ViewFileSystemOverloadScheme support specifying mount table loader imp through conf (Sub-task, Resolved, Junfan Zhang; original estimate not specified, time spent 2h 40m)
          32. ViewDistributedFileSystem#rename wrongly using src in the place of dst. (Sub-task, Resolved, Uma Maheswara Rao G; original estimate not specified, time spent 2h 40m)
          33. Fix ViewDFS with mount points for HDFS only API (Sub-task, Resolved, Ayush Saxena; original estimate not specified, time spent 1.5h)
          34. Provide FileContext based ViewFSOverloadScheme implementation (Sub-task, In Progress, Abhishek Das; original estimate not specified, time spent 2h 40m)
          35. mkdir should not create dir in fallback if the dir already in mount Path (Sub-task, Open, Uma Maheswara Rao G)
          36. DFS cacheadmin, ECAdmin, StoragePolicyAdmin commands should handle ViewFSOverloadScheme (Sub-task, Open, Uma Maheswara Rao G)
          37. Implement ViewFsAdmin to list the mount points, target fs for path etc. (Sub-task, Open, Uma Maheswara Rao G)
          38. DistCP fails with ViewHDFS and preserveEC options if the actual target path is non HDFS (Sub-task, Open, Uma Maheswara Rao G)
          39. Provide documentation for ViewHDFS (Sub-task, Open, Uma Maheswara Rao G)
          40. Add resolveMountPath API in FileSystem (Sub-task, Open, Uma Maheswara Rao G)

            People

              Assignee: Uma Maheswara Rao G (umamaheswararao)
              Reporter: Uma Maheswara Rao G (umamaheswararao)
              Votes: 0
              Watchers: 41

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 19h 20m