Hadoop HDFS: HDFS-10616

Improve performance of path handling


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: hdfs
    • Labels: None

      Description

      Path handling in the namesystem and directory is very inefficient. Throughout the system, a path is repeatedly resolved, decomposed into path components, recombined into a full path, and parsed again. This hurts general performance directly, and indirectly through unnecessary pressure on young-gen GC.

      The namesystem should operate only on paths and parse each path once into inodes; the directory should operate only on inodes.
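
      The split proposed here can be illustrated with a minimal sketch. Everything in it is hypothetical: `Node`, `ResolvedPath`, and `Namespace` are simplified stand-ins invented for illustration, not the HDFS `INode`, `INodesInPath`, or `FSNamesystem` classes. It assumes one layer that accepts a path string and parses it exactly once, and a second layer whose operations take only the already-resolved nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: these classes are hypothetical stand-ins, not HDFS code.
final class ResolveOnceSketch {

    /** Minimal stand-in for an inode: a name plus child nodes. */
    static final class Node {
        final String name;
        final Map<String, Node> children = new HashMap<>();

        Node(String name) { this.name = name; }

        Node addChild(String childName) {
            Node child = new Node(childName);
            children.put(childName, child);
            return child;
        }
    }

    /** Result of resolving a path string once: the nodes found along it. */
    static final class ResolvedPath {
        final List<Node> nodes;    // root first; stops at the first missing component
        final int componentCount;  // number of non-empty components in the path

        ResolvedPath(List<Node> nodes, int componentCount) {
            this.nodes = nodes;
            this.componentCount = componentCount;
        }

        boolean exists() { return nodes.size() == componentCount + 1; }

        Node last() { return exists() ? nodes.get(nodes.size() - 1) : null; }
    }

    /** "Namesystem" layer: the only place that touches path strings. */
    static final class Namespace {
        final Node root = new Node("");
        int parseCount = 0;  // how many times a path string has been split

        ResolvedPath resolve(String path) {
            parseCount++;
            List<Node> nodes = new ArrayList<>();
            nodes.add(root);
            Node current = root;
            int components = 0;
            for (String component : path.split("/")) {
                if (component.isEmpty()) continue;  // skip the leading empty token
                components++;
                if (current != null) {
                    current = current.children.get(component);
                    if (current != null) nodes.add(current);
                }
            }
            return new ResolvedPath(nodes, components);
        }
    }

    /** "Directory" layer: operates on resolved nodes only, never on strings. */
    static boolean exists(ResolvedPath rp) { return rp.exists(); }

    static int childCount(ResolvedPath rp) {
        Node node = rp.last();
        return node == null ? 0 : node.children.size();
    }

    public static void main(String[] args) {
        Namespace ns = new Namespace();
        ns.root.addChild("user").addChild("alice").addChild("data");

        ResolvedPath rp = ns.resolve("/user/alice");  // the only parse
        System.out.println(exists(rp));      // reuses the resolved nodes
        System.out.println(childCount(rp));  // reuses them again, no re-parse
        System.out.println(ns.parseCount);
    }
}
```

      In this sketch, each additional directory-level call reuses the node list instead of splitting and walking the string again, which is the kind of repeated work and short-lived allocation the description refers to.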

        Attachments

        1. 2.6-2.7.1-heap.png
          75 kB
          Daryn Sharp

        Sub-Tasks

        1. Cache path in InodesInPath (Sub-task, Resolved, Daryn Sharp)
        2. Optimize conversion from path string to components (Sub-task, Closed, Daryn Sharp)
        3. Fix path related byte array conversion bugs (Sub-task, Resolved, Daryn Sharp)
        4. Optimize conversion of byte arrays back to path string (Sub-task, Resolved, Daryn Sharp)
        5. Optimize UTF8 string/byte conversions (Sub-task, Resolved, Daryn Sharp)
        6. Optimize FSPermissionChecker's internal path usage (Sub-task, Resolved, Daryn Sharp)
        7. Optimize creating a full path from an inode (Sub-task, Resolved, Daryn Sharp)
        8. Optimize FSPermissionChecker group membership check (Sub-task, Resolved, Daryn Sharp)
        9. Internally optimize path component resolution (Sub-task, Resolved, Daryn Sharp)
        10. Directly resolve paths into INodesInPath (Sub-task, Resolved, Daryn Sharp)
        11. Pass IIP for file status related methods (Sub-task, Resolved, Daryn Sharp)
        12. Optimize mkdir ops (Sub-task, Resolved, Daryn Sharp)
        13. Reduce byte/string conversions for get listing (Sub-task, Resolved, Daryn Sharp)
        14. Rename does not need to re-solve destination (Sub-task, Resolved, Daryn Sharp)
        15. FSDirStatAndListingOp: stop passing path as string (Sub-task, Resolved, Daryn Sharp)
        16. Cache symlinkString in INodeSymlink (Sub-task, Patch Available, Yiqun Lin)
        17. Reduce performance penalty of encryption zones (Sub-task, Resolved, Daryn Sharp)
        18. Pass IIP for FSDirAttr methods (Sub-task, Resolved, Daryn Sharp)
        19. Remove rename/delete performance penalty when not using snapshots (Sub-task, Resolved, Daryn Sharp)
        20. Pass IIP for FSDirDeleteOp methods (Sub-task, Resolved, Daryn Sharp)
        21. Optimize check for existence of parent directory (Sub-task, Resolved, Daryn Sharp)
        22. Reduce number of path resolving methods (Sub-task, Resolved, Daryn Sharp)
        23. Reuse iip in unprotectedRemoveXAttrs calls (Sub-task, Resolved, Xiao Chen)

            People

            • Assignee: Daryn Sharp (daryn)
            • Reporter: Daryn Sharp (daryn)
            • Votes: 0
            • Watchers: 24
