Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-12652

INodeAttributesProvider#getAttributes(): Avoid multiple conversions of path components byte[][] to String[] when requesting INode attributes

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.0.0-beta1
    • None
    • hdfs
    • None

    Description

      INodeAttributesProvider#getAttributes needs the path components passed in to be an array of Strings. Where as the INode and related layers maintain path components as an array of byte[]. So, these layers are required to convert each byte[] component of the path back into a string and for multiple times when requesting for INode attributes from the Provider.

      That is, the path "/a/b/c" requires calling the attribute provider with: (1) "", (2) "", "a", (3) "", "a","b", (4) "", "a","b", "c". Every single one of those strings were freshly (re)converted from a byte[]. Say, a file listing is done on a huge directory containing 100s of millions of files, then these multiple time redundant conversions of byte[][] to String[] create lots of tiny object garbages, occupying memory and affecting performance. Better if we could avoid creating redundant copies of path component strings.

      Attachments

        Issue Links

          Activity

            People

              manojg Manoj Govindassamy
              manojg Manoj Govindassamy
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated: