Hadoop Common / HADOOP-1283

Eliminate internal UTF8 to String and vice versa conversions in the name-node.

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      The name-node code converts between UTF8 and String internally. One example:
      NameNode.complete(String src, String clientName)
      then it calls
      FSNamesystem.completeFile(new UTF8(src), new UTF8(clientName));
      which in turn finally calls
      FSDirectory.addNode(path.toString(), newNode)
      and in another place
      FSDirectory.getNode(src.toString())

      So the same parameter is converted back and forth several times during a single call.
      We should keep the parameter type consistent across these methods.
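
      A minimal sketch of what a consistent call chain could look like if String were chosen. The classes below are hypothetical stand-ins, not the real FSNamesystem or FSDirectory, and the node value is simplified to a String purely for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the directory tree; not the real FSDirectory.
class DirectorySketch {
    private final Map<String, String> nodes = new HashMap<>();

    // Accepts the path as a String, so callers never convert.
    void addNode(String path, String node) {
        nodes.put(path, node);
    }

    String getNode(String path) {
        return nodes.get(path);
    }
}

// Hypothetical stand-in for FSNamesystem.
class NamesystemSketch {
    private final DirectorySketch dir = new DirectorySketch();

    // Parameters stay Strings from the entry point down to the directory,
    // so there is no new UTF8(src) followed by a later src.toString().
    boolean completeFile(String src, String clientName) {
        dir.addNode(src, "file held by " + clientName);
        return dir.getNode(src) != null;
    }
}

class ConsistentTypesDemo {
    public static void main(String[] args) {
        NamesystemSketch ns = new NamesystemSketch();
        // Illustrative path and client name only.
        System.out.println(ns.completeFile("/user/shv/data.txt", "client-1"));
    }
}
```

      The same shape would apply if Text were chosen instead; the point is only that one type is used along the whole chain.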

      The question is which type should be used: String or Text.
      From previous discussions I remember that Text is more efficient in space and time for
      non-ASCII data. Here we mostly deal with file names and network addresses, which are ASCII.
      Does it make sense to use Text in this case?

      UTF8 is also used as the key type in two maps: pendingCreates and leases.
      These keys should be replaced too.
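
      A rough sketch of the two maps with String keys. It assumes pendingCreates is keyed by file path and leases by client name; the value types here are simplified placeholders, and the class and method names are hypothetical:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the real maps hold name-node internal value types.
class LeaseTableSketch {
    // Assumed to be keyed by file path; String replaces UTF8 as the key.
    private final Map<String, String> pendingCreates = new TreeMap<>();
    // Assumed to be keyed by client name; String replaces UTF8 as the key.
    private final Map<String, Long> leases = new TreeMap<>();

    void startCreate(String src, String clientName, long now) {
        pendingCreates.put(src, clientName);
        leases.put(clientName, now);
    }

    boolean isPending(String src) {
        return pendingCreates.containsKey(src);
    }

    String holderOf(String src) {
        return pendingCreates.get(src);
    }

    boolean hasLease(String clientName) {
        return leases.containsKey(clientName);
    }

    void complete(String src) {
        pendingCreates.remove(src);
    }
}

class LeaseTableDemo {
    public static void main(String[] args) {
        LeaseTableSketch table = new LeaseTableSketch();
        table.startCreate("/tmp/part-0", "client-1", System.currentTimeMillis());
        System.out.println(table.isPending("/tmp/part-0"));
        table.complete("/tmp/part-0");
        System.out.println(table.isPending("/tmp/part-0"));
    }
}
```

      String already implements equals, hashCode and Comparable, so it works as a key in either a HashMap or a TreeMap without a wrapper type.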

      Attachments

        1. EliminateUTF8.patch
          41 kB
          Konstantin Shvachko
        2. EliminateUTF8-2.patch
          43 kB
          Konstantin Shvachko

            People

              Assignee: Konstantin Shvachko
              Reporter: Konstantin Shvachko