  1. Hadoop HDFS
  2. HDFS-13744

OIV tool should better handle control characters present in file or directory names

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
    • Fix Version/s: 3.2.0
    • Component/s: hdfs, tools
    • Labels:
      None

      Description

      When file or directory names contain control characters or whitespace, the OIV tool processors can export data in a misleading format.

      In the examples below, EXAMPLE_NAME is the name of both a file and a directory; the directory name ends with a line feed character (the actual production case has multiple line feeds and multiple spaces).

      • Delimited processor case:
        • misleading example:
          /user/data/EXAMPLE_NAME
          ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
          /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
          
        • expected example as suggested by https://tools.ietf.org/html/rfc4180#section-2:
          "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
          "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
          
      • XML processor case:
        • misleading example:
          <inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME
          </name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>
          
          <inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
          
        • expected example as specified in https://www.w3.org/TR/REC-xml/#sec-line-ends:
          <inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME#xA</name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>
          
          <inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
          
      • JSON:
        The OIV Web Processor behaves correctly and produces the following:
        {
          "FileStatuses": {
            "FileStatus": [
              {
                "fileId": 113632535,
                "accessTime": 1494954320141,
                "replication": 3,
                "owner": "user",
                "length": 520,
                "permission": "674",
                "blockSize": 134217728,
                "modificationTime": 1472205657504,
                "type": "FILE",
                "group": "group",
                "childrenNum": 0,
                "pathSuffix": "EXAMPLE_NAME"
              },
              {
                "fileId": 479867791,
                "accessTime": 0,
                "replication": 0,
                "owner": "user",
                "length": 0,
                "permission": "775",
                "blockSize": 0,
                "modificationTime": 1493033668294,
                "type": "DIRECTORY",
                "group": "group",
                "childrenNum": 0,
                "pathSuffix": "EXAMPLE_NAME\n"
              }
            ]
          }
        }
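
The expected Delimited and XML outputs above can be illustrated with a minimal Java sketch. It is not the code from the attached patches: the class `OivNameEscaping` and both helper methods are hypothetical names introduced here for illustration. The delimited helper follows the RFC 4180 section 2 rules (enclose a field in double quotes when it contains a comma, double quote, or line break, and double any embedded quote), so the line feed stays inside the quoted field rather than being rendered as `%x0A`. The XML helper assumes that writing tab, line feed, and carriage return as numeric character references is enough to keep them from being normalized or visually lost.

```java
// Hypothetical helpers, not part of Hadoop: a sketch of how names with
// control characters could be made unambiguous in delimited and XML output.
class OivNameEscaping {

    // Quote a delimited field following RFC 4180 section 2: fields that
    // contain a comma, a double quote, or a line break are enclosed in
    // double quotes, and embedded double quotes are doubled.
    static String quoteDelimitedField(String field) {
        boolean needsQuoting = false;
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == ',' || c == '"' || c == '\n' || c == '\r') {
                needsQuoting = true;
                break;
            }
        }
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Escape text for use as XML element content. Markup characters become
    // entity references; tab, line feed, and carriage return become numeric
    // character references. Other control characters are not valid in
    // XML 1.0 and are passed through unchanged here; a real tool would
    // need an explicit policy for them.
    static String escapeXmlText(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '\t':
                case '\n':
                case '\r':
                    sb.append("&#x")
                      .append(Integer.toHexString(c).toUpperCase())
                      .append(';');
                    break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirName = "EXAMPLE_NAME\n"; // directory name ending in a line feed
        System.out.println(quoteDelimitedField("/user/data/" + dirName) + ",0,...");
        System.out.println("<name>" + escapeXmlText(dirName) + "</name>");
    }
}
```

With this sketch, a consumer that parses the delimited output with an RFC 4180 aware reader sees one record per inode even when the path contains a line feed, and an XML parser returns the name with its trailing line feed intact.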
        

        Attachments

        1. HDFS-13744.01.patch
          4 kB
          Zsolt Venczel
        2. HDFS-13744.02.patch
          4 kB
          Zsolt Venczel
        3. HDFS-13744.03.patch
          4 kB
          Sean Mackrory

            People

            • Assignee: Zsolt Venczel (zvenczel)
            • Reporter: Zsolt Venczel (zvenczel)