Hadoop Common / HADOOP-6097

Multiple bugs w/ Hadoop archives

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.19.2, 0.20.0, 0.20.1
    • Fix Version/s: 0.20.2
    • Component/s: fs
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Bugs fixed for Hadoop archives: character escaping in paths, LineReader, and file system caching.

    Description

      Found and fixed several bugs involving Hadoop archives:

      • In makeQualified(), the sloppy conversion from Path to URI and back mangles the path if it contains a character that needs escaping.
      • fileStatusInIndex() may have to read more than one segment of the index. The LineReader and the count of bytes read need to be reset for each segment.
      • har:// connections cannot be indexed by (scheme, authority, username) alone; the path is significant as well. Caching them this way limits a Hadoop client to opening one archive per filesystem. It seems to be safe not to cache them, since they wrap another connection that does the actual networking.
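
The first and third points can be illustrated with a minimal sketch that uses only java.net.URI and a HashMap. This is not the HarFileSystem code and not the attached patch. The class name HarBugSketch, the authority hdfs-namenode:8020, the user name, and the example paths are all hypothetical. It assumes the mangling comes from encoding an already-encoded path a second time, and it models the cache as a plain map keyed by (scheme, authority, username).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical names, no Hadoop classes involved.
final class HarBugSketch {
    // Hypothetical authority of the underlying file system.
    static final String AUTHORITY = "hdfs-namenode:8020";

    // The multi-argument URI constructor percent-encodes illegal path
    // characters and always encodes a literal '%'.
    static URI build(String path) {
        try {
            return new URI("har", AUTHORITY, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Lossy: feeds the already-encoded path back in, so it is encoded twice.
    static String sloppyRoundTrip(String path) {
        return build(build(path).getRawPath()).getPath();
    }

    // Lossless: passes the decoded path, so it is encoded exactly once.
    static String carefulRoundTrip(String path) {
        return build(build(path).getPath()).getPath();
    }

    // Cache key that ignores the path: all archives on one file system collide.
    static List<String> coarseKey(URI uri, String user) {
        return Arrays.asList(uri.getScheme(), uri.getAuthority(), user);
    }

    public static void main(String[] args) {
        String path = "/user/ben/my file.har/part 1";
        System.out.println("sloppy:  " + sloppyRoundTrip(path));
        System.out.println("careful: " + carefulRoundTrip(path));

        // Model of a cache keyed without the path component.
        Map<List<String>, String> cache = new HashMap<>();
        URI first = build("/user/ben/first.har");
        URI second = build("/user/ben/second.har");
        cache.putIfAbsent(coarseKey(first, "ben"), first.getPath());
        cache.putIfAbsent(coarseKey(second, "ben"), second.getPath());
        // The lookup for the second archive returns the first archive's entry.
        System.out.println("cached for second: " + cache.get(coarseKey(second, "ben")));
    }
}
```

In this sketch the careful round trip returns the original path, while the sloppy one leaves literal %20 sequences in the decoded path. The two archive URIs produce equal cache keys, which is why a cache keyed this way can hold only one archive per underlying filesystem.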

      Attachments

        1. HADOOP-6097.patch
          2 kB
          Ben Slusky
        2. HADOOP-6097-v2.patch
          1 kB
          Thomas White
        3. HADOOP-6097-0.20.patch
          1 kB
          Mahadev Konar
        4. HADOOP-6097-0.20.patch
          4 kB
          Mahadev Konar
        5. HADOOP-6097-0.20.patch
          4 kB
          Mahadev Konar

            People

              Assignee: Ben Slusky (sluskyb)
              Reporter: Ben Slusky (sluskyb)
              Votes: 0
              Watchers: 8
