Hadoop Map/Reduce / MAPREDUCE-2459

Cache HAR filesystem metadata

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: harchive
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Each HAR file system has two index files that contain information on how files are stored in the part files. During block location calculation, these indexes are reread for every file in the archive. Caching the indexes and the status of the part files would avoid those repeated reads and greatly reduce the number of name node operations during job setup.
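
      The caching idea described above can be sketched as follows. This is a minimal illustration in plain Java, not the code from the attached patches: the class names `ArchiveIndex`, `IndexCache`, and `HarCacheDemo`, the `partName:offset:length` entry format, and the archive URI are all hypothetical, and the loader lambda stands in for whatever actually reads and parses the index files. The point is only that the parsed index is loaded once per archive and reused for every file lookup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical stand-in for the parsed contents of an archive's index files. */
final class ArchiveIndex {
    // Maps a path inside the archive to an assumed "partName:offset:length" string.
    final Map<String, String> entries;

    ArchiveIndex(Map<String, String> entries) {
        this.entries = Collections.unmodifiableMap(new HashMap<>(entries));
    }
}

/** Keeps one parsed index per archive so later lookups skip the reread. */
final class IndexCache {
    private final ConcurrentHashMap<String, ArchiveIndex> cache = new ConcurrentHashMap<>();
    private final Function<String, ArchiveIndex> loader;

    // Counts how often the (expensive) loader ran; useful for checking the cache works.
    final AtomicInteger loads = new AtomicInteger();

    IndexCache(Function<String, ArchiveIndex> loader) {
        this.loader = loader;
    }

    ArchiveIndex get(String archiveUri) {
        // computeIfAbsent invokes the loader only when the archive is not cached yet.
        return cache.computeIfAbsent(archiveUri, uri -> {
            loads.incrementAndGet();
            return loader.apply(uri);
        });
    }
}

class HarCacheDemo {
    public static void main(String[] args) {
        // The loader here fabricates a tiny index; a real one would read the index files.
        IndexCache cache = new IndexCache(uri -> {
            Map<String, String> entries = new HashMap<>();
            entries.put("/a.txt", "part-0:0:120");
            entries.put("/b.txt", "part-0:120:80");
            return new ArchiveIndex(entries);
        });

        String archive = "har:///user/demo/logs.har"; // hypothetical archive URI
        for (String file : new String[] {"/a.txt", "/b.txt", "/a.txt"}) {
            String location = cache.get(archive).entries.get(file);
            System.out.println(file + " -> " + location);
        }
        System.out.println("index loads: " + cache.loads.get());
    }
}
```

      Keying the cache by archive rather than by file is what removes the per-file reread: three lookups against the same archive trigger a single load. A real implementation would also need to decide when cached entries become stale, which this sketch does not address.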

      Attachments

        1. MAPREDUCE-2459.1.patch
          15 kB
          Mac Yang
        2. MAPREDUCE-2459.2.patch
          16 kB
          Mac Yang

            People

              Assignee: Mac Yang (macyang)
              Reporter: Mac Yang (macyang)
              Votes: 0
              Watchers: 6