Hadoop Common / HADOOP-6467

Performance improvement for listStatus on directories in Hadoop archives.


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: fs
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A listStatus call on a directory in a Hadoop archive results in (2 × number of files in the directory) open calls to the NameNode. This is very suboptimal and needs to be fixed so that archives are performant enough for daily use.
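
      To make the cost in the description concrete, here is a minimal toy model in Java. It does not use any Hadoop API and is not the attached patch: the class `ArchiveListingSketch`, its in-memory index, and both listing methods are hypothetical names introduced only for illustration. The only figure taken from the issue is the "two opens per file" cost. How the real fix reduces the number of opens is not stated here, so the batched variant is just one plausible approach (read the index once, then answer every child from it).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: counts simulated "open" calls; no Hadoop classes are involved. */
public class ArchiveListingSketch {
    /** Hypothetical stand-in for an archive index: child name -> file length. */
    private final Map<String, Long> index = new LinkedHashMap<>();
    /** Number of simulated open calls made so far. */
    private int openCalls = 0;

    public ArchiveListingSketch(int fileCount) {
        for (int i = 0; i < fileCount; i++) {
            index.put("part-" + i, 1024L * (i + 1));
        }
    }

    /** Simulates opening an index file; each call counts as one open request. */
    private Map<String, Long> openIndex() {
        openCalls++;
        return index;
    }

    /** Naive listing: two opens per child, the cost described in the issue. */
    public List<String> listNaive() {
        List<String> out = new ArrayList<>();
        for (String name : index.keySet()) {
            openIndex();                       // first open (assumed: locate the entry)
            long len = openIndex().get(name);  // second open (assumed: read the entry)
            out.add(name + ":" + len);
        }
        return out;
    }

    /** Batched listing (an assumption, not the actual patch): one open per directory. */
    public List<String> listBatched() {
        Map<String, Long> idx = openIndex();   // single open, reused for every child
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : idx.entrySet()) {
            out.add(e.getKey() + ":" + e.getValue());
        }
        return out;
    }

    public int openCalls() {
        return openCalls;
    }

    public static void main(String[] args) {
        int files = 5;
        ArchiveListingSketch naive = new ArchiveListingSketch(files);
        naive.listNaive();
        ArchiveListingSketch batched = new ArchiveListingSketch(files);
        batched.listBatched();
        System.out.println("naive opens:   " + naive.openCalls());
        System.out.println("batched opens: " + batched.openCalls());
    }
}
```

      In this model the naive listing grows linearly with the number of files in the directory, while the batched listing stays at a single open regardless of directory size.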

      Attachments

        1. Archives_performance.docx
          111 kB
          Mahadev Konar
        2. Archives_performance.docx
          94 kB
          Mahadev Konar
        3. HADOOP-6467_v3.patch
          4 kB
          Mahadev Konar
        4. HADOOP-6467.patch
          4 kB
          Mahadev Konar
        5. HADOOP-6467.patch
          7 kB
          Mahadev Konar
        6. HADOOP-6467.patch
          6 kB
          Mahadev Konar
        7. HADOOP-6467-v2.patch
          4 kB
          Mahadev Konar
        8. HADOOP-6467-y.0.20-branch-v2.patch
          5 kB
          Mahadev Konar
        9. HADOOP-6467-y.0.20-branch-v2.patch
          4 kB
          Mahadev Konar
        10. HADOOP-6467-y0.20-branch.patch
          4 kB
          Mahadev Konar


          People

            Assignee: Mahadev Konar (mahadev)
            Reporter: Mahadev Konar (mahadev)
            Votes: 0
            Watchers: 7
