Hadoop Common / HADOOP-9884

Hadoop calling du -sk is expensive

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Done

    Description

      On numerous occasions we've had customers worry about slowness while Hadoop calls du -sk under the hood. For most of these users, getting the information from df would be sufficient and much faster. In fact, there is a hack going around, and it is quite common, that replaces du with df. Sometimes people have to tune the vcache. What if we just allowed users to use the df information instead of the du information, with a patch and a config setting? I'd be glad to code it up.
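To make the trade-off concrete, here is a minimal Java sketch of the two measurement styles. It is not Hadoop's implementation: the class name `SpaceUsedSketch`, the method names, and the `useDf` flag are hypothetical stand-ins for the proposed config setting. The du-style method sums apparent file lengths rather than allocated disk blocks, so it only approximates what du -sk reports.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class SpaceUsedSketch {

    /** du-style: visit every file under dir and sum the sizes. Cost grows with the number of files. */
    static long usedByWalk(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }

    /** df-style: a single query against the file system. Reports usage of the whole volume, not just dir. */
    static long usedByVolume(Path dir) throws IOException {
        FileStore store = Files.getFileStore(dir);
        return store.getTotalSpace() - store.getUnallocatedSpace();
    }

    /** Hypothetical switch standing in for the config setting proposed in this issue. */
    static long spaceUsed(Path dir, boolean useDf) throws IOException {
        return useDf ? usedByVolume(dir) : usedByWalk(dir);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("du-style bytes: " + spaceUsed(dir, false));
        System.out.println("df-style bytes: " + spaceUsed(dir, true));
    }
}
```

The df-style number is only a reasonable substitute when the data directory has the volume more or less to itself, since it counts everything stored on that file system.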

            People

              Assignee: Unassigned
              Reporter: Alex Newman (posix4e)
              Votes: 0
              Watchers: 21