  1. Hadoop HDFS
  2. HDFS-17216

When distcp handles small files, the bandwidth parameter becomes ineffective, resulting in serious overspeed behavior


Details

    Description

      When distcp copies small files (file size slightly smaller than the bandwidth limit), the throttler only starts to throttle after 1 second, and throttling is applied to each file individually. As a result the throttler becomes ineffective, and distcp can fill the cluster bandwidth and crush production traffic, which is a serious problem.

      Also, setting up the I/O pipeline for each file takes time, so very small files should not be used for testing: the setup overhead slows the transfer, and once the bandwidth limit kicks in, the impact of small files on the measured rate is amplified.
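
      The effect described above can be illustrated with a small arithmetic model. This is only a sketch under stated assumptions, not DistCp's actual code: the class and method names are hypothetical, each file is assumed to get its own throttler, the throttler is assumed to do nothing during the first second of a file's copy, and pipeline setup time is ignored.

```java
// Hypothetical model of per-file throttling; not taken from the DistCp source.
class PerFileThrottleSketch {

    /**
     * Seconds needed to copy one file, assuming the throttler only
     * engages once the copy of that file has run for one second.
     */
    public static double copySeconds(long fileBytes, long limitBytesPerSec, long linkBytesPerSec) {
        double unthrottled = (double) fileBytes / linkBytesPerSec;
        if (unthrottled <= 1.0) {
            // The file finishes before the throttler ever engages.
            return unthrottled;
        }
        // Longer copies are held to the configured average rate
        // (or to the link rate, if the link is the slower of the two).
        return Math.max(unthrottled, (double) fileBytes / limitBytesPerSec);
    }

    /** Average bytes per second over a run of equally sized files copied one after another. */
    public static double effectiveRate(int fileCount, long fileBytes,
                                       long limitBytesPerSec, long linkBytesPerSec) {
        double totalSeconds = fileCount * copySeconds(fileBytes, limitBytesPerSec, linkBytesPerSec);
        return fileCount * (double) fileBytes / totalSeconds;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long limit = 10 * mb;   // assumed bandwidth setting per copier
        long link = 100 * mb;   // assumed raw link speed

        // Many files slightly smaller than the limit.
        double small = effectiveRate(1000, 9 * mb, limit, link);
        // One large file of comparable total size.
        double large = effectiveRate(1, 9000 * mb, limit, link);

        System.out.printf("small files: %.1f MB/s%n", small / mb);
        System.out.printf("large file:  %.1f MB/s%n", large / mb);
    }
}
```

      Under these assumptions, the run of small files proceeds at roughly the link speed while the single large file is held to the configured limit, which matches the overspeed behavior reported here.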

      Attachments

        1. DiscpAnalyze.jpg
          462 kB
          xiaojunxiang

            People

              bigdata_zoodev xiaojunxiang
              Votes: 0
              Watchers: 4