Hive / HIVE-14864

Distcp is not called from MoveTask when src is a directory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels: None

      Description

      In FileUtils.java, the following code is not executed even when the size of the src directory exceeds HIVE_EXEC_COPYFILE_MAXSIZE, because
      srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We should use srcFS.getContentSummary(src).getLength() instead.

          /* Run distcp if source file/dir is too big */
          if (srcFS.getUri().getScheme().equals("hdfs") &&
              srcFS.getFileStatus(src).getLen() > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
            LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. (MAX: " + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
            LOG.info("Launch distributed copy (distcp) job.");
            HiveConfUtil.updateJobCredentialProviders(conf);
            copied = shims.runDistCp(src, dst, conf);
            if (copied && deleteSource) {
              srcFS.delete(src, true);
            }
          }
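
To make the difference concrete, here is a minimal, self-contained sketch of the size check. It does not use Hadoop or Hive classes: `Node`, `statusLen()`, `contentLength()`, `shouldRunDistCpOld` and `shouldRunDistCpProposed` are hypothetical stand-ins that only model the behavior described above (a status length of 0 for a directory versus the summed length of the files under it), and the 32 MB threshold is an illustrative value, not necessarily Hive's default.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified model of the size check; not Hive's actual FileUtils code. */
public class DistCpSizeCheck {

  /** Stand-in for a path: either a file with a length, or a directory with children. */
  static final class Node {
    final boolean isDirectory;
    final long fileLength;      // meaningful only for files
    final List<Node> children;  // empty for files

    private Node(boolean isDirectory, long fileLength, List<Node> children) {
      this.isDirectory = isDirectory;
      this.fileLength = fileLength;
      this.children = children;
    }

    static Node file(long length) {
      return new Node(false, length, Collections.<Node>emptyList());
    }

    static Node dir(Node... children) {
      return new Node(true, 0L, Arrays.asList(children));
    }

    /** Models getFileStatus(src).getLen() as described in this issue: 0 for a directory. */
    long statusLen() {
      return isDirectory ? 0L : fileLength;
    }

    /** Models getContentSummary(src).getLength(): total bytes of all files underneath. */
    long contentLength() {
      if (!isDirectory) {
        return fileLength;
      }
      long total = 0L;
      for (Node child : children) {
        total += child.contentLength();
      }
      return total;
    }
  }

  /** Check as quoted above: never true for a directory, because its status length is 0. */
  static boolean shouldRunDistCpOld(String scheme, Node src, long maxSize) {
    return "hdfs".equals(scheme) && src.statusLen() > maxSize;
  }

  /** Proposed check: compares the total content size against the threshold. */
  static boolean shouldRunDistCpProposed(String scheme, Node src, long maxSize) {
    return "hdfs".equals(scheme) && src.contentLength() > maxSize;
  }

  public static void main(String[] args) {
    long maxSize = 32L * 1024 * 1024; // illustrative threshold only
    Node dir = Node.dir(Node.file(20L * 1024 * 1024), Node.file(20L * 1024 * 1024));

    System.out.println("old check:      " + shouldRunDistCpOld("hdfs", dir, maxSize));
    System.out.println("proposed check: " + shouldRunDistCpProposed("hdfs", dir, maxSize));
  }
}
```

In this model the directory holds 40 MB in total, so the old check skips distcp while the proposed check selects it. For a plain file the two checks agree, since both lengths are the file's own size.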
      

        Attachments

        1. HIVE-14864.1.patch (3 kB, Sahil Takiar)
        2. HIVE-14864.2.patch (3 kB, Sahil Takiar)
        3. HIVE-14864.3.patch (4 kB, Sahil Takiar)
        4. HIVE-14864.4.patch (11 kB, Sahil Takiar)
        5. HIVE-14864.patch (4 kB, Sahil Takiar)

            People

            • Assignee: Sahil Takiar (stakiar)
            • Reporter: Vihang Karajgaonkar (vihangk1)