PIG-5241 (sub-task of PIG-4059 Pig on Spark, project Pig)

Specify the HDFS path directly to Spark and avoid the unnecessary download and upload in SparkLauncher.java


    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

      Description

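      SparkLauncher.java carries the TODO shown in the method below: specify the HDFS path directly to Spark and avoid the unnecessary download and upload. As written, cacheFiles copies each cache file to a local temporary folder with copyToLocalFile and then passes that local copy to addResourceToSparkJobWorkingDirectory.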

        private void cacheFiles(String cacheFiles) throws IOException {
            if (cacheFiles != null && !cacheFiles.isEmpty()) {
                File tmpFolder = Files.createTempDirectory("cache").toFile();
                tmpFolder.deleteOnExit();
                for (String file : cacheFiles.split(",")) {
                    String fileName = extractFileName(file.trim());
                    Path src = new Path(extractFileUrl(file.trim()));
                    File tmpFile = new File(tmpFolder, fileName);
                    Path tmpFilePath = new Path(tmpFile.getAbsolutePath());
                    FileSystem fs = tmpFilePath.getFileSystem(jobConf);
                    //TODO: Specify the hdfs path directly to spark and avoid the unnecessary download and upload in SparkLauncher.java
                    fs.copyToLocalFile(src, tmpFilePath);
                    tmpFile.deleteOnExit();
                    LOG.info(String.format("CacheFile:%s", fileName));
                    addResourceToSparkJobWorkingDirectory(tmpFile, fileName,
                            ResourceType.FILE);
                }
            }
        }
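
One possible first step toward the TODO, sketched under stated assumptions rather than taken from the Pig code base: separate the cache file entries that already live on a remote file system (such as HDFS) from local ones, so that only local files would need the temporary copy. The class CacheFileEntry and its methods are hypothetical, and the sketch assumes each entry has the form url#linkName, which is what extractFileUrl and extractFileName appear to split apart. Spark's SparkContext.addFile is documented to accept HDFS paths, so remote entries could in principle be handed over as-is. Whether the file can still be exposed under linkName without a local copy is left open here.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: one parsed cache file entry of the assumed form "url#linkName". */
final class CacheFileEntry {
    final URI source;      // where the file lives, e.g. hdfs://nn:8020/user/pig/lookup.txt
    final String linkName; // name the job expects to see in its working directory

    CacheFileEntry(URI source, String linkName) {
        this.source = source;
        this.linkName = linkName;
    }

    /** True when the source has a non-"file" scheme, i.e. no local download should be needed. */
    boolean isRemote() {
        String scheme = source.getScheme();
        return scheme != null && !scheme.equals("file");
    }

    /** Parses a comma-separated list such as "hdfs://nn/a.txt#a, /tmp/b.txt". */
    static List<CacheFileEntry> parse(String cacheFiles) {
        List<CacheFileEntry> entries = new ArrayList<>();
        if (cacheFiles == null || cacheFiles.trim().isEmpty()) {
            return entries;
        }
        for (String raw : cacheFiles.split(",")) {
            String spec = raw.trim();
            if (spec.isEmpty()) {
                continue;
            }
            int hash = spec.lastIndexOf('#');
            String url = hash >= 0 ? spec.substring(0, hash) : spec;
            // Without an explicit "#linkName", fall back to the last path component.
            String name = hash >= 0
                    ? spec.substring(hash + 1)
                    : url.substring(url.lastIndexOf('/') + 1);
            entries.add(new CacheFileEntry(URI.create(url), name));
        }
        return entries;
    }
}
```

With such a split, cacheFiles could keep the existing copyToLocalFile path for local entries and pass source.toString() of remote entries straight to Spark, which would remove the download and re-upload for files that are already on HDFS.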
      


            People

            • Assignee: Nándor Kollár (nkollar)
            • Reporter: liyunzhang (kellyzly)
            • Votes: 0
            • Watchers: 1
