Pig
  1. Pig
  2. PIG-2760

resources added with a relative path are added to the JobXXXX jar file under their absolute path

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11, 0.10.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When registering a local resource using a relative path, the resource is added to the JobXXXX jar under its absolute path.

      If a pig script contains the following:

      REGISTER etc/foo;

      and is executed from a directory /PATH/TO/DIR, the JobXXXX jar file will contain the following:

      /PATH/TO/DIR/etc/foo

      instead of

      etc/foo

      which was the previous behavior

      1. PIG-2760.patch
        0.9 kB
        Mathias Herberts

        Issue Links

          Activity

          Hide
          Daniel Dai added a comment -

          This is fixed along with PIG-2761. Thanks folks!

          Show
          Daniel Dai added a comment - This is fixed along with PIG-2761 . Thanks folks!
          Hide
          Mathias Herberts added a comment -

          Your patch attached to PIG-2761 Looks Good To Me for what concerns PIG-2760.

          Show
          Mathias Herberts added a comment - Your patch attached to PIG-2761 Looks Good To Me for what concerns PIG-2760 .
          Hide
          Rohini Palaniswamy added a comment -

          Mathias,
          It requires one more minor change. If we just do cp.startsWith(cwd), then even if absolute path was specified and if the script is in a subdirectory under current directory, the jar entry only has the relative path instead of the absolute path. Need to do cp.equals(cwd + "/" + patch).

          I was addressing script loading issue in PIG-2761 and had some modifications for the same line of code. So added your fix to it also and tested and also made it part of PIG-2761 patch. Hope you don't mind. If you could, I would appreciate you reviewing PIG-2761.

          Show
          Rohini Palaniswamy added a comment - Mathias, It requires one more minor change. If we just do cp.startsWith(cwd), then even if absolute path was specified and if the script is in a subdirectory under current directory, the jar entry only has the relative path instead of the absolute path. Need to do cp.equals(cwd + "/" + patch). I was addressing script loading issue in PIG-2761 and had some modifications for the same line of code. So added your fix to it also and tested and also made it part of PIG-2761 patch. Hope you don't mind. If you could, I would appreciate you reviewing PIG-2761 .
          Hide
          Mathias Herberts added a comment -

          So this means we should do the following:

          <code>
          String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
          String cp = f.getCanonicalPath();
          String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
          pigContext.addScriptFile(nameInJar,f.getPath());
          se.registerFunctions(nameInJar, namespace, pigContext);
          </code>

          right?

          Show
          Mathias Herberts added a comment - So this means we should do the following: <code> String cwd = new File(System.getProperty("user.dir")).getCanonicalPath(); String cp = f.getCanonicalPath(); String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1); pigContext.addScriptFile(nameInJar,f.getPath()); se.registerFunctions(nameInJar, namespace, pigContext); </code> right?
          Hide
          Cheolsoo Park added a comment -

          Indeed. Good catch!

          Show
          Cheolsoo Park added a comment - Indeed. Good catch!
          Hide
          Rohini Palaniswamy added a comment -

          *This will cause getScriptAsStream() to error out on the backend as f.getPath() minus leading / will not be in the jar.

          Show
          Rohini Palaniswamy added a comment - *This will cause getScriptAsStream() to error out on the backend as f.getPath() minus leading / will not be in the jar.
          Hide
          Rohini Palaniswamy added a comment -

          <code>
          se.registerFunctions(f.getPath(), namespace, pigContext);
          String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
          String cp = f.getCanonicalPath();
          String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
          pigContext.addScriptFile(nameInJar,f.getPath());
          </code>

          The function is still registered with f.getPath() even though nameInJar is going to be relative to current directory. This will cause getScriptAsStream() on the backend as f.getPath() minus leading / will not be in the jar.

          Show
          Rohini Palaniswamy added a comment - <code> se.registerFunctions(f.getPath(), namespace, pigContext); String cwd = new File(System.getProperty("user.dir")).getCanonicalPath(); String cp = f.getCanonicalPath(); String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1); pigContext.addScriptFile(nameInJar,f.getPath()); </code> The function is still registered with f.getPath() even though nameInJar is going to be relative to current directory. This will cause getScriptAsStream() on the backend as f.getPath() minus leading / will not be in the jar.
          Hide
          Mathias Herberts added a comment -

          I guess with an appropriate comment the two code chunks could be collapsed into one yes.

          Show
          Mathias Herberts added a comment - I guess with an appropriate comment the two code chunks could be collapsed into one yes.
          Hide
          Cheolsoo Park added a comment -

          Hi Mathias,

          Agreed. I haven't thought about the use case that you're describing. Thanks for explaining!

          I like your patch because it solves all the cases that I can think of. Just a minor comment. Can't you collapse the following lines of code into a single line?

          String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp;
          // Strip leading path.sep
          if (nameInJar.startsWith("/")) {
              nameInJar = nameInJar.substring(1);
          }
          

          =>

          String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
          

          Given that cp is always going to be an absolute path (as a relative path is converted to an absolute one by fetchfile()), the "if" condition seems redundant to me. Please correct me if I am wrong.

          Thanks!

          Show
          Cheolsoo Park added a comment - Hi Mathias, Agreed. I haven't thought about the use case that you're describing. Thanks for explaining! I like your patch because it solves all the cases that I can think of. Just a minor comment. Can't you collapse the following lines of code into a single line? String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp; // Strip leading path.sep if (nameInJar.startsWith( "/" )) { nameInJar = nameInJar.substring(1); } => String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1); Given that cp is always going to be an absolute path (as a relative path is converted to an absolute one by fetchfile()), the "if" condition seems redundant to me. Please correct me if I am wrong. Thanks!
          Hide
          Mathias Herberts added a comment -

          This patch changes the name used in the job jar to relative paths if the added resource lies under the current working directory.

          It also strips leading '/' as PIG-2745

          Show
          Mathias Herberts added a comment - This patch changes the name used in the job jar to relative paths if the added resource lies under the current working directory. It also strips leading '/' as PIG-2745
          Hide
          Mathias Herberts added a comment -

          Converting a relative path to an absolute one may be an issue when accessing a resource in a UDF using getResourceAsStream

          Previously, if we used 'REGISTER foo/bar;' in a script, you could access 'bar' by calling this.getClass().getClassLoader().getResourceAsStream('foo/bar'); in your UDF, and this would work whatever directory the pig script is run from.

          If converting the relative path to an absolute one (with no leading '/'), the argument to getResourceAsStream will need to be dependent on the directory from which the pig script is run, this kinds of defeat usability.

          Show
          Mathias Herberts added a comment - Converting a relative path to an absolute one may be an issue when accessing a resource in a UDF using getResourceAsStream Previously, if we used 'REGISTER foo/bar;' in a script, you could access 'bar' by calling this.getClass().getClassLoader().getResourceAsStream('foo/bar'); in your UDF, and this would work whatever directory the pig script is run from. If converting the relative path to an absolute one (with no leading '/'), the argument to getResourceAsStream will need to be dependent on the directory from which the pig script is run, this kinds of defeat usability.
          Hide
          Cheolsoo Park added a comment -

          This is a regression of PIG-2623:

          -        File f = new File(path);
          +        File f = FileLocalizer.fetchFile(pigContext.getProperties(), path).file;
          

          where fetchFile() converts a relative path to absolute path.

          In fact, converting a relative path to an absolute path isn't an issue, but the leading "/" makes registered files not found. That is fixed at PIG-2745.

          Thanks!

          Show
          Cheolsoo Park added a comment - This is a regression of PIG-2623 : - File f = new File(path); + File f = FileLocalizer.fetchFile(pigContext.getProperties(), path).file; where fetchFile() converts a relative path to absolute path. In fact, converting a relative path to an absolute path isn't an issue, but the leading "/" makes registered files not found. That is fixed at PIG-2745 . Thanks!

            People

            • Assignee:
              Rohini Palaniswamy
              Reporter:
              Mathias Herberts
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development