Pig / PIG-231

validation of files in ship/cache specs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None

      Description

      Currently the code fails only after the MapReduce job starts if the files named in ship/cache specs don't exist.

      We should be able to detect that on the client side.

      For ship, make sure that the file(s) to be shipped exist on the client.
      For cache, make sure that the file(s) exist on the server.
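
The two checks described above could be sketched roughly as follows. This is only an illustration, not the code from the committed patch: the class name `SpecCheck`, both method names, and the `existsOnServer` callback are hypothetical, and the server-side lookup is deliberately left as a caller-supplied predicate rather than tied to any particular file system API.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Pig: collects spec paths that fail validation
// so the client can report them before any job is submitted.
final class SpecCheck {

    // Ship files must exist on the client, so the local file system is checked.
    static List<String> missingShipFiles(List<String> shipPaths) {
        List<String> missing = new ArrayList<>();
        for (String path : shipPaths) {
            if (!Files.exists(Paths.get(path))) {
                missing.add(path);
            }
        }
        return missing;
    }

    // Cache files must exist on the server. How that is looked up is an
    // assumption here, so the check is passed in as a predicate.
    static List<String> missingCacheFiles(List<String> cachePaths,
                                          Predicate<String> existsOnServer) {
        List<String> missing = new ArrayList<>();
        for (String path : cachePaths) {
            if (!existsOnServer.test(path)) {
                missing.add(path);
            }
        }
        return missing;
    }
}
```

A caller would run both checks while parsing the script and raise an error listing the returned paths if either list is non-empty, so the failure surfaces on the client instead of inside the running job.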

        Activity

        Olga Natkovich added a comment -

        Also skippath is not validated.

        Pi Song added a comment -

        I guess Arun is going to implement this.

        Could you please separate this new validation logic into a new class?

        The PigType branch has a place for all such validation logic in the post-parsing stage, so we will have to migrate this to the new structure when we start merging the two branches.

        Arun C Murthy added a comment -

        Patch to validate ship/cache specs and stream.skippath.

        Pi, I've had to split the validation between StreamingCommand and GruntParser and can't reuse any existing class since ValidatingInputSpec was the only option, but ship/cache specs aren't really FileSpecs... I'd be happy to fix this once your branch becomes mainstream. Does that work? Thanks!

        Pi Song added a comment -

        OK.

        Patch also looks good.

        Olga Natkovich added a comment -

        Arun, a couple of questions/comments:

        (1) For skip path, you need to also check that it is a directory. If this is the only change needed, I can make it.
        (2) It was not clear to me why you could not validate everything in the grunt parser. You should be able to check whether a file exists on DFS. We already do that for DFS operations such as ls.

        Olga Natkovich added a comment -

        Ok, I think I figured it out. The patch is fine.

        Olga Natkovich added a comment -

        Changes committed. Thanks, Arun.


          People

          • Assignee: Arun C Murthy
          • Reporter: Olga Natkovich
          • Votes: 0
          • Watchers: 0
