Pig / PIG-4059 Pig on Spark / PIG-4190

Implement replicated join in Spark engine


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

    Description

      Related e2e tests: Union_7, Union_8, Union_13

      Sample script:
      a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
      b = load '/user/pig/tests/data/singlefile/studentcolon10k' using PigStorage(':') as (name, age, gpa);
      c = union a, b;
      d = load '/user/pig/tests/data/singlefile/votertab10k' as (name, age, registration, contributions);
      e = join c by name, d by name using 'replicated';
      store e into '/user/pig/out/praveenr-1411380943-nightly.conf/Union_7.out';
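
The `using 'replicated'` clause in the sample script asks Pig for a fragment-replicate join: the relation listed last (`d` here) is assumed small enough to hold in memory, so it is copied to every task and probed while the larger relation (`c`) is streamed. The Python sketch below only illustrates that general build-and-probe idea on in-memory tuples. It is not the code in the attached patches, and the function name, the sample rows, and the field positions are hypothetical.

```python
from collections import defaultdict


def replicated_join(large_rows, small_rows, large_key, small_key):
    """Inner-join two lists of tuples on the given field positions.

    Hypothetical helper for illustration; not taken from the PIG-4190 patches.
    """
    # Build phase: index the small ("replicated") relation by its join key.
    index = defaultdict(list)
    for row in small_rows:
        index[row[small_key]].append(row)

    # Probe phase: stream the large relation and emit one output tuple
    # per matching row of the small relation.
    joined = []
    for row in large_rows:
        for match in index.get(row[large_key], []):
            joined.append(row + match)
    return joined


# Made-up rows shaped like the script's schemas:
# c = (name, age, gpa), d = (name, age, registration, contributions)
c = [("alice", 20, 3.5), ("bob", 22, 3.1)]
d = [("alice", 20, "democrat", 100.0)]
e = replicated_join(c, d, large_key=0, small_key=0)
```

In this sketch only `alice` appears in both inputs, so `e` holds a single tuple made of the `c` fields followed by the `d` fields, which mirrors the field order of a Pig join result.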

      Attachments

        1. PIG-4190.1.patch
          19 kB
          Mohit Sabharwal
        2. PIG-4190.2.patch
          19 kB
          Mohit Sabharwal
        3. PIG-4190.patch
          18 kB
          Mohit Sabharwal


          People

            Assignee: Mohit Sabharwal (mohitsabharwal)
            Reporter: Praveen Rachabattuni (praveenr019)
            Votes: 0
            Watchers: 4

