  1. Pig
  2. PIG-4059 Pig on Spark
  3. PIG-4190

Implement replicated join in Spark engine

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

    Description

      Related e2e tests: Union_7, Union_8, Union_13

      Sample script:
      a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
      b = load '/user/pig/tests/data/singlefile/studentcolon10k' using PigStorage(':') as (name, age, gpa);
      c = union a, b;
      d = load '/user/pig/tests/data/singlefile/votertab10k' as (name, age, registration, contributions);
      e = join c by name, d by name using 'replicated';
      store e into '/user/pig/out/praveenr-1411380943-nightly.conf/Union_7.out';
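
      For readers unfamiliar with the join type in the sample script, the sketch below illustrates the general idea behind a replicated (fragment-replicate) join: the relations listed after the first one (here d) are held in memory as a hash table, and the first relation (here c) is streamed past it. This is an illustration in plain Python, not the code from the attached patches; the function name replicated_join and the sample rows are made up for the example.

```python
from collections import defaultdict


def replicated_join(streamed, replicated, streamed_key, replicated_key):
    """Inner-join two lists of tuples, holding `replicated` in memory.

    Hypothetical helper for illustration only; it is not part of Pig or Spark.
    """
    # Build phase: index the small (replicated) relation by its join key.
    table = defaultdict(list)
    for row in replicated:
        key = row[replicated_key]
        if key is not None:  # null keys never match in an inner join
            table[key].append(row)

    # Probe phase: stream the large relation and look up each key.
    for row in streamed:
        key = row[streamed_key]
        if key is None:
            continue
        for match in table.get(key, ()):
            # Output tuple is the streamed row followed by the matching row.
            yield row + match


# Made-up rows shaped like (name, age, gpa) and
# (name, age, registration, contributions).
c = [("alice", 20, 3.5), ("bob", 22, 3.1), (None, 19, 2.0)]
d = [("alice", 20, "democrat", 100.0), ("carol", 30, "green", 50.0)]

e = list(replicated_join(c, d, streamed_key=0, replicated_key=0))
```

      Because the replicated side has to fit in memory, this strategy suits joins where the later relations are small, as the join with d in the sample script appears to assume.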

      Attachments

        1. PIG-4190.1.patch
          19 kB
          Mohit Sabharwal
        2. PIG-4190.2.patch
          19 kB
          Mohit Sabharwal
        3. PIG-4190.patch
          18 kB
          Mohit Sabharwal

            People

              Assignee: Mohit Sabharwal (mohitsabharwal)
              Reporter: Praveen Rachabattuni (praveenr019)
              Votes: 0
              Watchers: 4
