PIG-4483

Pig on Tez output statistics shows storing to same directory twice for union

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      For the script below:

      A = LOAD 'data1';
      B = LOAD 'data2';
      C = UNION A, B;
      STORE C into 'data3';

      the output message is shown as below, because the union uses a vertex group and each vertex stores its output separately:

      Successfully stored 10 records (xxx bytes) in: "data3"
      Successfully stored 20 records (yyy bytes) in: "data3"

      Even though this is correct, it can be confusing for users, who have to sum the counts before comparing them with the Pig on MR output message. OutputStats entries with the same filename should be combined and shown as:

      Successfully stored 30 records (xxx bytes) in: "data3"
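
      The combining step requested above could be sketched as follows. This is a minimal illustration in Python, not the code from the attached patch: the `OutputStat` record, its field names, and the `combine_by_location` and `format_message` helpers are all hypothetical, and the byte counts are made-up values standing in for the xxx and yyy placeholders.

```python
from typing import Dict, List, NamedTuple


class OutputStat(NamedTuple):
    """Hypothetical per-vertex output statistic (not Pig's OutputStats class)."""
    location: str
    records: int
    bytes_written: int


def combine_by_location(stats: List[OutputStat]) -> List[OutputStat]:
    """Merge entries that share a location, summing records and bytes.

    The order of first appearance of each location is preserved.
    """
    combined: Dict[str, OutputStat] = {}
    for stat in stats:
        previous = combined.get(stat.location)
        if previous is None:
            combined[stat.location] = stat
        else:
            combined[stat.location] = OutputStat(
                stat.location,
                previous.records + stat.records,
                previous.bytes_written + stat.bytes_written,
            )
    return list(combined.values())


def format_message(stat: OutputStat) -> str:
    """Render one entry in the style of the messages quoted in this issue."""
    return 'Successfully stored {} records ({} bytes) in: "{}"'.format(
        stat.records, stat.bytes_written, stat.location
    )


if __name__ == "__main__":
    # Two vertices of the union each report a store to the same directory.
    per_vertex = [OutputStat("data3", 10, 100), OutputStat("data3", 20, 250)]
    for merged in combine_by_location(per_vertex):
        print(format_message(merged))
```

      Keying on the output location means stores to different directories still produce separate messages; only entries for the same directory are summed.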

        Attachments

        1. PIG-4483-1.patch
          7 kB
          Rohini Palaniswamy

              People

              • Assignee: Rohini Palaniswamy
              • Reporter: Rohini Palaniswamy