Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-4483

Pig on Tez output statistics shows storing to same directory twice for union

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.14.0
    • 0.15.0
    • None
    • None
    • Reviewed

    Description

      For the below script

      A = LOAD 'data1';
      B = LOAD 'data2';
      C = UNION A, B;
      STORE C into 'data3';

      Output message is shown as below due to vertex group and storing from separate vertices.

      Successfully stored 10 records (xxx bytes) in: "data3"
      Successfully stored 20 records (yyy bytes) in: "data3"

      Even though it is correct it can be confusing for users and they have to sum it up before comparing to Pig on MR output message. OutputStats with same filename should be combined and shown as

      Successfully stored 30 records (xxx bytes) in: "data3"

      Attachments

        1. PIG-4483-1.patch
          7 kB
          Rohini Palaniswamy

        Issue Links

          Activity

            People

              rohini Rohini Palaniswamy
              rohini Rohini Palaniswamy
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: