Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-4724

GROUP ALL must create an output record in case there is no input

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.16.0
    • None
    • None
    • None

    Description

      A = load 'data';
      
      B = filter A by $0 == 'THIS_DOES_NOT_EXIST';
      
      C = group B ALL;
      
      D = foreach C generate group, COUNT(B);
      

      Even if the filter did not output any rows, since we are grouping on ALL the expected output should probably be (ALL, 0). The implementation generates a pseudo key “all” for every input on map side, thus reduce side we can combine all input together. However, this does not work for 0 input since the reduce side does not get any input. If the input is empty, yield a pseudo “all, 0” to reduce

      Attachments

        Activity

          People

            prkommireddi Prashant Kommireddi
            prkommireddi Prashant Kommireddi
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated: