  1. DataFu
  2. DATAFU-38

BagGroup merges rows


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Labels:
      None

      Description

      Load the following data

      1,a,A,1
      1,b,A,2
      1,a,B,3
      2,c,C,4
      2,b,B,5
      2,b,C,6
      

      using

      tmp_datafu = load 'test' using PigStorage(',') as (id:chararray, domain:chararray, keyword:chararray, weight:int);

      and then run

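      -- assumed setup, not part of the original report: the DataFu jar is registered
      define CountEach datafu.pig.bags.CountEach();
      define BagGroup datafu.pig.bags.BagGroup();

      tmp_roll = foreach (group tmp_datafu by id) generate
        group as id,
        CountEach(tmp_datafu.domain) as domains,
        BagGroup(tmp_datafu.(keyword,weight),tmp_datafu.keyword) as keywords;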
      

      The result is as follows; the keywords bag for id 2 also contains the A and B entries that belong to id 1:

      (1,{(b,1),(a,2)},{(B,{(B,3)}),(A,{(A,1),(A,2)})})
      (2,{(c,1),(b,2)},{(B,{(B,3),(B,5)}),(A,{(A,1),(A,2)}),(C,{(C,4),(C,6)})})
      

      instead of the expected

      (1,{(b,1),(a,2)},{(B,{(B,3)}),(A,{(A,1),(A,2)})})
      (2,{(c,1),(b,2)},{(B,{(B,5)}),(C,{(C,4),(C,6)})})
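
      The difference between the two outputs can be modelled outside Pig. The sketch below is plain Python, not DataFu code: `bag_group` and `roll_up` are hypothetical helpers, and the `share_state` flag encodes an assumption (grouping state carried over from one input row to the next) that this report does not confirm as the cause. Under that assumption the model reproduces the reported keywords bags, and with fresh state per row it yields the expected ones.

```python
# Sample data from the report: (id, domain, keyword, weight)
ROWS = [
    ("1", "a", "A", 1),
    ("1", "b", "A", 2),
    ("1", "a", "B", 3),
    ("2", "c", "C", 4),
    ("2", "b", "B", 5),
    ("2", "b", "C", 6),
]


def bag_group(pairs, state=None):
    """Group (keyword, weight) pairs by keyword.

    Hypothetical stand-in for the grouping step; `state` lets a caller
    pass in groups left over from a previous call.
    """
    groups = {} if state is None else state
    for keyword, weight in pairs:
        groups.setdefault(keyword, []).append((keyword, weight))
    return groups


def roll_up(rows, share_state=False):
    """Return {id: {keyword: [(keyword, weight), ...]}}.

    share_state=True models the ASSUMED fault: one grouping dict is
    reused for every id instead of starting empty each time.
    """
    by_id = {}
    for id_, _domain, keyword, weight in rows:
        by_id.setdefault(id_, []).append((keyword, weight))

    leaked = {}
    result = {}
    for id_, pairs in by_id.items():
        groups = bag_group(pairs, leaked if share_state else None)
        # Snapshot the groups so later rows cannot alter this row's output.
        result[id_] = {k: list(v) for k, v in groups.items()}
    return result


expected = roll_up(ROWS)
observed = roll_up(ROWS, share_state=True)
```

      In this model `observed["2"]` contains the A group and the (B,3) tuple from id 1, as in the reported output, while `expected["2"]` contains only the B and C groups built from the rows of id 2.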
      

      See also
      http://stackoverflow.com/questions/22945236/how-do-i-accumulate-vectors-into-a-map

        Attachments

        1. BagGroup-38.patch
          3 kB
          Sam Steingold


            People

            • Assignee:
              sds Sam Steingold
              Reporter:
              sds Sam Steingold
            • Votes:
              1
              Watchers:
              5
