Pig / PIG-3749

PigPerformance - data in the map gets lost during parsing

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Bug in PigPerformanceLoader: when reading bytes, the loop that looks for a termination character in a map does not account for the null value (ASCII 0).

      Description

      Create a PigMix sample dataset that looks as follows:
      keren 1 2 qt 3 4 5.0 aaaabbbb mccccddddeeeedmffffgggghhhh

      Launch the following query:
      A = load 'page_views_sample.txt' using org.apache.pig.test.pigmix.udf.PigPerformanceLoader()
      as (user, action, timespent, query_term, ip_addr, timestamp, estimated_revenue, page_info, page_links);
      store A into 'L1out_A';

      B = foreach A generate user, (int)action as action, (map[])page_info as page_info, flatten((bag{tuple(map[])})page_links) as page_links;
      store B into 'L1out_B';

      The result looks like this:
      keren 1 b#bbb,a#aaa d#,e#eee,c#ccc
      keren 1 b#bbb,a#aaa [f#fff,g#ggg,h#hhh

      It is missing the 'ddd' value and a closing bracket.

      Thanks,
      Keren
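
The release note attributes the data loss to a scan loop that does not treat the null byte (ASCII 0) as a terminator. The sketch below only illustrates that class of bug, in Python rather than the loader's Java. The delimiter bytes, function names, and sample buffer are assumptions made for this example; they are not taken from PigPerformanceLoader or from the attached patch, and the sketch does not reproduce the exact output shown above.

```python
# Hypothetical delimiter bytes, chosen for this sketch only.
KEY_VALUE_SEP = 0x04  # separates a key from its value
ENTRY_SEP = 0x03      # separates one map entry from the next
NUL = 0x00            # the null byte mentioned in the release note


def scan_to_terminator(buf: bytes, pos: int, terminators: frozenset) -> int:
    """Return the index of the first terminator at or after pos, or len(buf)."""
    while pos < len(buf) and buf[pos] not in terminators:
        pos += 1
    return pos


def parse_map(buf: bytes, value_terminators: frozenset) -> dict:
    """Parse key<KEY_VALUE_SEP>value entries; each value ends at a terminator."""
    result = {}
    pos = 0
    while pos < len(buf):
        key_end = scan_to_terminator(buf, pos, frozenset({KEY_VALUE_SEP}))
        key = buf[pos:key_end].decode("ascii")
        value_start = min(key_end + 1, len(buf))
        value_end = scan_to_terminator(buf, value_start, value_terminators)
        result[key] = buf[value_start:value_end].decode("ascii")
        pos = value_end + 1  # step over the terminator byte
    return result


# Two entries whose boundary is marked by a null byte.
sample = b"a\x04aaa\x00b\x04bbb"

# Scan that ignores the null byte: the first value swallows the second entry.
without_nul = parse_map(sample, frozenset({ENTRY_SEP}))

# Scan that also stops at the null byte: both entries are recovered.
with_nul = parse_map(sample, frozenset({ENTRY_SEP, NUL}))
```

With the null byte included in the terminator set, the sample parses into two separate entries. Without it, the scan runs through to the end of the buffer and the entry boundary is lost.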

      1. PIG-3749.patch
        0.5 kB
        Keren Ouaknine


          People

          • Assignee:
            Keren Ouaknine
            Reporter:
            Keren Ouaknine
          • Votes:
            0
            Watchers:
            4
