  Pig / PIG-770

Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 0.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      I load the following two relations:

      urlContents = LOAD '$input' USING BinStorage() AS (url:chararray, pg:bytearray);
      siteUrls = LOAD '$siteUrls' AS (site:chararray, score:double, expanded_site:chararray, url:bytearray);

      Then the following statement:

      urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
      FLATTEN(siteUrls.(site, expanded_site));

      works as expected.
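
      The relation a used in these statements is not defined in this report. A minimal sketch, assuming (this is a guess, not something the report states) that it was produced by co-grouping the two loaded relations on url:

      ```pig
      -- Assumption: a is a COGROUP of the two loaded relations on their url fields.
      -- Each output tuple then holds (group, urlContents: bag, siteUrls: bag).
      a = COGROUP urlContents BY url, siteUrls BY url;
      ```

      Under that assumption, each FOREACH statement in this report flattens the urlContents bag together with a projection of the siteUrls bag for every url group.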

      However, each of the following variants fails with an error message that does not make sense to me:

      urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
      FLATTEN(siteUrls.site);

      2009-04-17 23:18:02,064 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray}

      urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
      FLATTEN(siteUrls.(site));

      2009-04-17 23:19:27,669 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray}

      urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
      FLATTEN(siteUrls.(site,expanded_site)) AS (site:chararray,expanded_site:chararray);

      2009-04-17 23:23:33,483 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray,expanded_site: chararray}

      Even if I am using AS incorrectly with FLATTEN, either all of the statements above should parse or none of them should, so either way this is a parsing bug.

      Note that the Pig Latin spec page has no formal description of the FLATTEN operation and no example where it is used with GENERATE, AS, and a bag of more than one tuple, so I cannot tell whether the syntax above is supported and can only guess. Should I file a separate ticket for that?

          People

            Assignee: Unassigned
            Reporter: George Mavromatis (gmavromatis)