Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-3962

Unclear error messages for map type

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • None

    Description

      Here is a correct script:

      user_hashes = LOAD 'users.tsv' AS (info:map[], fullref:chararray, shortref:chararray);
      -- [userid#123,username#jerry,followers#192]	name	username
      -- [userid#987,username#george,followers#31]	id	userid
      -- [userid#568,username#elaine,followers#40]	name	followers
      
      users = FOREACH user_hashes GENERATE
        info#'userid'   AS userid:chararray,
        info#'username' AS username:chararray;
      
      DUMP users;
      -- (123,jerry)
      -- (987,george)
      -- (568,elaine)
      

      Omitting the quotes on the key dereference gives a very unhelpful error message.

      users = FOREACH user_hashes GENERATE info#userid AS userid:chararray;
      
      -- 400   ERROR: ERROR 1200: <file ./foo.pig, line 8, column 42>  [...] mismatched input 'userid' expecting set null
      

      It may be that the user forgot the quotes, or may instead be assuming that Pig allows dereferencing a map by the value of an alias or expression:

      users = FOREACH user_hashes GENERATE
        info#'username',               -- works
        info#username,                 -- need quotes around literal
        info#fullref,                  -- no, can't use an alias' value to deref
        info#(CONCAT('user',shortref)) -- and can't use an expression to deref
        ;
      

      The error would be better off reading

      Values may only be retrieved from a map by using a literal chararray key. Did you mean << info#'userid' >>? See http://pig.apache.org/docs/r0.12.0/basic.html#map

      ----------------------

      Forgetting to attach the dummy square brackets on a map schema gives another confusing error message:

      user_hashes = LOAD 'users.tsv' AS (id:chararray, profile:map);
      
      -- 390   ERROR: ERROR 1200: <file ./foo.pig, line 1, column 60> [...] mismatched input ')' expecting LEFT_BRACKET
      

      This message should read something like

      A map type must supply (within square brackets) the schema for its values; or supply empty square brackets if it holds generic values. Did you mean << profile:map[chararray] >> or << profile:map[] >>? See http://pig.apache.org/docs/r0.12.0/basic.html#map-schema

      Attachments

        Activity

          People

            Unassigned Unassigned
            mrflip Flip Kromer
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: