  1. Pig
  2. PIG-2949

JsonLoader only reads arrays of objects

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      OS X Mountain Lion

    • Patch Info:
      Patch Available
    • Hadoop Flags:
      Reviewed

      Description

      I'm trying to load a vendor file that's JSON-encoded into Pig. One of the fields is an array of strings. The builtin JsonLoader only reads arrays composed of JSON objects:

      {"object_array":[{"element":"value1"},{"element":"value2"}]} works
      but
      {"string_array":["value1","value2"]} does not

      1. PIG-2949.patch
        10 kB
        Eyal Allweil
      2. JsonLoader.java
        13 kB
        David LaBarbera

        Activity

        foodneutrino David LaBarbera added a comment -

        Fix for reading JSON-encoded string arrays.

        foodneutrino David LaBarbera added a comment -

        The schema I tested with was something like
        words:{(word:chararray)}
        siva_2680 siva added a comment -

        Hi,
        Please let me know if anybody has resolved this issue.

        Thanks
        Siva

        daijy Daniel Dai added a comment -

        Does the attached fix work for you? We can commit this patch if it works.

        foodneutrino David LaBarbera added a comment -

        I switched to Elephant Bird's loader, which handles this case and more complex JSON objects.

        David


        eyal Eyal Allweil added a comment -

        David's fix seems to work, but maybe we want to accept arrays of the general form

        {"arrayfield":[fld, fld, fld]}

        where fld is any field type, not just a string? Non-string "primitive" arrays are also valid json.

        eyal Eyal Allweil added a comment -

        This is a patch + unit test that is a modification of David's original patch. It will allow loading any array of primitives (any field other than bag, map or tuple) without repeating the object name, as this issue describes.

        For example [ 1,2,3 ] or [ "ab", "cd", "ef"]
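
As a rough illustration of the behavior described above, the sketch below models how a JSON array of primitives could map onto a bag of single-field tuples, matching the schema shape words:{(word:chararray)} mentioned earlier in this thread. It is not the code from PIG-2949.patch or JsonLoader.java: the helper name array_to_bag is hypothetical, and a Pig bag is approximated here by a plain Python list of tuples.

```python
import json


def array_to_bag(json_array):
    """Wrap each primitive element in a one-field tuple.

    Hypothetical helper for illustration only: the returned list of
    tuples stands in for a Pig bag of single-field tuples.
    """
    bag = []
    for element in json_array:
        # Nested objects and arrays (map, tuple, bag) are out of scope,
        # mirroring the "any field other than bag, map or tuple" wording.
        if isinstance(element, (dict, list)):
            raise ValueError("this sketch only handles arrays of primitives")
        bag.append((element,))
    return bag


# The record shape from the issue description.
record = json.loads('{"string_array": ["value1", "value2"]}')
string_bag = array_to_bag(record["string_array"])  # [("value1",), ("value2",)]

# Non-string primitives are valid JSON array elements too.
int_bag = array_to_bag(json.loads("[1, 2, 3]"))  # [(1,), (2,), (3,)]
```

The point of the sketch is only that each array element becomes its own tuple without the element name being repeated in the input, which is the case the original report says the loader did not handle.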

        daijy Daniel Dai added a comment -

        Patch committed to trunk. Thanks Eyal!

        eyal Eyal Allweil added a comment -

        Will there be a 0.13.1 or 0.14.1 release? If so, are bug fixes like this one and the other JSON fixes (which are probably independent of any other changes) candidates for inclusion?

        daijy Daniel Dai added a comment -

        There might be a 0.14.1 release, but I don't think there will be a 0.13.1 release. Usually we only include critical bug fixes in minor releases. New features only go into major releases.

        rohini Rohini Palaniswamy added a comment -

        Sivashankar,
        The Assignee field is for the person who is going to work on the issue and provide the fix, or who has already provided it.

        shravan.padakanti shravan kumar added a comment -

        What if we want to store an array in the same format? When we store using JSONStorage(), I see it stores as col:["col1":"val1", "col2":val2]. How can we store col:["val1","val2"]?


          People

          • Assignee:
            eyal Eyal Allweil
            Reporter:
            foodneutrino David LaBarbera
          • Votes:
            0
            Watchers:
            5
