Pig / PIG-2949

JsonLoader only reads arrays of objects

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      OS X Mountain Lion

    • Patch Info:
      Patch Available
    • Hadoop Flags:
      Reviewed

      Description

      I'm trying to load a JSON-encoded vendor file into Pig. One of the fields is an array of strings. The builtin JsonLoader only reads arrays composed of JSON objects:

      {"object_array":[{"element":"value1"},{"element":"value2"}]} works,
      but
      {"string_array":["value1","value2"]} does not.
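Both shapes in the description are valid JSON; they differ only in whether the array elements are objects or bare values. The sketch below makes that distinction explicit using Python's standard `json` module. It is an illustration only, not Pig code: `element_kinds` and the two constants are hypothetical names introduced here.

```python
import json

# The two shapes from the report, written as complete JSON documents.
OBJECT_ARRAY = '{"object_array":[{"element":"value1"},{"element":"value2"}]}'
STRING_ARRAY = '{"string_array":["value1","value2"]}'


def element_kinds(document: str) -> dict:
    """Map each array-valued field to 'objects', 'primitives', 'mixed' or 'empty'."""
    kinds = {}
    for name, value in json.loads(document).items():
        if not isinstance(value, list):
            continue  # only array fields are of interest here
        if not value:
            kinds[name] = "empty"
        elif all(isinstance(item, dict) for item in value):
            kinds[name] = "objects"
        elif not any(isinstance(item, (dict, list)) for item in value):
            kinds[name] = "primitives"
        else:
            kinds[name] = "mixed"
    return kinds


print(element_kinds(OBJECT_ARRAY))
print(element_kinds(STRING_ARRAY))
```

The first document is reported as an array of objects (the case that loads) and the second as an array of primitives (the case this issue is about).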

      1. PIG-2949.patch
        10 kB
        Eyal Allweil
      2. JsonLoader.java
        13 kB
        David LaBarbera

        Activity

        David LaBarbera created issue -
        David LaBarbera added a comment -

        fix for reading json encoded string arrays

        David LaBarbera made changes -
        Field Original Value New Value
        Attachment JsonLoader.java [ 12548028 ]
        David LaBarbera added a comment -

        The schema I tested with was something like
        words:{(word:chararray)}
        siva added a comment -

        Hi,
        Please let me know if anybody has resolved this issue.

        Thanks
        Siva

        Daniel Dai added a comment -

        Does the attached fix work for you? We can commit this patch if it works.

        David LaBarbera added a comment -

        I switched to Elephant Bird's loader, which handles this case and more complex JSON objects.

        David

        Eyal Allweil added a comment -

        David's fix seems to work, but maybe we want to accept arrays of the general form

        {"arrayfield":[fld, fld, fld]}

        where fld is any field type, not just a string? Non-string "primitive" arrays are also valid json.

        Eyal Allweil made changes -
        Assignee Eyal Allweil [ eyal ]
        Eyal Allweil added a comment -

        This is a patch + unit test that is a modification of David's original patch. It will allow loading any array of primitives (any field other than bag, map or tuple) without repeating the object name, as this issue describes.

        For example [ 1,2,3 ] or [ "ab", "cd", "ef"]
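As a rough picture of the behavior described above, the following sketch turns a parsed JSON array into a list of tuples standing in for a Pig bag, so that each primitive becomes a one-field tuple, matching a schema such as words:{(word:chararray)}. It is not the code from the patch: `array_to_bag` is a hypothetical helper, and the handling of objects and nested arrays is an assumption made for illustration.

```python
import json


def array_to_bag(items):
    """Turn a parsed JSON array into a list of tuples, standing in for a Pig bag."""
    bag = []
    for item in items:
        if isinstance(item, dict):
            # Array of objects: one tuple per object, fields in document order.
            bag.append(tuple(item.values()))
        elif isinstance(item, list):
            # Assumption: nested arrays are left out of this sketch.
            raise ValueError("nested arrays are outside this sketch")
        else:
            # Array of primitives: wrap each value in a one-field tuple.
            bag.append((item,))
    return bag


print(array_to_bag(json.loads('[1, 2, 3]')))
print(array_to_bag(json.loads('["ab", "cd", "ef"]')))
print(array_to_bag(json.loads('[{"element": "value1"}, {"element": "value2"}]')))
```

Under this reading, an array of primitives and an array of single-field objects produce bags of the same shape, which is why the same bag schema can describe both.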

        Eyal Allweil made changes -
        Attachment PIG-2949.patch [ 12684077 ]
        Eyal Allweil made changes -
        Patch Info Patch Available [ 10042 ]
        Daniel Dai added a comment -

        Patch committed to trunk. Thanks Eyal!

        Daniel Dai made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Fix Version/s 0.15.0 [ 12328760 ]
        Resolution Fixed [ 1 ]
        Eyal Allweil added a comment -

        Will there be a 0.13.1 or 0.14.1 release? If so, are bug fixes like this and the other json fixes (which are independent of any other changes, probably) candidates for inclusion?

        Daniel Dai added a comment -

        There might be a 0.14.1 release, but I don't think there will be a 0.13.1 release. Usually we only include critical bug fixes in minor releases. New features only go into major releases.

        Transition         Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open → Resolved    822d 6h 41m             1                 Daniel Dai      06/Jan/15 01:52

          People

          • Assignee:
            Eyal Allweil
            Reporter:
            David LaBarbera
          • Votes:
            0
            Watchers:
            3
