Pig / PIG-1271

Provide a more flexible data format to load complex field (bag/tuple/map) in PigStorage

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      With PIG-613, we can load text files containing complex data types (map/bag/tuple) according to a schema. However, the format of a complex data field is very strict: users have to use predetermined special characters to mark the beginning and end of each field, and those special characters cannot appear in the content. The goals of this issue are:

      1. Provide a way for users to escape special characters
      2. Make it easy for users to customize Utf8StorageConverter when they have their own data format
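
      The first goal can be illustrated with a minimal sketch. It assumes a backslash as the escape character and handles only a flat tuple written as (f1,f2,...); the escape character and the helper names escape, format_tuple and split_tuple are hypothetical and are not part of Pig or Utf8StorageConverter.

```python
from typing import List

# Hypothetical sketch, not Pig's implementation: backslash escaping for the
# delimiter characters used by tuple, bag and map fields in text files.
ESCAPE = "\\"
SPECIAL = set("(),{}[]#") | {ESCAPE}


def escape(value: str) -> str:
    """Prefix every special character in a field value with the escape character."""
    return "".join(ESCAPE + ch if ch in SPECIAL else ch for ch in value)


def format_tuple(fields: List[str]) -> str:
    """Write a flat tuple as '(f1,f2,...)' with its field contents escaped."""
    return "(" + ",".join(escape(f) for f in fields) + ")"


def split_tuple(text: str) -> List[str]:
    """Split a flat tuple on unescaped commas and unescape each field."""
    if len(text) < 2 or text[0] != "(" or text[-1] != ")":
        raise ValueError("expected a tuple wrapped in parentheses")
    body = text[1:-1]
    fields: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == ESCAPE:
            # The character after the escape is taken literally.
            if i + 1 >= len(body):
                raise ValueError("dangling escape character")
            current.append(body[i + 1])
            i += 2
        elif ch == ",":
            # An unescaped comma ends the current field.
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields
```

      With this scheme, a value such as "Smith, John" is written as Smith\, John inside the tuple, so the comma is read as content rather than as a field separator. Nested tuples, bags and maps would need a real parser and are left out of the sketch.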

      This is a candidate project for Google Summer of Code 2013. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2013

        Activity

        Daniel Dai created issue -
        Daniel Dai made changes -
        Field Original Value New Value
        Fix Version/s 0.7.0 [ 12314397 ]
        Daniel Dai made changes -
        Fix Version/s 0.8.0 [ 12314562 ]
        Fix Version/s 0.7.0 [ 12314397 ]
        Olga Natkovich made changes -
        Fix Version/s 0.8.0 [ 12314562 ]
        Daniel Dai made changes -
        Labels gsoc2012
        Daniel Dai made changes -
        Description: appended the sentence "This is a candidate project for Google summer of code 2012. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2012"; the rest of the description is unchanged.
        Daniel Dai made changes -
        Labels gsoc2012 gsoc2013
        Daniel Dai made changes -
        Description: changed the Google summer of code reference from 2012 (https://cwiki.apache.org/confluence/display/PIG/GSoc2012) to 2013 (https://cwiki.apache.org/confluence/display/PIG/GSoc2013); the rest of the description is unchanged.

          People

          • Assignee:
            Daniel Dai
            Reporter:
            Daniel Dai
          • Votes:
            3
            Watchers:
            3

            Dates

            • Created:
              Updated:
