Pig / PIG-3111

ToAvro to convert any Pig record to an Avro bytearray

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.15.0
    • Component/s: data, internal-udfs
    • Labels:
      None

      Description

      I want to create a ToAvro() builtin that converts arbitrary Pig fields, including complex types (bags, tuples, maps), to Avro format as bytearrays.

      This would enable storing Avro records in arbitrary data stores, for example via HBaseAvroStorage in PIG-2889.

      See PIG-2641 for ToJson
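
      To make the intended conversion concrete, here is a minimal sketch of Avro binary encoding for nested values. It is not the proposed implementation: a real builtin would presumably be a Java UDF built on the Avro library, with a schema derived from the Pig schema. The function names (encode_long, encode_pig_value) are hypothetical, and the type mapping (tuple to record, bag to array, map to map) is an assumption. Nullable fields, which would need Avro unions, are left out.

```python
import struct


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag-encode, then emit 7 bits per byte, low bits first."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_pig_value(value) -> bytes:
    """Encode a Python stand-in for a Pig value as Avro binary (hypothetical helper)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return b"\x01" if value else b"\x00"
    if isinstance(value, int):
        return encode_long(value)
    if isinstance(value, float):  # Avro double: 8 bytes, little-endian
        return struct.pack("<d", value)
    if isinstance(value, str):  # Avro string: byte length, then UTF-8 bytes
        raw = value.encode("utf-8")
        return encode_long(len(raw)) + raw
    if isinstance(value, bytes):  # Avro bytes: length, then the raw bytes
        return encode_long(len(value)) + value
    if isinstance(value, tuple):  # assumed mapping: tuple -> record (fields in order)
        return b"".join(encode_pig_value(f) for f in value)
    if isinstance(value, list):  # assumed mapping: bag -> array (one block, then 0)
        head = encode_long(len(value)) if value else b""
        body = b"".join(encode_pig_value(item) for item in value)
        return head + body + encode_long(0)
    if isinstance(value, dict):  # assumed mapping: map -> map with string keys
        out = encode_long(len(value)) if value else b""
        for key, item in value.items():
            out += encode_pig_value(str(key)) + encode_pig_value(item)
        return out + encode_long(0)
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

      For example, encode_pig_value((27, "foo")).hex() yields 3606666f6f, which matches the record example in the Avro specification.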

      This points to a greater need for customizable, pluggable serialization that plugs into storefuncs, so that serialization is handled independently of storage. For example, we might do these operations:

      a = load 'my_data' as (some_schema);
      -- ToJson and ToAvro are the proposed builtins; the (*) call form is an assumption
      b = foreach a generate ToJson(*);
      c = foreach a generate ToAvro(*);
      store b into 'hbase://JsonValueTable' using HBaseStorage(...);
      store c into 'hbase://AvroValueTable' using HBaseStorage(...);

      I'll make a ticket for pluggable serialization separately.
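
      As a rough illustration of what pluggable serialization could look like, the sketch below keeps serializers in a registry and lets a stand-in store function pick one by name. Everything here is hypothetical (SERIALIZERS, register, store, the "json" and "tsv" formats); it is not Pig's StoreFunc API, and a plain dict stands in for the data store.

```python
import json
from typing import Callable, Dict, Iterable, Tuple

Serializer = Callable[[tuple], bytes]

# Hypothetical registry: format name -> function turning a record into bytes.
SERIALIZERS: Dict[str, Serializer] = {}


def register(name: str) -> Callable[[Serializer], Serializer]:
    """Decorator that adds a serializer to the registry under the given name."""
    def wrap(fn: Serializer) -> Serializer:
        SERIALIZERS[name] = fn
        return fn
    return wrap


@register("json")
def to_json(record: tuple) -> bytes:
    # Serialize the record's fields as a JSON array.
    return json.dumps(list(record)).encode("utf-8")


@register("tsv")
def to_tsv(record: tuple) -> bytes:
    # Serialize the record's fields as tab-separated text.
    return "\t".join(str(field) for field in record).encode("utf-8")


def store(rows: Iterable[Tuple[str, tuple]], table: Dict[str, bytes], fmt: str) -> None:
    """Stand-in for a storefunc: writes each (row key, record) with the chosen serializer."""
    serialize = SERIALIZERS[fmt]
    for key, record in rows:
        table[key] = serialize(record)
```

      The storage side never needs to know which format it is writing; adding an Avro serializer would be one more registry entry.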

          Activity

          Russell Jurney added a comment -

          Waiting on commit of the avro rewrite.

          Russell Jurney added a comment -

          Waiting for PIG-3015 to be completed.


            People

            • Assignee: Russell Jurney
            • Reporter: Russell Jurney
            • Votes: 0
            • Watchers: 2