Pig / PIG-3111

ToAvro to convert any Pig record to an Avro bytearray

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.18.0
    • Component/s: data, internal-udfs
    • Labels: None

    Description

      I want to create a ToAvro() builtin that converts arbitrary Pig fields, including complex types (bags, tuples, maps), to Avro format as bytearrays.

      This would enable storing Avro records in arbitrary data stores, for example HBaseAvroStorage in PIG-2889.

      See PIG-2641 for ToJson
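
To make the intended conversion concrete, here is a minimal sketch of what "a Pig record as an Avro bytearray" could look like at the byte level. It is written in Python for brevity, not as the proposed Java UDF. It follows the Avro binary encoding rules (zigzag varint longs, length-prefixed strings, block-encoded arrays and maps). The type mapping (tuple to record, bag to array of records, Pig map to Avro map) and the names `to_avro`, `ROW_SCHEMA`, and `row` are assumptions made for illustration, not part of this ticket.

```python
import struct


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag-encode, then emit 7-bit groups, low group first."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, then the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


def encode_blocks(items: list) -> bytes:
    """Avro array/map body: one block (count + items), then a zero count."""
    if not items:
        return b"\x00"
    return encode_long(len(items)) + b"".join(items) + b"\x00"


def to_avro(schema, value) -> bytes:
    """Hypothetical ToAvro: encode a Pig-like value against an Avro schema."""
    kind = schema if isinstance(schema, str) else schema["type"]
    if kind == "long":
        return encode_long(value)
    if kind == "string":
        return encode_string(value)
    if kind == "double":
        return struct.pack("<d", value)  # 8 bytes, little-endian
    if kind == "record":  # Pig tuple: fields concatenated in schema order
        return b"".join(
            to_avro(f["type"], v) for f, v in zip(schema["fields"], value)
        )
    if kind == "array":  # Pig bag: array of records
        return encode_blocks([to_avro(schema["items"], v) for v in value])
    if kind == "map":  # Pig map: string keys, encoded values
        return encode_blocks(
            [encode_string(k) + to_avro(schema["values"], v)
             for k, v in value.items()]
        )
    raise ValueError("unsupported type in this sketch: %r" % (kind,))


# Assumed example schema for (id:long, name:chararray, visits:bag{(url)}, props:map[long])
VISIT = {"type": "record", "name": "Visit",
         "fields": [{"name": "url", "type": "string"}]}
ROW_SCHEMA = {"type": "record", "name": "Row", "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "visits", "type": {"type": "array", "items": VISIT}},
    {"name": "props", "type": {"type": "map", "values": "long"}},
]}

row = (1, "al", [("a",), ("b",)], {"k": 2})
encoded = to_avro(ROW_SCHEMA, row)
```

The result is a schema-less byte string, so whatever reads the stored value would need the writer schema from elsewhere. A real implementation would more likely delegate to the Avro Java library than hand-roll the encoding.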

      This points to a greater need for customizable, pluggable serialization that plugs into StoreFuncs and handles serialization independently of storage. For example, we might do these operations:

      a = load 'my_data' as (some_schema);
      b = foreach a generate ToJson;
      c = foreach a generate ToAvro;
      store b into 'hbase://JsonValueTable' using HBaseStorage(...);
      store c into 'hbase://AvroValueTable' using HBaseStorage(...);

      I'll make a ticket for pluggable serialization separately.
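
The separation described above, where the store knows nothing about the value format, might be sketched as follows. This is a toy Python model rather than Pig's StoreFunc API: `InMemoryStore` is a hypothetical stand-in for something like HBaseStorage, and `to_json` stands in for any pluggable serializer.

```python
import json
from typing import Callable, Dict

# A serializer is any function from a record (tuple) to bytes.
Serializer = Callable[[tuple], bytes]


def to_json(record: tuple) -> bytes:
    """Example serializer: render the record as a compact JSON array."""
    return json.dumps(list(record), separators=(",", ":")).encode("utf-8")


class InMemoryStore:
    """Hypothetical store: writes whatever bytes the serializer produces."""

    def __init__(self, serializer: Serializer) -> None:
        self.serializer = serializer
        self.rows: Dict[str, bytes] = {}

    def put(self, key: str, record: tuple) -> None:
        # The store never inspects the record; the format is fully pluggable.
        self.rows[key] = self.serializer(record)


json_table = InMemoryStore(to_json)
json_table.put("row1", (1, "a"))
```

Swapping in an Avro serializer would then require no change to the store, which is the point of splitting the two concerns.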

            People

              Assignee: Russell Jurney
              Reporter: Russell Jurney
              Votes: 0
              Watchers: 2
