Avro / AVRO-672

Convert JSON Text Input to Avro Tool

    Details

    • Type: New Feature
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: java
    • Labels: None

      Description

      The attached patch adds a tool that reads a JSON-formatted text file and converts it to a conforming Avro text file, emitting one record per line. For example, it can read this input file:

      {"intval":12}

      {"intval":-73,"strval":"hello, there!!"}

      with this schema:
      {"type":"record", "name":"TestRecord", "fields": [
        {"name":"intval","type":"int"},
        {"name":"strval","type":["string", "null"]}
      ]}

      returning valid Avro. This differs from the DataFileWriteTool, which would instead expect its input in the following internal encoding:

      {"intval":12,"strval":null}

      {"intval":-73,"strval":{"string":"hello, there!!"}}
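The gap between the two forms can be illustrated with a minimal Python sketch. This is not the code in the attached patch; it only assumes that a field missing from the input is treated as null when the union allows it, and that union values are wrapped with their branch name as in the internal encoding shown above. The helper names `branch_name` and `to_avro_json` are hypothetical, and only primitive types are handled.

```python
import json

# Schema copied from the issue description.
SCHEMA = {
    "type": "record",
    "name": "TestRecord",
    "fields": [
        {"name": "intval", "type": "int"},
        {"name": "strval", "type": ["string", "null"]},
    ],
}


def branch_name(value):
    """Guess the Avro primitive type name for a plain JSON value (simplified)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise ValueError("unsupported value: %r" % (value,))


def to_avro_json(record, schema):
    """Rewrite a plain JSON record into Avro's JSON encoding for a flat record schema."""
    out = {}
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        if isinstance(ftype, list):
            # Union field. Assumption: a missing key is treated as null.
            value = record.get(name)
            branch = branch_name(value)
            if branch not in ftype:
                raise ValueError("field %r: %s is not in union %r" % (name, branch, ftype))
            # Null stays a bare null; any other branch is wrapped as {"type": value}.
            out[name] = None if value is None else {branch: value}
        else:
            if name not in record:
                raise ValueError("missing required field %r" % name)
            out[name] = record[name]
    return out


if __name__ == "__main__":
    for line in ['{"intval":12}', '{"intval":-73,"strval":"hello, there!!"}']:
        print(json.dumps(to_avro_json(json.loads(line), SCHEMA), separators=(",", ":")))
```

Run on the two input records from the description, this sketch prints the two internally encoded lines shown above.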

      In general, Avro's internal encodings are not a natural fit for JSON text that appears in the wild. The utility can also replace characters that are invalid in Avro identifiers with an underscore, again to tolerate JSON that was not designed to be read by Avro.
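The identifier clean-up can be sketched in Python as follows. This is an illustration rather than the patch's implementation: the function names are hypothetical, the rule that Avro names start with a letter or underscore and continue with letters, digits, or underscores comes from the Avro specification, and prefixing an underscore when the first character is a digit is an assumption, since the description only mentions replacing invalid characters.

```python
import re

# Characters allowed anywhere in an Avro name, per the Avro specification.
_INVALID_CHAR = re.compile(r"[^A-Za-z0-9_]")
# Characters allowed as the first character of an Avro name.
_VALID_START = re.compile(r"[A-Za-z_]")


def sanitize_name(name):
    """Turn an arbitrary JSON key into a valid Avro identifier."""
    cleaned = _INVALID_CHAR.sub("_", name)
    # Assumption: an empty name or a leading digit gets an underscore prefix.
    if not cleaned or not _VALID_START.match(cleaned):
        cleaned = "_" + cleaned
    return cleaned


def sanitize_keys(record):
    """Apply sanitize_name to the top-level keys of a JSON object."""
    return {sanitize_name(key): value for key, value in record.items()}


if __name__ == "__main__":
    print(sanitize_keys({"user-id": 7, "2nd.value": "x"}))
```

Two distinct keys can collapse to the same identifier under this scheme (for example `a-b` and `a.b`), so a real tool would need to decide how to handle such collisions.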

      1. AVRO-672.patch
        18 kB
        Ron Bodkin
      2. AVRO-672.patch
        9 kB
        Doug Cutting

        Activity

        Ron Bodkin created issue -
        Ron Bodkin made changes -
        Field Original Value New Value
        Attachment AVRO-672.patch [ 12455097 ]
        Ron Bodkin made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Doug Cutting made changes -
        Attachment AVRO-672.patch [ 12456440 ]
        Doug Cutting made changes -
        Component/s java [ 12312780 ]

          People

          • Assignee: Unassigned
          • Reporter: Ron Bodkin
          • Votes: 1
          • Watchers: 3
