Apache Arrow / ARROW-5845

[Java] Implement converter between Arrow record batches and Avro records

    Details

    • Type: New Feature
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: Java
    • Labels:
      None

      Description

      This would be useful for applications that need to convert Avro data to Arrow data.

      This is an adapter that converts data through an existing API (like the JDBC adapter) rather than a native reader (like ORC).

      We implement this on top of the Avro Java project: the adapter receives Avro parameters such as a Decoder, Schema, or DatumReader and returns a VectorSchemaRoot. For each data type there is a consumer class, as shown below, that reads Avro data and writes it directly into a vector, which avoids boxing/unboxing (e.g. GenericRecord#get returns Object).

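      import java.io.IOException;
      import org.apache.arrow.vector.IntVector;
      import org.apache.arrow.vector.complex.impl.IntWriterImpl;
      import org.apache.arrow.vector.complex.writer.IntWriter;
      import org.apache.avro.io.Decoder;

      // Consumes Avro int values and writes them into an Arrow IntVector.
      // Consumer is the interface proposed in this issue.
      public class AvroIntConsumer implements Consumer {
        private final IntWriter writer;

        public AvroIntConsumer(IntVector vector) {
          this.writer = new IntWriterImpl(vector);
        }

        @Override
        public void consume(Decoder decoder) throws IOException {
          // Read the primitive straight from the decoder; no boxed Object is created.
          writer.writeInt(decoder.readInt());
          writer.setPosition(writer.getPosition() + 1);
        }
      }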
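To make the consumer idea concrete without pulling in Arrow or Avro, here is a minimal, dependency-free sketch. It is not the adapter's implementation: `ByteDecoder` and `IntColumn` are hypothetical stand-ins for Avro's Decoder and Arrow's IntVector, and the only fact borrowed from Avro is that binary ints are encoded as zig-zag varints. The point it illustrates is that each value moves from the encoded stream into a primitive buffer without ever becoming an `Object`.

```java
import java.util.Arrays;

final class ConsumerSketch {

    /** Hypothetical stand-in for Avro's Decoder: reads zig-zag varint ints from a byte array. */
    static final class ByteDecoder {
        private final byte[] buf;
        private int pos;

        ByteDecoder(byte[] buf) { this.buf = buf; }

        boolean hasRemaining() { return pos < buf.length; }

        int readInt() {
            int raw = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xff;        // next byte, unsigned
                raw |= (b & 0x7f) << shift;   // low 7 bits carry payload
                shift += 7;
            } while ((b & 0x80) != 0);        // high bit set means more bytes follow
            return (raw >>> 1) ^ -(raw & 1);  // undo zig-zag encoding
        }
    }

    /** Hypothetical stand-in for an Arrow IntVector: a growable primitive int buffer. */
    static final class IntColumn {
        private int[] values = new int[4];
        private int count;

        void append(int v) {
            if (count == values.length) {
                values = Arrays.copyOf(values, count * 2);
            }
            values[count++] = v;
        }

        int[] toArray() { return Arrays.copyOf(values, count); }
    }

    /** One consumer per data type; each call moves exactly one value. */
    interface Consumer {
        void consume(ByteDecoder decoder);
    }

    static final class IntConsumer implements Consumer {
        private final IntColumn column;

        IntConsumer(IntColumn column) { this.column = column; }

        @Override
        public void consume(ByteDecoder decoder) {
            column.append(decoder.readInt()); // primitive in, primitive out
        }
    }

    /** Drains the decoder through a single int consumer and returns the column contents. */
    static int[] decodeAll(byte[] data) {
        ByteDecoder decoder = new ByteDecoder(data);
        IntColumn column = new IntColumn();
        Consumer consumer = new IntConsumer(column);
        while (decoder.hasRemaining()) {
            consumer.consume(decoder);
        }
        return column.toArray();
    }
}
```

For a record with several fields, the same pattern would presumably use one consumer per field, invoked in schema order for every record. The sketch does not validate truncated input.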

      We intend to support primitive and complex types (null values are represented via a union with the null type). A size limit and field selection would be optional for users.
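As a rough illustration of the nullable case, the sketch below decodes an Avro union of null and int into a primitive value buffer plus a validity flag per slot, loosely mirroring how Arrow tracks nulls. It relies only on the Avro binary rule that a union is written as a zig-zag varint branch index followed by the value of that branch. Everything else is an assumption made for illustration: `ByteDecoder`, `NullableIntColumn`, and `decode` are hypothetical names, and the "size limit" is interpreted here as a cap on the number of rows read (`maxRows`), which may differ from what the adapter ends up offering. Field selection is not shown.

```java
final class NullableUnionSketch {

    /** Hypothetical stand-in for Avro's Decoder: reads zig-zag varint ints from a byte array. */
    static final class ByteDecoder {
        private final byte[] buf;
        private int pos;

        ByteDecoder(byte[] buf) { this.buf = buf; }

        boolean hasRemaining() { return pos < buf.length; }

        int readInt() {
            int raw = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xff;
                raw |= (b & 0x7f) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return (raw >>> 1) ^ -(raw & 1);
        }
    }

    /** Hypothetical column: primitive values plus one validity flag per slot. */
    static final class NullableIntColumn {
        final int[] values;
        final boolean[] valid;
        int count;

        NullableIntColumn(int capacity) {
            values = new int[capacity];
            valid = new boolean[capacity];
        }

        void appendNull() { valid[count++] = false; }

        void append(int v) {
            values[count] = v;
            valid[count++] = true;
        }
    }

    /**
     * Decodes at most maxRows entries of a two-branch union of null and int.
     * nullIndex is the position of the null branch in the union schema (0 or 1).
     */
    static NullableIntColumn decode(byte[] data, int nullIndex, int maxRows) {
        ByteDecoder decoder = new ByteDecoder(data);
        NullableIntColumn column = new NullableIntColumn(maxRows);
        while (decoder.hasRemaining() && column.count < maxRows) {
            int branch = decoder.readInt();       // which union branch follows
            if (branch == nullIndex) {
                column.appendNull();              // null has no payload bytes
            } else {
                column.append(decoder.readInt()); // the int branch
            }
        }
        return column;
    }
}
```

Passing the null branch position as a parameter reflects that a union schema may list null first or second; a real adapter would read that position from the Avro Schema rather than take it as an argument.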

            People

            • Assignee:
              tianchen92 Ji Liu
            • Reporter:
              tianchen92 Ji Liu
            • Votes:
              0
            • Watchers:
              2

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 61h 40m