Apache Avro / AVRO-3101

Primitive number values are silently truncated in Java GenericDatumWriter


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.0, 1.10.1, 1.10.2
    • Fix Version/s: None
    • Component/s: java
    • Labels: None

    Description

      Primitive Java numeric values are silently truncated by GenericDatumWriter.

      Previously (1.9.2) a Type.LONG field with a double value set would cause a ClassCastException when serializing the datum.

      Since the changes in AVRO-2070, the double value is silently truncated to a long instead.
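
The difference between the two behaviors can be illustrated without Avro. This is a minimal sketch that assumes the LONG write path moved from a direct cast to a conversion through `Number`; `strictLong` and `lenientLong` are hypothetical stand-ins, not Avro's actual code.

```kotlin
// Hypothetical stand-in for the older behavior: a direct cast,
// which fails when the boxed value is not a Long.
fun strictLong(datum: Any): Long = datum as Long

// Hypothetical stand-in for the newer behavior (assumption): any Number
// is accepted and narrowed, dropping the fractional part.
fun lenientLong(datum: Any): Long = (datum as Number).toLong()

fun main() {
    val value: Any = 456.4

    println(lenientLong(value)) // prints 456

    try {
        strictLong(value)
    } catch (e: ClassCastException) {
        println("rejected: ${e.message}")
    }
}
```

The lenient path never signals that 0.4 was discarded, which is the data-integrity concern raised in this issue.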

      I don't know whether this is a bug or expected behavior, since in 1.9.2 (and in much earlier releases) values for Type.INT fields were already silently truncated, while other numeric types were not.

      My use case involves users generating data that conforms to a dynamically generated Avro schema. The current change provides type safety (for downstream consumers) but does not maintain data integrity. From my point of view, it would be better for users to get an explicit error, such as a ClassCastException, than to introduce corrupt data.
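
Until the behavior is settled, one possible workaround is to validate values before they are put into the record. The following is a sketch only: `toLongExact` is a hypothetical helper, not part of Avro, and it throws IllegalArgumentException rather than ClassCastException.

```kotlin
// Hypothetical helper: converts a Number to Long only when no information is lost.
fun toLongExact(value: Number): Long = when (value) {
    is Long -> value
    is Int, is Short, is Byte -> value.toLong()
    is Double, is Float -> {
        val d = value.toDouble()
        // Reject NaN, out-of-range values, and anything with a fractional part.
        val inRange = d >= Long.MIN_VALUE.toDouble() && d < Long.MAX_VALUE.toDouble()
        require(inRange && d == d.toLong().toDouble()) {
            "Value $value cannot be represented as a long without loss"
        }
        d.toLong()
    }
    else -> throw IllegalArgumentException("Unsupported numeric type: ${value::class}")
}

fun main() {
    println(toLongExact(456.0)) // prints 456

    try {
        toLongExact(456.4)
    } catch (e: IllegalArgumentException) {
        println("rejected: ${e.message}")
    }
}
```

Calling such a helper before `record.put(...)` would surface the bad value at the point where the user supplies it, instead of after serialization.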

      Example test case, which throws ClassCastException in 1.9.2 and prints 456 (not the 456.4 that was set) in 1.10.2:

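      import java.io.ByteArrayOutputStream
      import org.apache.avro.Schema
      import org.apache.avro.generic.GenericData
      import org.apache.avro.generic.GenericDatumReader
      import org.apache.avro.generic.GenericDatumWriter
      import org.apache.avro.generic.GenericRecord
      import org.apache.avro.io.DatumReader
      import org.apache.avro.io.DatumWriter
      import org.apache.avro.io.DecoderFactory
      import org.apache.avro.io.EncoderFactory
      // @Test comes from whichever JUnit version the project uses.

      @Test
      fun testWritingDoubleToLong() {
          // Record schema with a single LONG field named "long".
          val longType = Schema.create(Schema.Type.LONG)
          val field = Schema.Field("long", longType)
          val fields = listOf(field)
          val schema = Schema.createRecord("test", "doc", "", false, fields)
          val record: GenericRecord = GenericData.Record(schema)
          // Deliberately store a double in the LONG field.
          record.put("long", 456.4)

          // Serialize the record to binary.
          val stream = ByteArrayOutputStream()
          val datumWriter: DatumWriter<GenericRecord> = GenericDatumWriter(schema)
          val encoder = EncoderFactory.get().binaryEncoder(stream, null)
          datumWriter.write(record, encoder)
          encoder.flush()
          // Read the record back and print the stored value.
          val decoder = DecoderFactory.get().binaryDecoder(stream.toByteArray(), null)
          val datumReader: DatumReader<GenericRecord> = GenericDatumReader(schema)
          val output = datumReader.read(null, decoder)
          println(output["long"])
      }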

       


          People

            Assignee: Unassigned
            Reporter: James Clarke (jclarke)
            Votes: 0
            Watchers: 2
