Description
The data file writer in the C library can produce corrupt blocks. The logic in datafile.c is that we have an in-memory avro_writer_t instance backed by a fixed-size buffer. When you append records to the data file, they go into this memory buffer. If we get an error serializing into the memory buffer, it's presumably because we've filled it, so we write out the memory buffer's contents as a new block in the file, clear the buffer, and try again.
The problem is that the failed serialization into the memory buffer isn't atomic; some of the record's bytes will already have made it into the buffer before we discover that there's not enough room. That incomplete record is then written to the file as part of the block, followed by the full record again once the retry succeeds.
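One possible way to avoid this, sketched below, is to remember the buffer's fill level before serializing and roll back to it if serialization fails, so only whole records are ever flushed. This is an illustration only: block_buffer, buffer_put, serialize_record, flush_block and append_record are hypothetical stand-ins, not the avro_writer_t API or the actual code in datafile.c, and the 16-byte buffer is chosen just to make the overflow easy to trigger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the fixed-size in-memory writer.
 * This is NOT the avro_writer_t API. */
typedef struct {
    char   data[16];  /* fixed-size block buffer */
    size_t len;       /* bytes currently buffered */
    size_t flushed;   /* total bytes "written to the file" so far */
    int    blocks;    /* number of blocks flushed */
} block_buffer;

/* Append n bytes, or fail without writing anything if they don't fit. */
static int buffer_put(block_buffer *b, const void *src, size_t n)
{
    assert(b->len <= sizeof(b->data));
    if (n > sizeof(b->data) - b->len)
        return -1;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* A record is serialized field by field, so it can fail part-way
 * through, leaving the first field in the buffer. */
static int serialize_record(block_buffer *b, const char *key, const char *value)
{
    if (buffer_put(b, key, strlen(key)) != 0)
        return -1;
    return buffer_put(b, value, strlen(value));
}

/* Pretend to write the buffered bytes out as one block, then clear. */
static void flush_block(block_buffer *b)
{
    if (b->len == 0)
        return;
    b->flushed += b->len;
    b->blocks++;
    b->len = 0;
}

/* Append a record so that a failed attempt leaves no partial bytes. */
static int append_record(block_buffer *b, const char *key, const char *value)
{
    size_t mark = b->len;              /* fill level before this record */

    if (serialize_record(b, key, value) == 0)
        return 0;

    b->len = mark;                     /* discard the partial record */
    flush_block(b);                    /* flush only complete records */

    if (serialize_record(b, key, value) == 0)
        return 0;

    b->len = 0;                        /* record is larger than the buffer */
    return -1;
}
```

Without the rollback to mark, the flush after a failed attempt would include the key of the record that did not fit, which is the corruption described above. The final failure case (a single record larger than the whole buffer) is reported to the caller here; how the real writer should handle it is a separate question.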