Apache Avro / AVRO-2950

LocalDateTime-millis and -micros are bound to lead to wrong data

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 1.10.0
    • Fix Version/s: None
    • Component/s: logical types
    • Labels: None

    Description

      I find the recent addition of the LocalDateTime logical types extremely dangerous. It will lead to wrong data for many users without them noticing.

      While I understand the idea and the reasoning behind it, the oversight in my opinion is the difference between Hadoop files and Avro messages: Hadoop is for data storage, Avro is for data exchange. Hadoop runs in a single cluster with a well-defined time zone, so a LocalDateTime does have a meaning there. Avro is used to exchange data between systems. Serializing data on a system in time zone 1 and loading it into a Hadoop cluster located in time zone 2 will lead to wrong data with high likelihood.

      Example: A Kafka Connect producer is running in the US (PST) and Hadoop in the UK (GMT).

      User 1 expectation: In Hadoop the data is a LocalDateTime, meaning GMT. The Java data types Date, java.sql.Timestamp and LocalDateTime are used, none of which carries time zone information. Thus they return correct data if the loaded data has the meaning of UK time. The Kafka producer does not know the time zone of Hadoop.

      User 2 expectation: In Hadoop the data belongs to an office and hence has an implicit time zone, namely that of the office location. In that case a LocalDateTime means the time as seen on the office clock.

      As these two cases cannot be distinguished from each other and people tend to think locally, we are inviting people to produce wrong data.
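
      The ambiguity can be made concrete with plain java.time, without any Avro API. The following is only a minimal sketch: the class name, the helper method and the sample date are made up for illustration, and the zone IDs stand in for the PST and GMT locations of the example. It shows that the same zone-less value denotes two different points in time depending on who interprets it.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class LocalDateTimeAmbiguity {

    // Interprets a zone-less wall-clock value as a point in time in the given zone.
    static Instant interpret(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toInstant();
    }

    public static void main(String[] args) {
        // Hypothetical value written by the producer; the date is arbitrary (winter, so PST and GMT apply).
        LocalDateTime written = LocalDateTime.of(2021, 1, 15, 9, 0);

        ZoneId producerZone = ZoneId.of("America/Los_Angeles"); // assumed producer location
        ZoneId clusterZone = ZoneId.of("Europe/London");        // assumed Hadoop location

        Instant asProducerMeant = interpret(written, producerZone);
        Instant asClusterReads = interpret(written, clusterZone);

        // The two readings of the identical serialized value are 8 hours apart.
        System.out.println(Duration.between(asClusterReads, asProducerMeant)); // PT8H
    }
}
```

      Nothing in the serialized value tells the reader which of the two interpretations was intended.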

       

      A better logical type would have been one for the Java ZonedDateTime. Then the producer and consumer are in sync: the producer writes data in the PST time zone and the consumer can read the data as GMT times. If the consumer wants the local office times, they have to apply the office time zone offset.
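
      A sketch of that zone-aware exchange, again in plain java.time and not an existing Avro logical type: the class, the method names and the Berlin office zone are assumptions for illustration. The producer attaches its own zone before serializing, so the consumer can render the same point in time in any zone it needs.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class ZonedExchange {

    // Producer side: attach the producer's zone and reduce to an unambiguous epoch value.
    static long toEpochMillis(LocalDateTime wallClock, ZoneId producerZone) {
        return wallClock.atZone(producerZone).toInstant().toEpochMilli();
    }

    // Consumer side: render the same point in time in whichever zone the reader needs.
    static ZonedDateTime render(long epochMillis, ZoneId zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(zone);
    }

    public static void main(String[] args) {
        // Hypothetical producer value: 09:00 on the producer's wall clock.
        long wire = toEpochMillis(LocalDateTime.of(2021, 1, 15, 9, 0),
                                  ZoneId.of("America/Los_Angeles"));

        // The cluster reads the value in its own zone: 17:00 London time.
        System.out.println(render(wire, ZoneId.of("Europe/London")).toLocalTime());

        // An office in another zone (hypothetical) applies its own offset: 18:00.
        System.out.println(render(wire, ZoneId.of("Europe/Berlin")).toLocalTime());
    }
}
```

      The design point is that the zone is resolved once, by the side that knows it, instead of being guessed by the reader.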

      LocalDateTime was introduced here: https://issues.apache.org/jira/browse/AVRO-2328

       

      Can you please open the discussion on this item to make sure you are fully aware of the implications and still want to go with it?

       

          People

            Assignee: Unassigned
            Reporter: Werner Daehn (wdaehn)