  1. Hive
  2. HIVE-6370

LazySimpleSerDe doesn't handle Date and Timestamp properly

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0
    • None
    • None

    Description

      LazySimpleSerDe#serialize() calls LazyUtils.writePrimitiveUTF8() to handle primitive types.
      When writing out a java.sql.Date, this in turn calls LazyDate.writeUTF8(), which calls DateWritable.toString(), which is effectively Date.toString().
      Date.toString() implicitly adjusts for the local timezone in its output. Thus, if Date.getTime() is on a day boundary (midnight UTC) and the local timezone is behind UTC, toString() will write out the previous day. Date.valueOf(), which this SerDe uses to read data, makes a similar adjustment for the current timezone.
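
The behavior described above can be reproduced with plain JDK classes, without Hive. The following is a minimal sketch; the class name DateTimezoneDemo and the helpers renderIn and parseIn are hypothetical, and America/Los_Angeles is only an example of a zone behind UTC.

```java
import java.sql.Date;
import java.util.TimeZone;

public class DateTimezoneDemo {

    /** Returns java.sql.Date.toString() for epochMillis with zoneId as the JVM default zone. */
    static String renderIn(String zoneId, long epochMillis) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Date(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved); // restore so the demo has no lasting side effect
        }
    }

    /** Returns Date.valueOf(text).getTime() with zoneId as the JVM default zone. */
    static long parseIn(String zoneId, String text) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return Date.valueOf(text).getTime();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        // Epoch millis 0 is midnight UTC on 1970-01-01.
        System.out.println(renderIn("UTC", 0L));                          // 1970-01-01
        System.out.println(renderIn("America/Los_Angeles", 0L));          // 1969-12-31
        // The same text parses to different instants depending on the zone.
        System.out.println(parseIn("UTC", "1970-01-01"));                 // 0
        System.out.println(parseIn("America/Los_Angeles", "1970-01-01")); // 28800000
    }
}
```

The same stored instant is written as a different calendar day, and the same text is read back as a different instant, depending only on the default timezone of the JVM.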

      This is wrong: it should write out Date.getTime() (possibly normalized to a day boundary). That would make reads and writes independent of the current timezone.
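
One way to read this proposal is sketched below. It is not the Hive implementation or an agreed fix; the class EpochMillisDateCodec and its methods are hypothetical names, and the sketch assumes the serialized form is the decimal value of getTime() rounded down to midnight UTC.

```java
import java.sql.Date;

public class EpochMillisDateCodec {

    static final long MILLIS_PER_DAY = 86_400_000L;

    /** Rounds epoch millis down to the preceding midnight UTC (floorDiv handles pre-1970 values). */
    static long normalizeToDay(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    /** Serializes using only getTime(), so the default timezone is never consulted. */
    static String write(Date date) {
        return Long.toString(normalizeToDay(date.getTime()));
    }

    /** Deserializes the value produced by write(); again no timezone is involved. */
    static Date read(String text) {
        return new Date(Long.parseLong(text));
    }

    public static void main(String[] args) {
        System.out.println(write(new Date(0L)));           // 0
        System.out.println(write(new Date(86_399_999L)));  // 0 (last millisecond of the same UTC day)
        System.out.println(write(new Date(-1L)));          // -86400000 (previous UTC day)
        System.out.println(read("86400000").getTime());    // 86400000
    }
}
```

Because neither direction calls toString() or valueOf(), a value written in one timezone reads back as the same instant in any other.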

      I think java.sql.Timestamp has a similar issue. When this is fixed, the work in HIVE-5814 should be adjusted to work with getTime() rather than the deprecated day/month/year API it uses now.
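
The suspicion about java.sql.Timestamp can be checked the same way with JDK classes alone. This is a minimal sketch; TimestampTimezoneDemo and its helper methods are hypothetical names, and it says nothing about how Hive's own Timestamp handling is wired.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class TimestampTimezoneDemo {

    /** Returns Timestamp.toString() for epochMillis with zoneId as the JVM default zone. */
    static String renderIn(String zoneId, long epochMillis) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Timestamp(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved); // restore the original default zone
        }
    }

    /** Returns Timestamp.valueOf(text).getTime() with zoneId as the JVM default zone. */
    static long parseIn(String zoneId, String text) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return Timestamp.valueOf(text).getTime();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        System.out.println(renderIn("UTC", 0L));                                   // 1970-01-01 00:00:00.0
        System.out.println(renderIn("America/Los_Angeles", 0L));                   // 1969-12-31 16:00:00.0
        System.out.println(parseIn("UTC", "1970-01-01 00:00:00"));                 // 0
        System.out.println(parseIn("America/Los_Angeles", "1970-01-01 00:00:00")); // 28800000
    }
}
```

Both toString() and valueOf() depend on the default timezone here as well, so text written in one zone and read in another yields a different instant.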

            People

              Assignee: Unassigned
              Reporter: Eugene Koifman (ekoifman)
