Apache Drill / DRILL-4141

Hive Plugin : Timestamp value in an RCfile (Snappy compression) is wrongly interpreted


Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Storage - Hive
    • Labels: None

    Description

      git.commit.id.abbrev=f7a0d38

      The query below should return "1996-02-28 17:32:01.0". Instead, it returns a value that is exactly 8 hours later:

      select timestamp_col from hive.fewtypes_null_compressed_rc_snappy where int_col=20;
      +------------------------+
      |     timestamp_col      |
      +------------------------+
      | 1996-02-29 01:32:01.0  |
      +------------------------+
      1 row selected (0.715 seconds)
      

      This is not caused by the time zone of the sqlline client: the query below, run from the same client, returns the correct result. The only difference is that this table stores the data in a sequence file with Snappy compression:

      select timestamp_col from hive.fewtypes_null_compressed_seq_snappy where int_col=20;
      +------------------------+
      |     timestamp_col      |
      +------------------------+
      | 1996-02-28 17:32:01.0  |
      +------------------------+
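
The gap between the two results can be checked with a few lines of Python. This is only a sketch for quantifying the discrepancy; the helper name `offset_hours` is made up for illustration and is not part of Drill or Hive. A whole-hour shift like this would be consistent with a UTC offset being applied on the RCFile read path, but that is an assumption, not something this report confirms.

```python
from datetime import datetime

# Format of the timestamps as printed by sqlline in this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def offset_hours(expected: str, observed: str) -> float:
    """Return how many hours `observed` is ahead of `expected`."""
    delta = datetime.strptime(observed, FMT) - datetime.strptime(expected, FMT)
    return delta.total_seconds() / 3600


# Values taken from the two query results above.
expected = "1996-02-28 17:32:01.0"  # sequence file table (correct)
observed = "1996-02-29 01:32:01.0"  # RCFile table (incorrect)

print(offset_hours(expected, observed))  # prints 8.0
```

Running the same check against other rows of both tables would show whether the shift is constant, which would help separate a fixed-offset problem from a per-value decoding problem.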
      

      Attachments

        1. fewtypes_null.tbl.gz
          0.9 kB
          Rahul Kumar Challapalli


          People

            Assignee: Unassigned
            Reporter: Rahul Kumar Challapalli (rkins)
