  1. Apache Drill
  2. DRILL-4141

Hive Plugin : Timestamp value in an RCfile (Snappy compression) is wrongly interpreted


    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Storage - Hive
    • Labels:
      None

      Description

      git.commit.id.abbrev=f7a0d38

      The query below should return "1996-02-28 17:32:01.0". Instead, it returns a value that is exactly 8 hours later:

      select timestamp_col from hive.fewtypes_null_compressed_rc_snappy where int_col=20;
      +------------------------+
      |     timestamp_col      |
      +------------------------+
      | 1996-02-29 01:32:01.0  |
      +------------------------+
      1 row selected (0.715 seconds)
      

      This is not related to the timezone of the sqlline client: the query below, run from the same client, returns the correct result. The only difference is that this time the data is stored in a sequence file with Snappy compression.

      select timestamp_col from hive.fewtypes_null_compressed_seq_snappy where int_col=20;
      +------------------------+
      |     timestamp_col      |
      +------------------------+
      | 1996-02-28 17:32:01.0  |
      +------------------------+
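
      For reference, the gap between the two results can be checked with a short script. This is only an illustrative sketch: the variable names are hypothetical, and the two literals are copied from the query outputs above. A whole-hour shift like this would be consistent with a timezone conversion being applied on one read path and not the other, but the report does not confirm that cause.

```python
from datetime import datetime

# Format matching the timestamps printed by sqlline above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

# Value returned from the sequence file table (the expected result).
expected = datetime.strptime("1996-02-28 17:32:01.0", FMT)
# Value returned from the RCFile table (the incorrect result).
observed = datetime.strptime("1996-02-29 01:32:01.0", FMT)

# Difference between the two results, expressed in hours.
offset_hours = (observed - expected).total_seconds() / 3600
print(offset_hours)
```

      The offset is a whole number of hours, so the date and the minute and second fields are otherwise intact; only the hour component (and the resulting day rollover) differs.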
      

        Attachments

        1. fewtypes_null.tbl.gz
          0.9 kB
          Rahul Kumar Challapalli


            People

            • Assignee:
              Unassigned
            • Reporter:
              rkins Rahul Kumar Challapalli
