IMPALA / IMPALA-8184

Add timestamp validation to Orc scanner

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Component/s: Backend

    Description

      Like Parquet files, ORC files can contain timestamps that are not valid in Impala. For example, Hive can insert timestamps before the year 1400, which Impala considers invalid. These invalid timestamps are often handled similarly to NULL, but they are not "real" NULLs, which can lead to some weird behavior:

      Hive:
      create table orcts (ts timestamp) stored as orc;
      insert into orcts values ("1200-01-01");

      Impala:
      select * from orcts where ts is not null;
      Returns 1 row:
      NULL
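
      The expected behavior after validation can be sketched as follows. This is only an illustrative model in Python, not Impala's actual scanner code (the backend is C++). The names validate_timestamp, scan_column and is_not_null_filter are hypothetical. The sketch assumes the documented Impala TIMESTAMP range of 1400-01-01 through 9999-12-31 and converts any out-of-range value to a real NULL at scan time, so that later predicates such as IS NOT NULL see it as NULL.

      ```python
      from datetime import datetime
      from typing import List, Optional

      # Assumed bounds: Impala's documented TIMESTAMP range.
      MIN_VALID = datetime(1400, 1, 1)
      MAX_VALID = datetime(9999, 12, 31, 23, 59, 59, 999999)


      def validate_timestamp(ts: datetime) -> Optional[datetime]:
          """Return ts if it lies in the valid range, otherwise None (a real NULL)."""
          if MIN_VALID <= ts <= MAX_VALID:
              return ts
          return None


      def scan_column(values: List[Optional[datetime]]) -> List[Optional[datetime]]:
          """Hypothetical scanner step: validate every non-NULL value as it is read."""
          return [None if v is None else validate_timestamp(v) for v in values]


      def is_not_null_filter(rows: List[Optional[datetime]]) -> List[datetime]:
          """Model of `WHERE ts IS NOT NULL` applied after the scan."""
          return [r for r in rows if r is not None]


      # The row inserted by Hive in the example above, plus one valid row.
      scanned = scan_column([datetime(1200, 1, 1), datetime(2019, 2, 11)])
      print(is_not_null_filter(scanned))
      ```

      With validation done in the scanner, the 1200-01-01 row becomes NULL before the filter runs, so the query above would return no rows for it instead of one row displaying NULL.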

          People

            csringhofer Csaba Ringhofer
