Apache Drill / DRILL-4342

Drill fails to read a date column from hive generated parquet

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate

    Description

      git.commit.id.abbrev=576271d

      Below is the Hive DDL (using Hive 1.2, which supports the DATE type in Parquet):

      create external table hive1dot2_fewtypes_null (
            int_col int,
            bigint_col bigint,
            date_col date,
            time_col string,
            timestamp_col timestamp,
            interval_col string,
            varchar_col string,
            float_col float,
            double_col double,
            bool_col boolean
          )
      stored as parquet
      location '/drill/testdata/hive_storage/hive1dot2_fewtypes_null';
      

      Query using the Hive storage plugin (returns the expected dates):

      select date_col from hive.hive1dot2_fewtypes_null;
      +-------------+
      |  date_col   |
      +-------------+
      | null        |
      | null        |
      | null        |
      | 1996-01-29  |
      | 1996-03-01  |
      | 1996-03-02  |
      | 1997-02-28  |
      | null        |
      | 1997-03-01  |
      | 1997-03-02  |
      | 2000-04-01  |
      | 2000-04-03  |
      | 2038-04-08  |
      | 2039-04-09  |
      | 2040-04-10  |
      | null        |
      | 1999-02-08  |
      | 1999-03-08  |
      | 1999-01-18  |
      | 2003-01-02  |
      | null        |
      +-------------+
      

      Below is the output when the same data is read through the dfs Parquet reader. Every non-null date differs from the Hive plugin result:

      0: jdbc:drill:zk=10.10.10.41:5181> select date_col from dfs.`/drill/testdata/hive_storage/hive1dot2_fewtypes_null`;
      +-------------+
      |  date_col   |
      +-------------+
      | null        |
      | null        |
      | null        |
      | 369-02-09  |
      | 369-03-12  |
      | 369-03-13  |
      | 368-03-11  |
      | null        |
      | 368-03-12  |
      | 368-03-13  |
      | 365-04-12  |
      | 365-04-14  |
      | 327-04-19  |
      | 326-04-20  |
      | 325-04-21  |
      | null        |
      | 366-02-19  |
      | 366-03-19  |
      | 366-01-29  |
      | 362-01-13  |
      | null        |
      +-------------+
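
The two result sets above can be compared row by row to check whether the mismatch is systematic. The following is a minimal sketch in Python that works only on the values exactly as printed in this report; the helper names (`parse`, `compare`) are illustrative, and the script makes no claim about the root cause or about how the dfs reader decodes the column.

```python
from datetime import date

# Non-null date_col values copied from the two result sets above, in row order.
hive_rows = [
    "1996-01-29", "1996-03-01", "1996-03-02", "1997-02-28", "1997-03-01",
    "1997-03-02", "2000-04-01", "2000-04-03", "2038-04-08", "2039-04-09",
    "2040-04-10", "1999-02-08", "1999-03-08", "1999-01-18", "2003-01-02",
]
dfs_rows = [
    "369-02-09", "369-03-12", "369-03-13", "368-03-11", "368-03-12",
    "368-03-13", "365-04-12", "365-04-14", "327-04-19", "326-04-20",
    "325-04-21", "366-02-19", "366-03-19", "366-01-29", "362-01-13",
]


def parse(text):
    """Split a printed 'year-month-day' value into three integers."""
    year, month, day = (int(part) for part in text.split("-"))
    return year, month, day


def compare(expected, observed):
    """Return (sum of printed years, month/day shift in days) for one row."""
    ey, em, ed = parse(expected)
    oy, om, od = parse(observed)
    # Compare month/day only, placed in an arbitrary non-leap reference year.
    shift = (date(2001, om, od) - date(2001, em, ed)).days
    return ey + oy, shift


# A single element in this set means every row is off in the same way.
pairs = {compare(h, d) for h, d in zip(hive_rows, dfs_rows)}
print(pairs)  # {(2365, 11)}
```

For every non-null row, the printed years add up to the same number and the month/day part is shifted by the same 11 days, so the dfs output looks like a consistent transformation of the stored value rather than random corruption. The null rows appear in the same positions in both outputs.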
      

      I have attached the Parquet file generated by Hive. Let me know if anything else is needed to reproduce this issue.

      Attachments

        1. fewtypes_null.parquet
          3 kB
          Rahul Kumar Challapalli

            People

              Assignee: Unassigned
              Reporter: Rahul Kumar Challapalli (rkins)