  1. Apache Drill
  2. DRILL-4392

CTAS with partition writes an internal field into generated parquet files

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels: None

    Description

      On today's master branch:

      select * from sys.version;
      +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
      |     version     |                 commit_id                 |                           commit_message                            |        commit_time         |   build_email   |         build_time         |
      +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
      | 1.5.0-SNAPSHOT  | 9a3a5c4ff670a50a49f61f97dd838da59a12f976  | DRILL-4382: Remove dependency on drill-logical from vector package  | 16.02.2016 @ 11:58:48 PST  | jni@apache.org  | 16.02.2016 @ 17:40:44 PST  |
      +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
      

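      A Parquet table created by Drill's CTAS statement with a PARTITION BY clause contains an internal field, "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R". This extra field does not affect non-star queries, but it causes star queries (select *) to return incorrect results, because the internal field shows up as an additional column.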

      use dfs.tmp;
      
      create table nation_ctas partition by (n_regionkey) as select * from cp.`tpch/nation.parquet`;
      
      select * from dfs.tmp.nation_ctas limit 6;
      +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
      | n_nationkey  |     n_name     | n_regionkey  |                                                    n_comment                                                    | P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R  |
      +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
      | 5            | ETHIOPIA       | 0            | ven packages wake quickly. regu                                                                                 | true                                   |
      | 15           | MOROCCO        | 0            | rns. blithely bold courts among the closely regular packages use furiously bold platelets?                      | false                                  |
      | 14           | KENYA          | 0            |  pending excuses haggle furiously deposits. pending, express pinto beans wake fluffily past t                   | false                                  |
      | 0            | ALGERIA        | 0            |  haggle. carefully final deposits detect slyly agai                                                             | false                                  |
      | 16           | MOZAMBIQUE     | 0            | s. ironic, unusual asymptotes wake blithely r                                                                   | false                                  |
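      | 24           | UNITED STATES  | 1            | y final packages. slow foxes cajole quickly. quickly silent platelets breach ironic accounts. unusual pinto be  | true                                   |
      +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+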
      

      This effectively breaks every Parquet file created by Drill's CTAS with partition support.
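
      Since non-star queries are reported as unaffected, listing the columns explicitly is one possible way to read such a table without the internal field. The query below is only a sketch that reuses the table and column names from the example above; it has not been verified against the affected build, and it does not remove the field from the files already written.

      ```sql
      -- Sketch: explicit projection instead of select *, so the internal
      -- P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R field is not requested.
      select n_nationkey, n_name, n_regionkey, n_comment
      from dfs.tmp.nation_ctas
      limit 6;
      ```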

      It also causes one of the pre-commit functional tests to fail [1].

      [1] https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/ctas/ctas_auto_partition/general/data/drill3361.q

          People

            jni Jinfeng Ni
            jni Jinfeng Ni
            Khurram Faraaz Khurram Faraaz