Apache Drill: DRILL-1999

Drill should expose the Parquet logical schema rather than the physical schema

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.17.0
    • Component/s: Storage - Parquet
    • Labels: None

    Description

      Created a Parquet file in Hive from a table with the following schema:
      hive> desc alltypesparquet;
      OK
      c1 int
      c2 boolean
      c3 double
      c4 string
      c5 array<int>
      c6 map<int,string>
      c7 map<string,string>
      c8 struct<r:string,s:int,t:double>
      c9 tinyint
      c10 smallint
      c11 float
      c12 bigint
      c13 array<array<string>>
      c15 struct<r:int,s:struct<a:int,b:string>>
      c16 array<struct<m:map<string,string>,n:int>>
      Time taken: 0.076 seconds, Fetched: 15 row(s)

      Column c5, which is an array of integers, shows up as a bag when queried through Drill:
      0: jdbc:drill:> select c5 from `/user/hive/warehouse/alltypesparquet`;
      ------------
      c5
      ------------
      {"bag":[]}
      {"bag":[]}
      {"bag":[{"array_element":1},{"array_element":2}]}
      ------------
      3 rows selected (0.085 seconds)

      The same query from Hive returns the logical array instead:
      hive> select c5 from alltypesparquet;
      OK
      NULL
      NULL
      [1,2]
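
To make the difference between the two outputs concrete, here is a minimal Python sketch of the unwrapping one would expect a reader of the logical schema to perform. It is only an illustration, not Drill's implementation: the function name `unwrap_list` and its parameters are hypothetical, and the wrapper names `bag` and `array_element` are taken from the Drill output above. Note that the sketch maps an empty bag to an empty list; the Drill output shown does not reveal whether the underlying value was NULL, as Hive reports.

```python
import json


def unwrap_list(physical, wrapper="bag", element="array_element"):
    """Collapse the physical list encoding into a plain list.

    Hypothetical helper: `wrapper` and `element` default to the field
    names seen in the Drill output for column c5.
    """
    return [item[element] for item in physical[wrapper]]


# The three rows Drill returned for c5, copied from the query output.
rows = [
    '{"bag":[]}',
    '{"bag":[]}',
    '{"bag":[{"array_element":1},{"array_element":2}]}',
]

logical = [unwrap_list(json.loads(row)) for row in rows]
print(logical)  # [[], [], [1, 2]]
```

The last row comes out as `[1, 2]`, matching what Hive prints for the same record.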

      Attachments

        1. hive_alltypes.parquet
          2 kB
          Ramana Inukonda Nagaraj

            People

              Assignee: Igor Guzenko (ihuzenko)
              Reporter: Ramana Inukonda Nagaraj (inramana)