Apache Drill / DRILL-1999

Drill should expose the Parquet logical schema rather than the physical schema


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.17.0
    • Component/s: Storage - Parquet
    • Labels:
      None

      Description

      Created a Parquet file in Hive with the following DDL:
      hive> desc alltypesparquet;
      OK
      c1 int
      c2 boolean
      c3 double
      c4 string
      c5 array<int>
      c6 map<int,string>
      c7 map<string,string>
      c8 struct<r:string,s:int,t:double>
      c9 tinyint
      c10 smallint
      c11 float
      c12 bigint
      c13 array<array<string>>
      c15 struct<r:int,s:struct<a:int,b:string>>
      c16 array<struct<m:map<string,string>,n:int>>
      Time taken: 0.076 seconds, Fetched: 15 row(s)

      Column c5, which is an array of integers, shows up as a bag when queried through Drill:
      0: jdbc:drill:> select c5 from `/user/hive/warehouse/alltypesparquet`;
      +------------+
      |     c5     |
      +------------+
      | {"bag":[]} |
      | {"bag":[]} |
      | {"bag":[{"array_element":1},{"array_element":2}]} |
      +------------+
      3 rows selected (0.085 seconds)

      While from Hive, the same query returns:
      hive> select c5 from alltypesparquet;
      OK
      NULL
      NULL
      [1,2]
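
      To make the difference concrete, here is a minimal sketch (in Python, not Drill code) of the mapping this issue asks for: collapsing the physical wrapper shown in the Drill output into the logical list that Hive reports. The function name unwrap_legacy_list and its parameters are hypothetical, and the wrapper/field names "bag" and "array_element" are taken only from the output above. Note that the Drill output alone cannot distinguish a NULL list from an empty one, so this sketch returns an empty list where Hive prints NULL.

```python
import json


def unwrap_legacy_list(value, wrapper="bag", element="array_element"):
    """Collapse {"bag": [{"array_element": x}, ...]} into [x, ...].

    Hypothetical helper for illustration; the key names are assumptions
    based on the Drill output in this report.
    """
    # Only touch values that have exactly the wrapper key and nothing else.
    if not (isinstance(value, dict) and set(value) == {wrapper}):
        return value
    unwrapped = []
    for item in value[wrapper]:
        # Each entry is expected to be a single-field record; unwrap it.
        if isinstance(item, dict) and set(item) == {element}:
            unwrapped.append(item[element])
        else:
            unwrapped.append(item)
    return unwrapped


# The three rows Drill returned for column c5.
rows = [
    '{"bag":[]}',
    '{"bag":[]}',
    '{"bag":[{"array_element":1},{"array_element":2}]}',
]
logical = [unwrap_legacy_list(json.loads(r)) for r in rows]
print(logical)  # [[], [], [1, 2]]
```

      The last row matches Hive's [1,2]; the first two show why the NULL versus empty-list question needs information from the Parquet file itself rather than from the rendered output.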

        Attachments

        1. hive_alltypes.parquet
          2 kB
          Ramana Inukonda Nagaraj

              People

              • Assignee:
                ihuzenko Igor Guzenko
                Reporter:
                inramana Ramana Inukonda Nagaraj