  1. Apache Drill
  2. DRILL-1818

Parquet files generated by Drill ignore field names when nested elements are queried

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: Storage - Writer
    • Labels: None

    Description

      I observed this with the attached Parquet file; more comprehensive testing might be needed. The issue is that Drill seems to ignore field names at the leaf level and accesses the data positionally instead.
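
      To make the suspected behavior concrete, here is a minimal Python sketch that contrasts name-based and positional projection of nested fields. It is an illustrative model only and not Drill's Parquet reader code; `project_by_name` and `project_positional` are hypothetical names, and the positional rule (the i-th requested column receives the i-th leaf) is an assumption chosen because it is consistent with the three query results in this report.

```python
# Illustrative model only; this is NOT Drill's Parquet reader code.
# A struct is an ordered list of (field_name, value) pairs, using the
# field order of the first record in the sample data from this report.
trans_info = [("prod_id", [16]), ("purch_flag", "false")]
marketing_info = [("camp_id", 4), ("keywords", ["go", "to", "thing"])]


def project_by_name(struct, names):
    """Expected behavior: resolve each requested field by its name."""
    by_name = dict(struct)
    return [by_name.get(name) for name in names]


def project_positional(struct, names):
    """Assumed buggy behavior: the i-th requested field gets the i-th leaf."""
    return [struct[i][1] for i in range(len(names))]


# trans_info has no 'keywords' field, so a name-based lookup yields None,
# while the positional lookup hands back the prod_id data.
print(project_by_name(trans_info, ["keywords"]))
print(project_positional(trans_info, ["keywords"]))

# For marketing_info, 'keywords' alone receives camp_id (an integer),
# which corresponds to the type mismatch reported in this issue.
print(project_positional(marketing_info, ["keywords"]))

# Requesting both fields in schema order happens to line up correctly.
print(project_positional(marketing_info, ["camp_id", "keywords"]))
```

      Under this model, selecting both fields in schema order masks the problem, which would explain why the two-column query in the repro returns correct data.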

      Below is the repro.
      1. Generate a Parquet file using Drill. The input is the JSON document shown below.

      create table dfs.tmp.sampleparquet as (select trans_id, cast(`date` as date) transdate,cast(`time` as time) transtime, cast(amount as double) amount,`user_info`,`marketing_info`, `trans_info` from dfs.`/Users/nrentachintala/Downloads/sample.json` )

      2. Run the queries.
      Note that trans_info has no field named 'keywords' in the query below, yet data is still returned positionally (the values come from the prod_id column).
      0: jdbc:drill:zk=local> select t.`trans_info`.keywords from dfs.tmp.sampleparquet t where t.`trans_info`.keywords is not null;
      ------------
      EXPR$0
      ------------
      [16]
      []
      [293,90]
      [173,18,121,84,115,226,464,525,35,11,94,45]
      [311,29,5,41]
      ------------

      In the next query, Drill appears to return the first element of marketing_info, camp_id (an integer), for the keywords column. Because the keywords schema is string, the query fails with a type mismatch.

      0: jdbc:drill:zk=local> select t.`marketing_info`.keywords from dfs.tmp.sampleparquet t;

      Query failed: Query failed: Failure while running fragment., You tried to write a VarChar type when you are using a ValueWriter of type NullableBigIntWriterImpl. [ c3761403-b8c5-43c1-8e90-2c4918d1f85c on 10.0.0.20:31010 ]
      [ c3761403-b8c5-43c1-8e90-2c4918d1f85c on 10.0.0.20:31010 ]

      Error: exception while executing query: Failure while executing query. (state=,code=0)

      0: jdbc:drill:zk=local> select t.`marketing_info`.`camp_id`,t.`marketing_info`.keywords from dfs.tmp.sampleparquet t;
      ----------------------
      EXPR$0  EXPR$1
      ----------------------
      4       ["go","to","thing","watch","made","laughing","might","pay","in","your","hold"]
      6       ["pronounce","tree","instead","games","sigh"]
      17      []
      17      ["it's"]
      8       ["fallout"]
      ----------------------

      Sample.json is below:
      {"trans_id":0,"date":"2013-07-26","time":"04:56:59","amount":80.5,"user_info":{"cust_id":28,"device":"IOS5","state":"mt"},"marketing_info":{"camp_id":4,"keywords":["go","to","thing","watch","made","laughing","might","pay","in","your","hold"]},"trans_info":{"prod_id":[16],"purch_flag":"false"}}
      {"trans_id":1,"date":"2013-05-16","time":"07:31:54","amount":100.40,
      "user_info":{"cust_id":86623,"device":"AOS4.2","state":"mi"},"marketing_info":{"camp_id":6,"keywords":["pronounce","tree","instead","games","sigh"]},"trans_info":{"prod_id":[],"purch_flag":"false"}}
      {"trans_id":2,"date":"2013-06-09","time":"15:31:45","amount":20.25,
      "user_info":{"cust_id":11,"device":"IOS5","state":"la"},"marketing_info":{"camp_id":17,"keywords":[]},"trans_info":{"prod_id":[293,90],"purch_flag":"true"}}
      {"trans_id":3,"date":"2013-07-19","time":"11:24:22","amount":500.75,
      "user_info":{"cust_id":666,"device":"IOS5","state":"nj"},"marketing_info":{"camp_id":17,"keywords":["it's"]},"trans_info":{"prod_id":[173,18,121,84,115,226,464,525,35,11,94,45],"purch_flag":"false"}}
      {"trans_id":4,"date":"2013-07-21","time":"08:01:13","amount":34.20,"user_info":{"cust_id":999,"device":"IOS7","state":"ct"},"marketing_info":{"camp_id":8,"keywords":["fallout"]},"trans_info":{"prod_id":[311,29,5,41],"purch_flag":"false"}}
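
      As a quick sanity check on the input, the following Python sketch lists the field names of each nested object in the first sample record. It only inspects the JSON source and says nothing about how Drill writes or reads the Parquet file; `nested_field_names` is a hypothetical helper introduced for illustration.

```python
import json

# First record of the sample JSON, copied from this report.
record = json.loads(
    '{"trans_id":0,"date":"2013-07-26","time":"04:56:59","amount":80.5,'
    '"user_info":{"cust_id":28,"device":"IOS5","state":"mt"},'
    '"marketing_info":{"camp_id":4,"keywords":["go","to","thing","watch",'
    '"made","laughing","might","pay","in","your","hold"]},'
    '"trans_info":{"prod_id":[16],"purch_flag":"false"}}'
)


def nested_field_names(rec):
    """Map each nested object in the record to its ordered field names."""
    return {key: list(value) for key, value in rec.items() if isinstance(value, dict)}


fields = nested_field_names(record)
for parent, names in fields.items():
    print(parent, names)
```

      The listing confirms that `keywords` exists only under `marketing_info`, so a query on `trans_info`.keywords has no matching field in the source data.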

      Attachments

        1. 0_0_0.parquet
          2 kB
          Neeraja
        2. DRILL-1818.patch
          16 kB
          Steven Phillips

            People

              sphillips Steven Phillips
              Neeraja Neeraja