Description
I was using the new SQL interpreter introduced by LIVY-19 and ran into what appears to be a limitation in DECIMAL support.
The set of queries I run roughly looks like this:
drop table if exists sales_fact_1998;

create external table sales_fact_1998(
  product_id int, time_id int, customer_id int, promotion_id int,
  store_id int, store_sales decimal(10,4), store_cost decimal(10,4),
  unit_sales decimal(10,4))
stored as parquet
location '${root}/sales_fact_1998';

select
  customer_id,
  sum(store_sales) - sum(store_cost) gross_profit
from sales_fact_1998
where store_id=${store_id} and time_id=${time_id}
group by customer_id
order by gross_profit desc
limit 10;
The full response I get from Livy is:
{
  "id": 2,
  "code": "\n\nselect\n customer_id,\n sum(store_sales) - sum(store_cost) gross_profit\nfrom\n sales_fact_1998\nwhere\n store_id=12\n and time_id=1037\ngroup by\n customer_id\norder by\n gross_profit desc\nlimit 10",
  "state": "available",
  "output": {
    "status": "error",
    "execution_count": 2,
    "ename": "Error",
    "evalue": "not a primitive class java.math.BigDecimal",
    "traceback": [
      "scala.sys.package$.error(package.scala:27)",
      "org.apache.livy.shaded.json4s.Extraction$.writePrimitive(Extraction.scala:216)",
      "org.apache.livy.shaded.json4s.Extraction$.internalDecomposeWithBuilder(Extraction.scala:127)",
      "org.apache.livy.shaded.json4s.Extraction$.internalDecomposeWithBuilder(Extraction.scala:141)",
      "org.apache.livy.shaded.json4s.Extraction$.internalDecomposeWithBuilder(Extraction.scala:151)",
      "org.apache.livy.shaded.json4s.Extraction$.decomposeWithBuilder(Extraction.scala:67)",
      "org.apache.livy.shaded.json4s.Extraction$.decompose(Extraction.scala:194)",
      "org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:101)",
      "org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:274)",
      "org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:272)",
      "scala.Option.map(Option.scala:146)",
      "org.apache.livy.repl.Session.org$apache$livy$repl$Session$$executeCode(Session.scala:272)",
      "org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168)",
      "org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)",
      "org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)",
      "scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)",
      "scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)",
      "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)",
      "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)",
      "java.lang.Thread.run(Thread.java:748)"
    ]
  },
  "progress": 1.0
}
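The traceback shows json4s's Extraction.writePrimitive rejecting java.math.BigDecimal when the result set is serialized. As an analogy only (this is not Livy's code, and the values are made up), Python's stdlib json encoder fails on decimal.Decimal in the same way, and converting the value to a primitive first fixes it:

```python
import json
from decimal import Decimal

# Analogy: like json4s in the traceback above, Python's stdlib json
# encoder has no built-in handling for Decimal and raises an error.
row = {"customer_id": 5191, "gross_profit": Decimal("52.5930")}
try:
    json.dumps(row)
    error_message = None
except TypeError as e:
    error_message = str(e)

# Converting the value to a primitive first (the JSON-side equivalent
# of the cast-to-double workaround below) makes serialization succeed.
row["gross_profit"] = float(row["gross_profit"])
print(json.dumps(row))
```

This suggests the fix belongs in SQLInterpreter (or the serializer it uses): DECIMAL columns need to be mapped to a primitive representation before decomposition.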
My REST calls are:
POST {} to http://192.168.59.21:9999/sessions

POST {"code": "drop table if exists sales_fact_1998", "kind": "sql"} to http://192.168.59.21:9999/sessions/1/statements

POST {"code": "\ncreate external table sales_fact_1998(\n product_id int,time_id int,customer_id int,promotion_id int,\n store_id int,store_sales decimal(10,4),store_cost decimal(10,4),\n unit_sales decimal(10,4))\nstored as parquet\nlocation '/apps/hive/warehouse/foodmart_parquet.db/sales_fact_1998'", "kind": "sql"} to http://192.168.59.21:9999/sessions/1/statements

POST {"code": "\n\nselect\n customer_id,\n sum(store_sales) - sum(store_cost) gross_profit\nfrom\n sales_fact_1998\nwhere\n store_id=12\n and time_id=1037\ngroup by\n customer_id\norder by\n gross_profit desc\nlimit 10", "kind": "sql"} to http://192.168.59.21:9999/sessions/1/statements
If I change the last statement to use cast(sum(store_sales) - sum(store_cost) as double), everything works, so the failure only occurs when the result column has a DECIMAL type.
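Concretely, the statement body for the working cast variant looks like this (a sketch; only the request payload is built here, since actually posting it needs a live Livy session, and the host/port and session id are from my setup):

```python
import json

# Build the Livy statement body for the cast-to-double workaround.
working_query = (
    "select customer_id,"
    " cast(sum(store_sales) - sum(store_cost) as double) gross_profit"
    " from sales_fact_1998"
    " where store_id=12 and time_id=1037"
    " group by customer_id"
    " order by gross_profit desc"
    " limit 10"
)
body = json.dumps({"code": working_query, "kind": "sql"})
# POST this body to http://192.168.59.21:9999/sessions/1/statements
print(body)
```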
If anyone wants the Parquet file, find it at https://github.com/cartershanklin/structor/tree/master/modules/sample_hive_data/files/foodmart_parquet.db/sales_fact_1998