LIVY-455: Livy Spark SQL "not a primitive class java.math.BigDecimal" when querying decimal columns


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.5.1, 0.6.0

    Description

      I was using the new SQL option introduced by LIVY-19 and ran into what appears to be a limitation with DECIMAL support.

      The set of queries I run roughly looks like this:

      drop table if exists sales_fact_1998;
      create external table sales_fact_1998(
          product_id int,time_id int,customer_id int,promotion_id int,
          store_id int,store_sales decimal(10,4),store_cost decimal(10,4),
          unit_sales decimal(10,4))
      stored as parquet
      location '${root}/sales_fact_1998';
      
      select
          customer_id,
          sum(store_sales) - sum(store_cost) gross_profit
      from
          sales_fact_1998
      where
          store_id=${store_id}
          and time_id=${time_id}
      group by
          customer_id
      order by
          gross_profit desc
      limit 10;
      

      The full response I get from Livy is:

      {"id":2,"code":"\n\nselect\n    customer_id,\n    sum(store_sales) - sum(store_cost) gross_profit\nfrom\n    sales_fact_1998\nwhere\n    store_id=12\n    and time_id=1037\ngroup by\n    customer_id\norder by\n    gross_profit desc\nlimit 10","state":"available","output":{"status":"error","execution_count":2,"ename":"Error","evalue":"not a primitive class java.math.BigDecimal","traceback":["scala.sys.package$.error(package.scala:27)","org.apache.livy.shaded.json4s.Extraction$.writePrimitive(Extraction.scala:216)","org.apache.livy.shaded.json4s.Extraction$.internalDecomposeWithBuilder(Extraction.scala:127)","org.apache.livy.shaded.json4s.Extraction$.internalDecomposeWithBuilder(Extraction.scala:141)","org.apache.livy.shaded.json4s.Extraction$.internalDecomposeWithBuilder(Extraction.scala:151)","org.apache.livy.shaded.json4s.Extraction$.decomposeWithBuilder(Extraction.scala:67)","org.apache.livy.shaded.json4s.Extraction$.decompose(Extraction.scala:194)","org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:101)","org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:274)","org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:272)","scala.Option.map(Option.scala:146)","org.apache.livy.repl.Session.org$apache$livy$repl$Session$$executeCode(Session.scala:272)","org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168)","org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)","org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)","scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)","scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)","java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)","java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)","java.lang.Thread.run(Thread.java:748)"]},"progress":1.0}
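      The traceback shows the failure happening inside the shaded json4s library when Livy serializes the result row: json4s's Extraction only knows how to write a fixed set of JSON primitives, and java.math.BigDecimal (the JVM type backing a Spark DECIMAL column) is not one of them. As an analogy only (not Livy's code), Python's standard json module rejects arbitrary-precision decimals in exactly the same way, and converting to a primitive first fixes it:

      ```python
      import json
      from decimal import Decimal

      # A result row as a DECIMAL-producing query would yield it.
      row = {"customer_id": 1, "gross_profit": Decimal("123.4567")}

      # A serializer that only understands JSON primitives rejects the
      # arbitrary-precision decimal outright -- the Python analogue of
      # json4s's "not a primitive class java.math.BigDecimal" error.
      try:
          json.dumps(row)
      except TypeError as e:
          print("serialization failed:", e)

      # Converting the decimal to a primitive first (which is what the
      # CAST ... AS DOUBLE workaround below does on the SQL side) makes
      # serialization succeed.
      safe_row = {k: float(v) if isinstance(v, Decimal) else v
                  for k, v in row.items()}
      print(json.dumps(safe_row))
      ```

      The eventual fix in Livy would need to do the same kind of conversion on the server side, so clients don't have to cast every DECIMAL column themselves.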
      

      My REST calls are:

      POST {} to http://192.168.59.21:9999/sessions
      
      POST {"code": "drop table if exists sales_fact_1998", "kind": "sql"} to http://192.168.59.21:9999/sessions/1/statements
      
      POST {"code": "\ncreate external table sales_fact_1998(\n    product_id int,time_id int,customer_id int,promotion_id int,\n    store_id int,store_sales decimal(10,4),store_cost decimal(10,4),\n    unit_sales decimal(10,4))\nstored as parquet\nlocation '/apps/hive/warehouse/foodmart_parquet.db/sales_fact_1998'", "kind": "sql"} to http://192.168.59.21:9999/sessions/1/statements
      
      POST {"code": "\n\nselect\n    customer_id,\n    sum(store_sales) - sum(store_cost) gross_profit\nfrom\n    sales_fact_1998\nwhere\n    store_id=12\n    and time_id=1037\ngroup by\n    customer_id\norder by\n    gross_profit desc\nlimit 10", "kind": "sql"} to http://192.168.59.21:9999/sessions/1/statements
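      For reproduction, the statement bodies above can be built with a short script. This is a minimal sketch using only the Python standard library; the host, port, and session id are the ones from the calls above, the sql_statement helper is hypothetical, and the script only constructs and prints the request bodies (actually POSTing them requires a running Livy server):

      ```python
      import json

      # Endpoint details taken from the REST calls above.
      LIVY = "http://192.168.59.21:9999"
      SESSION = 1

      def sql_statement(code: str) -> str:
          """Build the JSON body for a 'sql'-kind Livy statement."""
          return json.dumps({"code": code, "kind": "sql"})

      query = """
      select
          customer_id,
          sum(store_sales) - sum(store_cost) gross_profit
      from
          sales_fact_1998
      where
          store_id=12
          and time_id=1037
      group by
          customer_id
      order by
          gross_profit desc
      limit 10"""

      # This body would be POSTed to <LIVY>/sessions/<SESSION>/statements.
      print("POST", sql_statement(query),
            "to", f"{LIVY}/sessions/{SESSION}/statements")
      ```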
      

      If I change the last statement to use cast(sum(store_sales) - sum(store_cost) as double), everything works.

      If anyone wants the Parquet file, find it at https://github.com/cartershanklin/structor/tree/master/modules/sample_hive_data/files/foodmart_parquet.db/sales_fact_1998


          People

            Assignee: Saisai Shao (jerryshao)
            Reporter: Carter Shanklin (cartershanklin)
            Votes: 0
            Watchers: 2
