SPARK-19230: View creation in Derby gets SQLDataException because definition gets very big


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      Somewhat related to SPARK-6024.
      In our test mockups we have a process that creates a pretty big table definition:

      create table t1 (
      field_name_1 string,
      field_name_2 string,
      field_name_3 string,
      .
      .
      .
      field_name_1000 string
      )

      which succeeds. But then we add some calculated fields on top of it with a view:

      create view v1 as
      select *,
      some_udf(field_name_1) as field_calc1,
      some_udf(field_name_2) as field_calc2,
      .
      .
      some_udf(field_name_10) as field_calc10
      from t1
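
      For reference, a minimal reproduction sketch of what our test does (a sketch only: the session setup and the trivial UDF below are placeholders for our real test harness and some_udf; it assumes Spark 2.1.0 with Hive support and the default embedded Derby metastore):

      import org.apache.spark.sql.SparkSession

      // Assumes a local session backed by the default embedded Derby metastore,
      // as in our unit tests.
      val spark = SparkSession.builder()
        .master("local[*]")
        .appName("wide-view-repro")
        .enableHiveSupport()
        .getOrCreate()

      // Placeholder UDF standing in for some_udf above.
      spark.udf.register("some_udf", (s: String) => if (s == null) null else s.trim)

      // 1000-column table definition; this succeeds.
      val tableCols = (1 to 1000).map(i => s"field_name_$i string").mkString(",\n")
      spark.sql(s"create table t1 (\n$tableCols\n)")

      // View adding calculated fields on top; this is the statement that fails.
      val calcCols = (1 to 10).map(i => s"some_udf(field_name_$i) as field_calc$i").mkString(",\n")
      spark.sql(s"create view v1 as select *,\n$calcCols\nfrom t1")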

      And we get this exception:

      java.sql.SQLDataException: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `field_name_1`, `gen_attr_1` AS `field_name_2&' to length 32700.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeLargeUpdate(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
      at com.jolbox.bonecp.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:205)
      at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeUpdate(ParamLoggingPreparedStatement.java:399)
      at org.datanucleus.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:439)
      at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:410)
      at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertTable(RDBMSPersistenceHandler.java:167)
      at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:143)
      at org.datanucleus.state.JDOStateManager.internalMakePersistent(JDOStateManager.java:3784)
      at org.datanucleus.state.JDOStateManager.makePersistent(JDOStateManager.java:3760)
      at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2219)
      at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065)
      at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913)
      at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
      at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)
      at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
      at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:814)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
      at com.sun.proxy.$Proxy17.createTable(Unknown Source)
      at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1416)
      at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1449)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
      at com.sun.proxy.$Proxy19.create_table_with_environment_context(Unknown Source)
      at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2050)
      at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:97)
      at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:669)
      at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
      at com.sun.proxy.$Proxy20.createTable(Unknown Source)
      at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714)
      at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:425)
      at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:425)
      at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:425)
      at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:283)
      at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:230)
      at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:229)
      at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:272)
      at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:424)
      at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:203)
      at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:191)
      at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:191)
      at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:95)
      at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:191)
      at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:248)
      at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:176)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
      at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
      at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
      at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
      at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
      at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
      at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
      at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
      at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.SparkHcatUtils.execute(SparkHcatUtils.java:24)
      at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.executeHcatSql(HcatUtils.java:48)
      at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.executeHcatSql(HcatUtils.java:38)
      at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.handleHcatTableView(HcatUtils.java:110)
      at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.handleHcatTableView(HcatUtils.java:92)
      at com.paypal.risk.ars.bigdata.test.FlowTestUtils.prepareHcatTable(FlowTestUtils.java:77)
      at com.paypal.risk.ars.bigdata.test.TestPigBase.innerSetup(TestPigBase.java:104)
      at com.paypal.risk.ars.bigdata.hadoop.spark.TestSparkBase.innerSetup(TestSparkBase.scala:65)
      at com.paypal.risk.ars.bigdata.test.AbstractDataDrivenUnitTest.setup(AbstractDataDrivenUnitTest.java:73)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
      at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
      at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
      at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
      at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
      at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
      at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
      at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
      at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
      at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
      at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
      at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
      at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
      at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
      at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
      Caused by: java.sql.SQLException: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `field_name_1`, `gen_attr_1` AS `field_name_2&' to length 32700.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
      ... 118 more
      Caused by: ERROR 22001: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `cust_id`, `gen_attr_1` AS `txn_lmt_a&' to length 32700.
      at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
      at org.apache.derby.iapi.types.SQLLongvarchar.normalize(Unknown Source)
      at org.apache.derby.iapi.types.SQLVarchar.normalize(Unknown Source)
      at org.apache.derby.iapi.types.DataTypeDescriptor.normalize(Unknown Source)
      at org.apache.derby.impl.sql.execute.NormalizeResultSet.normalizeColumn(Unknown Source)
      at org.apache.derby.impl.sql.execute.NormalizeResultSet.normalizeRow(Unknown Source)
      at org.apache.derby.impl.sql.execute.NormalizeResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.InsertResultSet.open(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
      ... 112 more

      Trying to debug this, it looks like the column list in the view definition gets repeated about 3 times:

      SELECT `gen_attr_0` as `field_name_1`,...`gen_attr_1000` as `field_name_1000` FROM (
      SELECT `gen_attr_0`, `gen_attr_1`,..,`gen_attr_1000` FROM (
      SELECT `field_name_1` AS `gen_attr_0`,...,`field_name_1000` AS `gen_attr_1000`) as gen_subquery_0 ) as t1
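
      A rough back-of-the-envelope estimate (a sketch only; the per-column character count below is an assumption, not measured from the actual metastore row) shows why this triple-nested form overflows Derby's 32,700-character LONG VARCHAR limit:

      // Rough size estimate of the expanded view text for 1000 columns.
      val columns = 1000
      val perColumnChars = 35   // assumed average for a pair like `gen_attr_999` AS `field_name_1000`,
      val nestingLevels = 3     // the column list is repeated ~3 times in the expanded definition
      val estimatedLength = columns * perColumnChars * nestingLevels
      println(estimatedLength)  // ~105,000 characters, well past Derby's 32,700 limit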

      Is there any known work-around?


            People

              Assignee: Unassigned
              Reporter: Ohad Raviv (uzadude)
              Votes: 1
              Watchers: 3
