Zeppelin / ZEPPELIN-5026

Cannot create hive table with Spark SQL


Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0
    • Fix Version/s: None
    • Component/s: Interpreters, spark

    Description

      Hello Zeppelin Team,

      I am trying to use Spark on Zeppelin after using PySpark on Jupyter for a long time.

      • Hive version is 2.3.7.
      • Spark version is 2.4.6 with Scala 2.11.
      • Hive metastore uses PostgreSQL.

      At present, SELECT and JOIN queries run without errors. However, when I execute CREATE TABLE, I get the error "MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)".

      SQL:

       

      CREATE TABLE default.simpletest ( a int, b string )

      My Spark interpreter config:

       

      Name                           Value
      master                         yarn-client
      spark.jars.packages            org.apache.spark:spark-hive_2.11:2.4.6,org.apache.spark:spark-sql_2.11:2.4.6,org.postgresql:postgresql:42.2.16
      zeppelin.spark.useHiveContext  true
      spark.sql.warehouse.dir        hdfs://node1:9000/user/hive/warehouse/
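
      The package coordinates in this config can be sanity-checked against the versions stated in the description (Spark 2.4.6, Scala 2.11). The following is a minimal Python sketch for that check only; `check_packages` is a hypothetical helper written for illustration, not part of Zeppelin or Spark, and its rules are assumptions.

```python
# Hypothetical helper, not part of Zeppelin or Spark: flags Spark artifacts in
# spark.jars.packages whose Scala suffix or version differs from the stated ones.

SPARK_VERSION = "2.4.6"  # Spark version stated in the report
SCALA_BINARY = "2.11"    # Scala binary version stated in the report

# Value of spark.jars.packages copied from the interpreter config in the report.
PACKAGES = (
    "org.apache.spark:spark-hive_2.11:2.4.6,"
    "org.apache.spark:spark-sql_2.11:2.4.6,"
    "org.postgresql:postgresql:42.2.16"
)


def check_packages(packages, spark_version, scala_binary):
    """Return a list of human-readable mismatches (empty if none are found)."""
    problems = []
    for coord in (c.strip() for c in packages.split(",")):
        if not coord:
            continue
        parts = coord.split(":")
        if len(parts) != 3:
            problems.append(f"{coord}: expected group:artifact:version")
            continue
        group, artifact, version = parts
        if group != "org.apache.spark":
            continue  # e.g. the PostgreSQL JDBC driver is versioned independently
        # Spark artifacts are named <module>_<scala binary version>.
        _, _, scala = artifact.rpartition("_")
        if scala != scala_binary:
            problems.append(f"{coord}: Scala suffix is not {scala_binary}")
        if version != spark_version:
            problems.append(f"{coord}: version is not {spark_version}")
    return problems


if __name__ == "__main__":
    for problem in check_packages(PACKAGES, SPARK_VERSION, SCALA_BINARY):
        print(problem)
```

      An empty result only rules out a Spark or Scala mismatch among the listed packages. It says nothing about whether the Hive client classes Spark loads are compatible with the Hive 2.3.7 metastore.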

       

       Full error logs:

      java.lang.reflect.InvocationTargetException
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.zeppelin.spark.SparkSqlInterpreter.internalInterpret(SparkSqlInterpreter.java:106)
      	at org.apache.zeppelin.interpreter.AbstractInterpreter.interpret(AbstractInterpreter.java:47)
      	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:110)
      	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:684)
      	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:577)
      	at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
      	at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
      	at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:39)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.);
      	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
      	at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:236)
      	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
      	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:324)
      	at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:130)
      	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
      	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
      	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
      	at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
      	at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
      	at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
      	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
      	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
      	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
      	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
      	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
      	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
      	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
      	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
      	... 15 more
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
      	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:484)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:482)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:482)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:277)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:215)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:214)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:260)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:482)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:278)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:236)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:236)
      	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
      	... 33 more
      Caused by: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
      	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.reconnect(HiveMetaStoreClient.java:308)
      	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:148)
      	at com.sun.proxy.$Proxy34.createTable(Unknown Source)
      	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714)
      	... 45 more
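
      The innermost "Caused by" section is the most informative part of the trace above: the MetaException is thrown from HiveMetaStoreClient.reconnect, called by RetryingMetaStoreClient.invoke, so it may be masking an earlier failure of the createTable call rather than describing it. As a minimal sketch for pulling that section out of a long log, the Python below uses an abbreviated copy of the trace; `root_cause` is a hypothetical helper name, and the parsing assumes the usual JVM "Caused by:" and "at ..." line layout.

```python
import re

# Abbreviated copy of the trace from this report (most frames omitted).
TRACE = """\
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.);
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
    ... 15 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
    ... 33 more
Caused by: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.reconnect(HiveMetaStoreClient.java:308)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:148)
    ... 45 more
"""

CAUSE_RE = re.compile(r"^\s*Caused by:\s*(.+)$")
FRAME_RE = re.compile(r"^\s*at\s+(\S.*)$")  # "... N more" lines do not match


def root_cause(trace):
    """Return (message, frames) for the innermost 'Caused by:' section.

    Returns (None, []) if the trace has no 'Caused by:' line.
    """
    message, frames = None, []
    for line in trace.splitlines():
        cause = CAUSE_RE.match(line)
        if cause:
            # Each new 'Caused by:' replaces the previous one, so the last wins.
            message, frames = cause.group(1).strip(), []
            continue
        frame = FRAME_RE.match(line)
        if frame and message is not None:
            frames.append(frame.group(1))
    return message, frames


if __name__ == "__main__":
    message, frames = root_cause(TRACE)
    print(message)
    for frame in frames:
        print("  at", frame)
```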

       


          People

            Assignee: Unassigned
            Reporter: akizminet NGUYEN VAN PHAM
