Spark / SPARK-22967

VersionsSuite fails on Windows due to Windows-format paths

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.1
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels:
    • Environment: Windows 7
    • Flags: Patch

      Description

      On Windows, two unit test cases fail when running VersionsSuite ("A simple set of tests that call the methods of a `HiveClient`, loading different version of hive from maven central.")

      Failed A: test(s"$version: read avro file containing decimal")

      org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
      

      Failed B: test(s"$version: SPARK-17920: Insert into/overwrite avro table")

      Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
      org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
      

      As I dug into this problem, I found it is related to ParserUtils#unescapeSQLString().

      These are the first two lines of failed test A:

      val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
      val location = new File(url.getFile)
      

      And in my environment, the value of `location` is

      D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
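
      The test then interpolates this path into a CREATE TABLE statement, roughly as sketched below (a simplified sketch; the actual DDL in VersionsSuite also configures the Avro SerDe and schema). This is how the backslashes end up inside a single-quoted SQL string literal that the parser must later unescape:

      versionSpark.sql(
        s"""
           |CREATE TABLE tab1 (c1 DECIMAL(38, 18))
           |STORED AS AVRO
           |LOCATION '$location'
         """.stripMargin)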
      

      And then, in SparkSqlParser#visitCreateHiveTable()#L1128:

      val location = Option(ctx.locationSpec).map(visitLocationSpec)
      

      This line first retrieves the LocationSpecContext's content, which equals `location` above.
      The content is then passed to visitLocationSpec(), and finally to
      unescapeSQLString().
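
      For reference, that plumbing looks roughly like this (paraphrased from Spark's AstBuilder and ParserUtils; exact code may differ across versions):

      // AstBuilder: the LOCATION clause's STRING token is unescaped before use.
      override def visitLocationSpec(ctx: LocationSpecContext): String = withOrigin(ctx) {
        string(ctx.STRING)
      }

      // ParserUtils: string() strips the quotes by unescaping the raw token text.
      def string(token: Token): String = unescapeSQLString(token.getText)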

      Let's have a look at unescapeSQLString():

      /** Unescape backslash-escaped string enclosed by quotes. */
        def unescapeSQLString(b: String): String = {
          var enclosure: Character = null
          val sb = new StringBuilder(b.length())
      
          def appendEscapedChar(n: Char) {
            n match {
              case '0' => sb.append('\u0000')
              case '\'' => sb.append('\'')
              case '"' => sb.append('\"')
              case 'b' => sb.append('\b')
              case 'n' => sb.append('\n')
              case 'r' => sb.append('\r')
              case 't' => sb.append('\t')
              case 'Z' => sb.append('\u001A')
              case '\\' => sb.append('\\')
              // The following 2 lines are exactly what MySQL does TODO: why do we do this?
              case '%' => sb.append("\\%")
              case '_' => sb.append("\\_")
              case _ => sb.append(n)
            }
          }
      
          var i = 0
          val strLength = b.length
          while (i < strLength) {
            val currentChar = b.charAt(i)
            if (enclosure == null) {
              if (currentChar == '\'' || currentChar == '\"') {
                enclosure = currentChar
              }
            } else if (enclosure == currentChar) {
              enclosure = null
            } else if (currentChar == '\\') {
      
              if ((i + 6 < strLength) && b.charAt(i + 1) == 'u') {
                // \u0000 style character literals.
      
                val base = i + 2
                val code = (0 until 4).foldLeft(0) { (mid, j) =>
                  val digit = Character.digit(b.charAt(j + base), 16)
                  (mid << 4) + digit
                }
                sb.append(code.asInstanceOf[Char])
                i += 5
              } else if (i + 4 < strLength) {
                // \000 style character literals.
      
                val i1 = b.charAt(i + 1)
                val i2 = b.charAt(i + 2)
                val i3 = b.charAt(i + 3)
      
                if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
                  val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
                  sb.append(tmp)
                  i += 3
                } else {
                  appendEscapedChar(i1)
                  i += 1
                }
              } else if (i + 2 < strLength) {
                // escaped character literals.
                val n = b.charAt(i + 1)
                appendEscapedChar(n)
                i += 1
              }
            } else {
              // non-escaped character literals.
              sb.append(currentChar)
            }
            i += 1
          }
          sb.toString()
        }
      

      Again, here the parameter `b` holds the same content as `location`:

      D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
      

      From unescapeSQLString()'s handling we can see that it turns the two-character sequence "\t" into the tab escape character '\t' and drops every other backslash.
      So our originally correct location ends up as:

      D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
      

      after unescapeSQLString() completes.
      Note that [ \t ] here is no longer a two-character string but a literal tab character.
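
      To reproduce the mangling in isolation, one can call the method directly (note the input must keep its surrounding quotes, since unescaping only happens inside a quote enclosure):

      import org.apache.spark.sql.catalyst.parser.ParserUtils

      // The raw STRING token text, quotes included, exactly as the parser sees it:
      val raw =
        """'D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal'"""
      println(ParserUtils.unescapeSQLString(raw))
      // Every "\x" pair loses its backslash, and "\t" becomes a real tab character.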

      Then, back in SparkSqlParser#visitCreateHiveTable(), move to L1134:

      val locUri = location.map(CatalogUtils.stringToURI(_))
      

      `location` is passed to stringToURI(), which results in:

      file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
      

      because the tab character '\t' is percent-encoded as '%09' in the URI.
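
      The percent-encoding step can be reproduced with plain java.net.URI (a standalone approximation of what CatalogUtils.stringToURI ends up doing; the "\t" sequences below are real tab characters in the Scala string literal):

      import java.net.URI

      val mangled = "/D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal"
      // The multi-argument URI constructor quotes illegal characters,
      // so each tab becomes %09, matching the value above.
      println(new URI("file", null, mangled, null))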

      Although I'm not clear on how this wrong path directly causes the exception, since I know almost nothing about Hive, I can verify that the wrong path is the real cause of the failure.

      When I append these lines (to correct the wrong path) after HiveExternalCatalog#doCreateTable(), lines 236-240:

      if (tableLocation.get.getPath.startsWith("/D")) {
        tableLocation = Some(CatalogUtils.stringToURI(
          "file:/D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal"))
      }
      

      then the failed test A passes, while test B still fails.
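
      A more general way to sidestep the problem (my own suggestion, not necessarily the proper fix) is to normalize the resource path to a forward-slash form before interpolating it into SQL, so that unescapeSQLString never sees a backslash:

      val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
      // File.toURI always yields a forward-slash path, even on Windows:
      // /D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal
      val location = new java.io.File(url.getFile).toURI.getPath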

      Below is the stack trace of the exception:

      org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
      	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:469)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:273)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:256)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:467)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:263)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
      	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
      	at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
      	at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:119)
      	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:304)
      	at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:128)
      	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
      	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
      	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
      	at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
      	at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
      	at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3196)
      	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
      	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3195)
      	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
      	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:71)
      	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
      	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
      	at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24$$anonfun$apply$mcV$sp$3.apply$mcV$sp(VersionsSuite.scala:829)
      	at org.apache.spark.sql.hive.client.VersionsSuite.withTable(VersionsSuite.scala:70)
      	at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply$mcV$sp(VersionsSuite.scala:828)
      	at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
      	at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
      	at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
      	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      	at org.scalatest.Transformer.apply(Transformer.scala:22)
      	at org.scalatest.Transformer.apply(Transformer.scala:20)
      	at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
      	at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
      	at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
      	at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
      	at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
      	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
      	at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
      	at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
      	at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
      	at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
      	at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
      	at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
      	at scala.collection.immutable.List.foreach(List.scala:381)
      	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
      	at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
      	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
      	at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
      	at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
      	at org.scalatest.Suite$class.run(Suite.scala:1147)
      	at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
      	at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
      	at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
      	at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
      	at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
      	at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
      	at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
      	at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
      	at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
      	at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
      	at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
      	at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1334)
      	at scala.collection.immutable.List.foreach(List.scala:381)
      	at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1334)
      	at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
      	at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
      	at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1500)
      	at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
      	at org.scalatest.tools.Runner$.run(Runner.scala:850)
      	at org.scalatest.tools.Runner.run(Runner.scala)
      	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
      	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
      Caused by: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1121)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
      	at com.sun.proxy.$Proxy31.create_table_with_environment_context(Unknown Source)
      	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
      	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
      	at com.sun.proxy.$Proxy32.createTable(Unknown Source)
      	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
      	... 78 more
      Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
      	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
      	at org.apache.hadoop.fs.Path.<init>(Path.java:184)
      	at org.apache.hadoop.fs.Path.getParent(Path.java:357)
      	at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:427)
      	at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:690)
      	at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:194)
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1059)
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
      	... 93 more
      
      

      As for test B, I didn't inspect it as carefully, but I found the same kind of wrong path as in test A, so I suspect both failures share the same root cause.


    People

    • Assignee: Ngone51 (wuyi)
    • Reporter: Ngone51 (wuyi)
    • Votes: 0
    • Watchers: 3
