SPARK-46612 (sub-task of SPARK-47361 Improve JDBC data sources, project Spark)

ClickHouse's JDBC driver throws `java.lang.IllegalArgumentException: Unknown data type: string` when writing an array of strings with Apache Spark (Scala)

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      This issue is also reported on ClickHouse's GitHub: https://github.com/ClickHouse/clickhouse-java/issues/1505

      Bug description

      When Spark (Scala) writes an array of strings to ClickHouse, the driver throws `java.lang.IllegalArgumentException: Unknown data type: string`.

      Exception is thrown by: https://github.com/ClickHouse/clickhouse-java/blob/aa3870eadb1a2d3675fd5119714c85851800f076/clickhouse-data/src/main/java/com/clickhouse/data/ClickHouseDataType.java#L238

      The cause is that Spark's JdbcUtils converts the array element type name to lower case (String -> string), and the ClickHouse driver does not recognize the lowercased name:
      https://github.com/apache/spark/blob/6b931530d75cb4f00236f9c6283de8ef450963ad/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L639
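      The failure mode described above can be illustrated without Spark or ClickHouse. The following is only a minimal sketch: `LowercaseLookupSketch`, `knownTypes`, `lookup`, and `lowerCased` are hypothetical names invented for this illustration, and the lookup is a simplified stand-in for a case-sensitive type registry, not the driver's actual code.

```scala
import java.util.Locale
import scala.util.Try

object LowercaseLookupSketch {
  // Hypothetical stand-in for a case-sensitive type registry.
  // The real lookup lives in the ClickHouse driver (ClickHouseDataType).
  private val knownTypes = Set("String", "Int32", "Float64")

  // Returns the type name if it is known, otherwise fails the same way
  // the reported exception does.
  def lookup(name: String): String =
    if (knownTypes.contains(name)) name
    else throw new IllegalArgumentException(s"Unknown data type: $name")

  // Mirrors the lowercasing step described in this issue.
  def lowerCased(typeName: String): String =
    typeName.toLowerCase(Locale.ROOT)

  val lowered: String = lowerCased("String")

  // The original name resolves; the lowercased name does not.
  val originalLookup: Try[String] = Try(lookup("String"))
  val loweredLookup: Try[String] = Try(lookup(lowered))
}
```

      In this sketch, `originalLookup` succeeds while `loweredLookup` fails with an `IllegalArgumentException`, which matches the shape of the error reported here.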

      Steps to reproduce

      1. Create a ClickHouse table with a string array column (https://clickhouse.com/).
      2. Write data to the table from Spark (Scala) through ClickHouse's JDBC driver (https://github.com/ClickHouse/clickhouse-java):
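            // Code excerpt: requires a Scala Spark job with the ClickHouse JDBC driver
            // on the classpath. `spark` is assumed to be an existing SparkSession.
            import java.util.Properties
            import org.apache.spark.sql.{Row, SaveMode}
            import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

            // Schema with a single array-of-strings column
            val clickHouseSchema = StructType(
              Seq(
                StructField("str_array", ArrayType(StringType))
              )
            )
            val data = Seq(
              Row(
                Seq("a", "b")
              )
            )

            val clickHouseDf =
              spark.createDataFrame(spark.sparkContext.parallelize(data), clickHouseSchema)

            val props = new Properties
            props.put("user", "default")
            // The write fails with "Unknown data type: string"
            clickHouseDf.write
              .mode(SaveMode.Append)
              .option("driver", "com.clickhouse.jdbc.ClickHouseDriver") // class name as a string
              .jdbc("jdbc:clickhouse://localhost:8123/foo", "bar", props)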

        Fix

            People

              phanhuyn Nguyen Phan Huy