Spark / SPARK-36758

SQL column nullable setting not retained as part of spark read


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None
    • Environment: Databricks Runtime 8.2, Spark 3.1.1

    Description

      When reading a column defined as NOT NULL in the source database, the constraint is not retained by spark.read: every column in the resulting DataFrame is reported as nullable = true.

      Is there a way to change this behaviour so that the nullability from the source is retained?

      See here for more info

      https://github.com/microsoft/sql-spark-connector/issues/121


      Example code from databricks:

      tableName = "dbo.MyTable"

      df = (spark.read
          .format("com.microsoft.sqlserver.jdbc.spark")
          .option("url", myJdbcUrl)
          .option("accessToken", accessToken)
          .option("dbTable", tableName)
          .load())

      df.printSchema()
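      A common workaround (not an official connector fix) is to rebuild the schema with nullable=False on the columns known to be NOT NULL at the source, then recreate the DataFrame from the underlying RDD. A minimal sketch, using a locally constructed DataFrame in place of the SQL Server read above (the column names here are hypothetical):

      ```python
      from pyspark.sql import SparkSession
      from pyspark.sql.types import StructField, StructType

      spark = SparkSession.builder.master("local[1]").appName("nullable-demo").getOrCreate()

      # Hypothetical stand-in for the DataFrame returned by the JDBC read above;
      # like that read, this yields columns reported as nullable = true.
      df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
      df.printSchema()

      # Rebuild the schema, forcing nullable=False on every field, then
      # round-trip through the RDD so Spark applies the stricter schema.
      strict_schema = StructType(
          [StructField(f.name, f.dataType, nullable=False) for f in df.schema.fields]
      )
      strict_df = spark.createDataFrame(df.rdd, strict_schema)
      strict_df.printSchema()
      ```

      Note that this only changes the DataFrame's metadata: Spark does not re-check the source, so it should only be applied to columns the database actually guarantees are NOT NULL.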



          People

            Assignee: Unassigned
            Reporter: Darren Price (darrenpricedcww)
            Votes: 0
            Watchers: 3
