  1. Spark
  2. SPARK-41281 Feature parity: SparkSession API in Spark Connect
  3. SPARK-42022

createDataFrame should autogenerate missing column names


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Connect
    • Labels: None

    Description

      pyspark/sql/tests/test_types.py:233 (TypesParityTests.test_infer_schema_not_enough_names)
      ['col1', '_2'] != ['col1']
      
      Expected :['col1']
      Actual   :['col1', '_2']
      
      self = <pyspark.sql.tests.connect.test_parity_types.TypesParityTests testMethod=test_infer_schema_not_enough_names>
      
          def test_infer_schema_not_enough_names(self):
              df = self.spark.createDataFrame([["a", "b"]], ["col1"])
      >       self.assertEqual(df.columns, ["col1", "_2"])
      
      ../test_types.py:236: AssertionError
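
      The test expects any column without a user-supplied name to get a positional default of the form `_N` (1-based), so two fields with the single name "col1" should yield ["col1", "_2"]. A minimal sketch of that padding rule in plain Python follows. The helper name `fill_missing_column_names` is hypothetical and is not part of the PySpark or Spark Connect API; it only illustrates the behavior the assertion checks, not how the fix was implemented.

```python
from typing import List, Sequence


def fill_missing_column_names(names: Sequence[str], num_fields: int) -> List[str]:
    """Pad `names` with positional defaults ("_N", 1-based) up to `num_fields`.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    filled = list(names)
    # Every position beyond the supplied names gets "_<position>", e.g. "_2".
    filled.extend(f"_{i}" for i in range(len(filled) + 1, num_fields + 1))
    return filled


row = ["a", "b"]
print(fill_missing_column_names(["col1"], len(row)))  # ['col1', '_2']
```

      With this rule, the data and names from the failing test produce the column list that `assertEqual` expects.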
      
      


          People

            Assignee: Hyukjin Kwon (gurwls223)
            Reporter: Hyukjin Kwon (gurwls223)
            Votes: 0
            Watchers: 2
