  1. Spark
  2. SPARK-44076 SPIP: Python Data Source API
  3. SPARK-46568

Python data source options should be a case-insensitive dictionary


Details

    Description

      Data source options are stored as a `CaseInsensitiveStringMap` in Scala; however, the behavior is inconsistent in Python:

      from pyspark.sql.datasource import DataSource

      class MyDataSource(DataSource):
          def __init__(self, options):
              # Keys arrive lowercased, so this lookup misses and returns None
              self.api_key = options.get("API_KEY")

      spark.read.format(..).option("API_KEY", my_key).load(...)

      Currently, `options` does not contain the key "API_KEY" because all option keys are converted to lowercase on the Scala side, so the value is only reachable under "api_key". This can be confusing to users.
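
One way the requested behavior could look is sketched below. This is only an illustration, not the implementation adopted in Spark: the class name `CaseInsensitiveDict` and the sample value are assumptions, and the sketch assumes all keys are strings, as data source option keys are.

```python
from collections import UserDict


class CaseInsensitiveDict(UserDict):
    """Hypothetical mapping that matches string keys without regard to case."""

    def __setitem__(self, key, value):
        # Normalize on write so every spelling of a key shares one entry.
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        # Normalize on read; UserDict's get() is routed through this method.
        return super().__getitem__(key.lower())

    def __contains__(self, key):
        return super().__contains__(key.lower())

    def __delitem__(self, key):
        super().__delitem__(key.lower())


# Simulates what the Python side receives: keys already lowercased.
options = CaseInsensitiveDict({"api_key": "secret"})
api_key = options.get("API_KEY")
```

With a wrapper along these lines, `options.get("API_KEY")` in the data source's `__init__` would find the value regardless of the casing used in `.option(...)`, which matches how `CaseInsensitiveStringMap` behaves on the Scala side.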


            People

              allisonwang-db Allison Wang
