Spark / SPARK-49763

[CSV Reader] Add Flag to Control Inference of Time-Only Columns as String or Timestamp During Schema Detection

Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.5.2
    • Fix Version/s: None
    • Component/s: SQL

    Description

      This task adds a configurable flag that controls how the CSV reader infers time-only columns (columns whose values contain a time of day but no date) during schema detection. By default, Spark infers such columns as Timestamp, which can lead to unintended behavior in some use cases. The new flag lets users choose whether time-only columns are inferred as Timestamp or as String.

      Key Changes:

      • Introduce a flag (e.g., inferStringTypeForTimeOnlyColumn).
      • When the flag is set to true, time-only columns will be inferred as String.
      • When the flag is set to false (default), time-only columns will be inferred as Timestamp.
      • Update documentation to reflect the new option.
      • Ensure backward compatibility by defaulting to the current behavior.
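
      The intended flag semantics can be sketched in plain Python. This is a minimal illustration only, not Spark's inference code: the helper names, the regular expression used to recognize time-only values, and the use of ISO-8601 parsing for full timestamps are all assumptions made for the example. Only the flag name follows the one suggested above.

```python
import re
from datetime import datetime

# Assumed definition of a "time-only" value: H:MM or HH:MM, optionally with
# seconds and fractional seconds. Field ranges are not validated here.
TIME_ONLY = re.compile(r"^\d{1,2}:\d{2}(:\d{2}(\.\d{1,9})?)?$")


def is_time_only(value: str) -> bool:
    """Return True if the value looks like a time of day with no date part."""
    return bool(TIME_ONLY.match(value.strip()))


def is_full_timestamp(value: str) -> bool:
    """Return True if the value parses as an ISO-8601 date-time (stand-in check)."""
    try:
        datetime.fromisoformat(value.strip())
        return True
    except ValueError:
        return False


def infer_column_type(values, infer_string_type_for_time_only_column=False):
    """Infer 'timestamp' or 'string' for a column of raw CSV strings.

    The keyword argument mirrors the proposed (hypothetical) option:
    False keeps the current behavior, True infers time-only columns as string.
    """
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return "string"
    if all(is_time_only(v) for v in non_empty):
        # This is the only branch the proposed flag affects.
        return "string" if infer_string_type_for_time_only_column else "timestamp"
    if all(is_full_timestamp(v) for v in non_empty):
        return "timestamp"
    return "string"


if __name__ == "__main__":
    times = ["09:15:00", "23:59:59"]
    print(infer_column_type(times))
    print(infer_column_type(times, infer_string_type_for_time_only_column=True))
```

      In this sketch the flag changes the result only for columns in which every non-empty value is time-only. Columns holding full timestamps are inferred the same way regardless of the flag, which matches the backward-compatibility requirement listed above.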


            People

              Assignee: Unassigned
              Reporter: Babu Mahesh (prbabumahesh)
              Votes: 0
              Watchers: 1
