  1. Spark
  2. SPARK-39279 Fasten the schema inference of CSV/JSON data source
  3. SPARK-39280

Speed up Timestamp type inference with user-provided format in JSON/CSV data source


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: SQL
    • Labels: None

    Description

      The optimization for DefaultTimestampFormatter was implemented in #36562; this ticket adds the same optimization for user-provided formats. The basic idea is to keep the formatter from throwing an exception on input that does not match, rather than relying on throwing and catching an exception to determine whether parsing succeeded.
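
      The idea can be illustrated with plain java.time, outside of Spark. The sketch below is not the code from the ticket: the class TimestampProbe and its methods are hypothetical names chosen for illustration. It relies on DateTimeFormatter.parseUnresolved, which reports a mismatch by returning null and setting an error index on the ParsePosition instead of throwing DateTimeParseException, so rejecting a non-timestamp string during inference does not pay for building an exception.

```java
import java.text.ParsePosition;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

// Hypothetical helper for illustration only; not part of Spark.
public class TimestampProbe {

    // Returns true when the whole of `text` matches `formatter`.
    // A mismatch is signalled by a null result or an error index,
    // so no exception is thrown and caught on the failure path.
    public static boolean matches(DateTimeFormatter formatter, String text) {
        ParsePosition pos = new ParsePosition(0);
        TemporalAccessor parsed = formatter.parseUnresolved(text, pos);
        return parsed != null
            && pos.getErrorIndex() < 0
            && pos.getIndex() == text.length(); // reject trailing characters
    }

    // Convenience overload taking a user-provided pattern string.
    public static boolean matchesPattern(String pattern, String text) {
        return matches(DateTimeFormatter.ofPattern(pattern), text);
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm:ss"; // example user-provided format
        System.out.println(matchesPattern(pattern, "2022-05-24 10:15:30"));
        System.out.println(matchesPattern(pattern, "not a timestamp"));
    }
}
```

      Note that this check only confirms that the text has the shape described by the pattern. Turning the parsed fields into an actual timestamp is a separate resolution step, which can still fail for values that are syntactically well-formed but out of range.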

          People

            Jia Fan (fanjia)
            Gengliang Wang (Gengliang.Wang)