  1. Spark
  2. SPARK-23202

Add new API in DataSourceWriter: onDataWriterCommit

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels: None

    Description

      The current DataSourceWriter API makes it hard to implement the equivalent of FileCommitProtocol's onTaskCommit(taskCommit: TaskCommitMessage).
      In general, on receiving a commit message, the driver can start processing it (e.g. persisting messages to files) before all the messages have been collected.

      The proposal is to add a new API:
      add(WriterCommitMessage message): handles a commit message as soon as it is received from a successful data writer.

      This should make the whole API of DataSourceWriter compatible with FileCommitProtocol, and more flexible.
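
      The idea can be sketched as follows. This is a minimal, self-contained illustration and not Spark's actual API: CommitMessage, FileCommitMessage, IncrementalWriter and IncrementalCommitSketch are hypothetical stand-ins, and "persisting" a message is simulated by appending it to an in-memory list. The point is only that per-message work happens in add(), so the job-level commit has nothing left to collect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer's commit message; not Spark's type.
interface CommitMessage {
    String describe();
}

// Hypothetical message naming one file produced by a successful task.
final class FileCommitMessage implements CommitMessage {
    private final String path;

    FileCommitMessage(String path) {
        this.path = path;
    }

    @Override
    public String describe() {
        return path;
    }
}

// Sketch of a driver-side writer that handles each message on arrival.
final class IncrementalWriter {
    private final List<String> persisted = new ArrayList<>();
    private boolean committed = false;

    // Called once per successful data writer, before the job-level commit.
    void add(CommitMessage message) {
        if (committed) {
            throw new IllegalStateException("job already committed");
        }
        // Stand-in for real work, e.g. appending to a manifest file.
        persisted.add(message.describe());
    }

    // Job-level commit: the per-message work has already been done.
    void commit() {
        committed = true;
    }

    // Returns a copy so callers cannot mutate the writer's state.
    List<String> persisted() {
        return new ArrayList<>(persisted);
    }

    boolean isCommitted() {
        return committed;
    }
}

final class IncrementalCommitSketch {
    public static void main(String[] args) {
        IncrementalWriter writer = new IncrementalWriter();
        // Messages are handled one at a time, as tasks finish.
        writer.add(new FileCommitMessage("part-00000"));
        writer.add(new FileCommitMessage("part-00001"));
        writer.commit();
        System.out.println(writer.persisted());
    }
}
```

      In this sketch the driver never has to buffer every message before acting on one, which is the behavior the proposed add(...) callback is meant to allow.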

      A more radical approach was attempted in #20386. Adding a new API, as in #20454, is more reasonable.

          People

            Assignee: Gengliang Wang
            Reporter: Gengliang Wang
            Votes: 0
            Watchers: 3
