  1. Spark
  2. SPARK-48918

Create a unified SQL Scala interface shared by regular SQL and Connect.


Details

    • Type: Epic
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Connect, SQL
    • Labels: None
    • Epic Name: Unified SQL Scala Interface

    Description

      Motivation

      Currently, the Scala sql/core and Connect modules expose the same API; Connect implements a subset of the functionality of the sql/core API. Compatibility between the two implementations is enforced by MiMa checks.

      While this sort of works for application development, it is not ideal for a couple of reasons:

      • An application developer needs to pick which API to develop against while setting up their project (they need to select the correct dependencies). While it is true that they can change this later, it does put a mental burden on the developer. A much preferred solution would be to defer binding to an implementation until you run the code.
      • (Minor) the current setup confuses IDEs, and is more of a pain to work with especially for Spark developers.
      • Developing and maintaining the Spark API is more difficult because of the added burden of working with MiMa and/or adding the same API in more places.
      • Connect testing is fairly anaemic. We have seen a couple of cases where Connect behaves slightly differently, and these could have been detected if Connect were able to leverage Spark SQL's extensive testing.

      Goals

      • Create a truly shared Scala API with two implementations. The goal is not to replace/simplify/reduce the current sql/core API we all love; the interface will only support the API shared between the implementations. An implementation can provide additional functionality (e.g. RDD centric methods for the sql/core implementation).
      • The common interface should cover all APIs supported by the current Connect Scala client.
      • Maintain as much binary compatibility with previous Spark releases as possible.
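
      To make the first goal concrete, here is a minimal, self-contained sketch of one shared interface with two implementations. It is an illustration only, not the proposed design: every name in it (UnifiedApiSketch, Session, Dataset, ClassicSession, ConnectSession, localPartitions, sessionFor, evens) is hypothetical and does not correspond to actual Spark classes. The "connect" side is simulated with a deferred local computation instead of a real remote call.

```scala
// Hypothetical sketch: none of these names are real Spark APIs.
object UnifiedApiSketch {

  // Shared interface: only operations that both implementations support.
  trait Dataset[T] {
    def filter(p: T => Boolean): Dataset[T]
    def collect(): Seq[T]
  }

  trait Session {
    def backend: String
    def range(start: Long, end: Long): Dataset[Long]
  }

  // Stand-in for the sql/core implementation.
  final class ClassicDataset[T](data: Seq[T]) extends Dataset[T] {
    def filter(p: T => Boolean): ClassicDataset[T] = new ClassicDataset[T](data.filter(p))
    def collect(): Seq[T] = data

    // Implementation-specific extra, standing in for RDD centric methods.
    // It is deliberately absent from the shared Dataset trait.
    def localPartitions(n: Int): Seq[Seq[T]] = {
      require(n > 0, "n must be positive")
      val chunk = math.max(1, (data.size + n - 1) / n)
      data.grouped(chunk).toSeq
    }
  }

  // Stand-in for the Connect implementation: work is deferred until collect(),
  // loosely mimicking a plan that is only executed on request.
  final class ConnectDataset[T](plan: () => Seq[T]) extends Dataset[T] {
    def filter(p: T => Boolean): ConnectDataset[T] =
      new ConnectDataset[T](() => plan().filter(p))
    def collect(): Seq[T] = plan()
  }

  object ClassicSession extends Session {
    val backend: String = "classic"
    def range(start: Long, end: Long): ClassicDataset[Long] =
      new ClassicDataset[Long](start until end)
  }

  object ConnectSession extends Session {
    val backend: String = "connect"
    def range(start: Long, end: Long): ConnectDataset[Long] =
      new ConnectDataset[Long](() => start until end)
  }

  // Binding to an implementation is deferred until run time.
  def sessionFor(mode: String): Session = mode match {
    case "classic" => ClassicSession
    case "connect" => ConnectSession
    case other     => throw new IllegalArgumentException(s"unknown mode: $other")
  }

  // Application code written only against the shared interface.
  def evens(session: Session, n: Long): Seq[Long] =
    session.range(0, n).filter(_ % 2 == 0).collect()
}
```

      In this sketch, evens compiles against the shared traits only, so the same application code (and, in principle, the same tests) can run against either implementation. Code that needs localPartitions has to hold a ClassicDataset explicitly, which mirrors the idea that an implementation may offer more than the common interface.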

      Design Notes

      • We are going to try to make the interface very Connect-centric. Where possible, we will implement functionality using the Connect API.
      • .... TBD


          People

            Assignee: Unassigned
            Reporter: Herman van Hövell (hvanhovell)