Flink / FLINK-12308

Support Python language in Flink Table API

    Details

      Description

      Flink offers the DataStream API, the DataSet API, and the Table API & SQL. Of these, the Table API is set to become the first-class citizen: it is declarative and can be optimized automatically, as noted in the Flink mid-term roadmap by Stephan. We therefore plan to support Python at the Table API level first, to serve the large number of analytics users. The goals for the Python Table API are as follows:

      • Users can write Flink Table API jobs in Python, and the Python API should mirror the Java / Scala Table API.
      • Users can submit Python Table API jobs in the following ways:
        • Submit a job as a Python script, integrated with `flink run`
        • Submit a job as a Python script through the REST service
        • Submit a job interactively, similar to `scala-shell`
        • Debug locally in an IDE
      • Users can write custom functions (UDF, UDTF, UDAF).
      • Pandas functions can be used in the Flink Python Table API.

      A more detailed description can be found in FLIP-38.

      At the API level, the plan is as follows:

      • Short term:
        We may initially take a simple approach and map the Python Table API to the Java Table API via Py4J.
      • Long term:
        We may need to create a Python API that follows the same structure as Flink's Table API and produces a language-independent DAG (as Stephan already mentioned on the mailing list thread).
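
      The two approaches can be contrasted with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from FLIP-38 or from the actual implementation: `FakeJavaTable` stands in for a JVM-side Table object that would really be reached through a Py4J gateway, `Table` is a thin wrapper that only forwards calls to it (short term), and `PlanTable` builds a plan in Python and serializes it to JSON as one possible language-independent form (long term).

```python
import json


class FakeJavaTable:
    """Hypothetical stand-in for a JVM-side Table object.

    In the short-term plan the real object would live in the JVM and be
    reached through Py4J; this class only records calls so that the
    sketch runs without a JVM.
    """

    def __init__(self, ops=()):
        self.ops = tuple(ops)

    def select(self, fields):
        return FakeJavaTable(self.ops + (("select", fields),))

    def filter(self, predicate):
        return FakeJavaTable(self.ops + (("filter", predicate),))


class Table:
    """Short-term style: a thin Python wrapper that forwards every call."""

    def __init__(self, j_table):
        self._j_table = j_table  # the wrapped "Java" object

    def select(self, fields):
        return Table(self._j_table.select(fields))

    def filter(self, predicate):
        return Table(self._j_table.filter(predicate))


class PlanTable:
    """Long-term style: build the plan in Python, then serialize it."""

    def __init__(self, source, ops=()):
        self.source = source
        self.ops = tuple(ops)

    def select(self, fields):
        op = {"op": "select", "fields": fields}
        return PlanTable(self.source, self.ops + (op,))

    def filter(self, predicate):
        op = {"op": "filter", "predicate": predicate}
        return PlanTable(self.source, self.ops + (op,))

    def to_plan(self):
        # JSON is only an illustrative choice of language-independent format.
        return json.dumps({"source": self.source, "ops": list(self.ops)})


# The same user-facing calls work against either backend.
wrapped = Table(FakeJavaTable()).filter("amount > 10").select("user, amount")
plan = PlanTable("orders").filter("amount > 10").select("user, amount").to_plan()
```

      In the wrapper style the Python layer holds no logic of its own, so it mirrors the Java API with little effort but depends on a running JVM. In the plan style the Python layer owns the job description, and any runtime that understands the serialized plan could execute it.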

            People

            • Assignee: sunjincheng (sunjincheng121)
              Reporter: sunjincheng (sunjincheng121)
            • Votes: 0
              Watchers: 12

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 12h