Flink / FLINK-17146

Support conversion between PyFlink Table and Pandas DataFrame

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Blocker
    • Resolution: Resolved
    • Affects Version/s: None
    • Fix Version/s: 1.11.0
    • Component/s: API / Python
    • Labels: None

    Description

      The Pandas DataFrame is the de facto standard for working with tabular data in the Python community, and the PyFlink Table is Flink's representation of tabular data in Python. It would be useful for the PyFlink Table API to support conversion between a PyFlink Table and a Pandas DataFrame, which has the following benefits:

      • Users can switch seamlessly between PyFlink and Pandas when processing data in Python, running part of a pipeline on one execution engine and the rest on the other. For example, users who already have a Pandas DataFrame and need to run an expensive transformation on it can convert it to a PyFlink Table and use the Flink engine. Conversely, users can convert a PyFlink Table to a Pandas DataFrame and transform it with the rich functionality of the Pandas ecosystem.
      • No intermediate connectors are needed when converting between them.

      More details can be found in FLIP-120.
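
      The round trip described above might look roughly like the following sketch. It assumes the `from_pandas` and `to_pandas` methods found in PyFlink releases that include this feature, and a recent release for `TableEnvironment.create` and `col`. The function name `pandas_round_trip`, the `score` column, and the filter threshold are hypothetical and chosen only for illustration.

```python
def pandas_round_trip(pdf):
    """Convert a Pandas DataFrame to a PyFlink Table and back (a sketch)."""
    # Imported lazily so this module still loads where PyFlink is missing.
    from pyflink.table import EnvironmentSettings, TableEnvironment
    from pyflink.table.expressions import col

    # Assumption: a recent PyFlink release. Older releases create the
    # table environment differently.
    settings = EnvironmentSettings.in_streaming_mode()
    t_env = TableEnvironment.create(settings)

    # Pandas -> PyFlink: column names are taken from the DataFrame.
    table = t_env.from_pandas(pdf)

    # An illustrative transformation that is executed by the Flink engine.
    # The "score" column is a hypothetical name used only in this example.
    filtered = table.filter(col("score") > 1.0)

    # PyFlink -> Pandas: runs the job and collects the rows on the client.
    return filtered.to_pandas()


if __name__ == "__main__":
    import pandas as pd

    source = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 1.5, 2.5]})
    print(pandas_round_trip(source))
```

      Note that converting a table to a DataFrame collects the whole result on the client, so it is best suited to results that fit in local memory.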

            People

              Assignee: Dian Fu (dian.fu)
              Reporter: Dian Fu (dian.fu)
              Votes: 0
              Watchers: 1