Details
- New Feature
- Status: Closed
- Blocker
- Resolution: Resolved
- None
- None
Description
The Pandas DataFrame is the de facto standard for working with tabular data in the Python community. A PyFlink table is Flink’s representation of tabular data in Python. It would be useful to support conversion between a PyFlink table and a Pandas DataFrame in the PyFlink Table API, which has the following benefits:
- It lets users switch between PyFlink and Pandas seamlessly when processing data in Python, running each step on whichever execution engine suits it. For example, users who already have a Pandas DataFrame at hand and want to perform an expensive transformation on it could convert it to a PyFlink table and leverage the power of the Flink engine. Conversely, users could convert a PyFlink table to a Pandas DataFrame and transform it with the rich functionality of the Pandas ecosystem.
- No intermediate connectors are needed when converting between them.
More details can be found in FLIP-120.
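The round trip described above might look roughly like the following sketch. It assumes the conversion is exposed as `TableEnvironment.from_pandas` and `Table.to_pandas`, as in released PyFlink versions, and that a recent PyFlink release (with `EnvironmentSettings.in_batch_mode` and `pyflink.table.expressions.col`) is installed. The helper names `make_sample_frame` and `round_trip` and the sample columns are hypothetical and only serve the illustration.

```python
import pandas as pd


def make_sample_frame() -> pd.DataFrame:
    """Build a small Pandas DataFrame to use as input (illustrative data)."""
    return pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 1.5, 2.5]})


def round_trip(pdf: pd.DataFrame) -> pd.DataFrame:
    """Convert Pandas -> PyFlink table, filter in Flink, convert back to Pandas."""
    # Imported here so this module can be loaded without a Flink installation.
    from pyflink.table import EnvironmentSettings, TableEnvironment
    from pyflink.table.expressions import col

    # Assumption: batch mode is sufficient for a bounded, in-memory DataFrame.
    t_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())

    # Pandas -> PyFlink: no intermediate connector is involved.
    table = t_env.from_pandas(pdf, list(pdf.columns))

    # Run a transformation on the Flink engine.
    filtered = table.filter(col("score") > 1.0)

    # PyFlink -> Pandas: the result is collected back to the client.
    return filtered.to_pandas()


if __name__ == "__main__":
    print(round_trip(make_sample_frame()))
```

Because `to_pandas` collects the whole result to the client, this direction is only suitable when the resulting table fits in local memory.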
Issue Links
- is blocked by
  - FLINK-14807 Add TableResult#collect api for fetching data to client (Resolved)