[FLINK-34923] Behavioral discrepancy between `TableEnvironment.execute_sql()` and `TableEnvironment.sql_query()` - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.19.0
Fix Version/s: None
Component/s: API / Python
Labels:
None

Language:
- python

Description

I found that there is some behavioral discrepancy between `TableEnvironment.execute_sql()` and `TableEnvironment.sql_query()`.

A minimal reproducible example:

SELECT `value` FROM (VALUES (CAST(ARRAY[ROW(1, 2), ROW(2, 2)] AS ARRAY<ROW<`a` INT, `b` INT>>))) AS `t`(`value`)

This throws

File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/table.py:943, in Table.to_pandas(self)
    939 import pytz
    940 timezone = pytz.timezone(
    941     self._j_table.getTableEnvironment().getConfig().getLocalTimeZone().getId())
    942 serializer = ArrowSerializer(
--> 943     create_arrow_schema(self.get_schema().get_field_names(),
    944                         self.get_schema().get_field_data_types()),
    945     self.get_schema().to_row_data_type(),
    946     timezone)
    947 import pyarrow as pa
    948 table = pa.Table.from_batches(serializer.load_from_iterator(batches_iterator))

File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/types.py:2194, in create_arrow_schema(field_names, field_types)
   2190 """
   2191 Create an Arrow schema with the specified filed names and types.
   2192 """
   2193 import pyarrow as pa
-> 2194 fields = [pa.field(field_name, to_arrow_type(field_type), field_type._nullable)
   2195           for field_name, field_type in zip(field_names, field_types)]
   2196 return pa.schema(fields)

File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/types.py:2194, in <listcomp>(.0)
   2190 """
   2191 Create an Arrow schema with the specified filed names and types.
   2192 """
   2193 import pyarrow as pa
-> 2194 fields = [pa.field(field_name, to_arrow_type(field_type), field_type._nullable)
   2195           for field_name, field_type in zip(field_names, field_types)]
   2196 return pa.schema(fields)

File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/types.py:2316, in to_arrow_type(data_type)
   2314 elif isinstance(data_type, ArrayType):
   2315     if type(data_type.element_type) in [LocalZonedTimestampType, RowType]:
-> 2316         raise ValueError("%s is not supported to be used as the element type of ArrayType." %
   2317                          data_type.element_type)
   2318     return pa.list_(to_arrow_type(data_type.element_type))
   2319 elif isinstance(data_type, RowType):

ValueError: ROW is not supported to be used as the element type of ArrayType.

when I tried to execute it with `TableEnvironment.sql_query()`, but works when I tried it with `TableEnvironment.execute_sql()`:

+----+--------------------------------+
| op |                          value |
+----+--------------------------------+
| +I |               [(1, 2), (2, 2)] |
+----+--------------------------------+

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Chloe He

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 23/Mar/24 00:12

Updated:: 23/Mar/24 00:12