FLINK-34923

Behavioral discrepancy between `TableEnvironment.execute_sql()` and `TableEnvironment.sql_query()`


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.19.0
    • Fix Version/s: None
    • Component/s: API / Python
    • Labels: None

    Description

      I found a behavioral discrepancy between `TableEnvironment.execute_sql()` and `TableEnvironment.sql_query()`: the same query succeeds with one and fails with the other.

      A minimal reproducible example:

      SELECT `value` FROM (VALUES (CAST(ARRAY[ROW(1, 2), ROW(2, 2)] AS ARRAY<ROW<`a` INT, `b` INT>>))) AS `t`(`value`) 

      This throws the following error, raised from `Table.to_pandas()`:

      File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/table.py:943, in Table.to_pandas(self)
          939 import pytz
          940 timezone = pytz.timezone(
          941     self._j_table.getTableEnvironment().getConfig().getLocalTimeZone().getId())
          942 serializer = ArrowSerializer(
      --> 943     create_arrow_schema(self.get_schema().get_field_names(),
          944                         self.get_schema().get_field_data_types()),
          945     self.get_schema().to_row_data_type(),
          946     timezone)
          947 import pyarrow as pa
          948 table = pa.Table.from_batches(serializer.load_from_iterator(batches_iterator))
      
      File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/types.py:2194, in create_arrow_schema(field_names, field_types)
         2190 """
         2191 Create an Arrow schema with the specified filed names and types.
         2192 """
         2193 import pyarrow as pa
      -> 2194 fields = [pa.field(field_name, to_arrow_type(field_type), field_type._nullable)
         2195           for field_name, field_type in zip(field_names, field_types)]
         2196 return pa.schema(fields)
      
      File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/types.py:2194, in <listcomp>(.0)
         2190 """
         2191 Create an Arrow schema with the specified filed names and types.
         2192 """
         2193 import pyarrow as pa
      -> 2194 fields = [pa.field(field_name, to_arrow_type(field_type), field_type._nullable)
         2195           for field_name, field_type in zip(field_names, field_types)]
         2196 return pa.schema(fields)
      
      File ~/anaconda3/envs/ibis-dev-flink/lib/python3.10/site-packages/pyflink/table/types.py:2316, in to_arrow_type(data_type)
         2314 elif isinstance(data_type, ArrayType):
         2315     if type(data_type.element_type) in [LocalZonedTimestampType, RowType]:
      -> 2316         raise ValueError("%s is not supported to be used as the element type of ArrayType." %
         2317                          data_type.element_type)
         2318     return pa.list_(to_arrow_type(data_type.element_type))
         2319 elif isinstance(data_type, RowType):
      
      ValueError: ROW is not supported to be used as the element type of ArrayType. 

      The error occurs when I execute the query with `TableEnvironment.sql_query()`. The same query works when I execute it with `TableEnvironment.execute_sql()`:

      +----+--------------------------------+
      | op |                          value |
      +----+--------------------------------+
      | +I |               [(1, 2), (2, 2)] |
      +----+--------------------------------+ 
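The two code paths described above can be sketched as follows. This is a minimal sketch, not the reporter's actual script: the helper names (`make_table_env`, `run_with_execute_sql`, `run_with_sql_query`) are hypothetical, and the streaming-mode environment is an assumption inferred from the `op` column in the printed result. The `to_pandas()` call is taken from the traceback, which shows the error being raised there rather than in `sql_query()` itself.

```python
# Sketch only: PyFlink (the report is against 1.19.0) is needed to actually run
# the helpers below. It is imported lazily so the module loads without it.

# The query from the report, unchanged.
QUERY = (
    "SELECT `value` FROM (VALUES (CAST(ARRAY[ROW(1, 2), ROW(2, 2)] "
    "AS ARRAY<ROW<`a` INT, `b` INT>>))) AS `t`(`value`)"
)


def make_table_env():
    """Create a TableEnvironment (streaming mode is an assumption)."""
    from pyflink.table import EnvironmentSettings, TableEnvironment

    return TableEnvironment.create(EnvironmentSettings.in_streaming_mode())


def run_with_execute_sql(t_env):
    """Path reported as working: execute the query and print the result."""
    t_env.execute_sql(QUERY).print()


def run_with_sql_query(t_env):
    """Path reported as failing: build a Table, then convert it to pandas.

    Per the traceback, the ValueError comes from the Arrow schema conversion
    inside Table.to_pandas(), which rejects ARRAY<ROW<...>>.
    """
    table = t_env.sql_query(QUERY)
    return table.to_pandas()
```

With PyFlink installed, calling `run_with_execute_sql(make_table_env())` would be expected to print the table shown above, while `run_with_sql_query(make_table_env())` would be expected to raise the `ValueError` from the traceback. If so, the discrepancy likely lies in the Arrow-based `to_pandas()` conversion rather than in how the two methods plan the query.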

      People

        Assignee: Unassigned
        Reporter: Chloe He (chloehe)