Flink / FLINK-8950

"Materialize" Tables to avoid recomputation.


Details

    Description

      Currently, Table objects of the Table API / SQL are treated like virtual views, i.e., all relational operators that have been applied to them are only recorded, and they are translated when a Table is emitted to a TableSink or converted into a DataSet or DataStream.

      If a Table is accessed twice, the (sub-)query that it represents is translated into a DataSet or DataStream program twice and therefore also executed twice, which is inefficient. Currently, the only way to avoid this is to convert the Table into a DataSet or DataStream, which causes the optimizer to generate a plan, and then to register the result back as a new Table.

      We should offer a method to internally "materialize" a Table object, i.e., to optimize the query, generate a plan, and register the plan as an internal table. All queries / operations that are evaluated on the materialized Table will then start from the same DataSet or DataStream, so that it is not computed multiple times.
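
To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model, not Flink code: `VirtualTable`, `collect()`, and `materialize()` are hypothetical names invented for illustration, and a memoized `Supplier` stands in for "translate the plan once and register it as an internal table". It only shows why two accesses to a virtual view run the sub-query twice, while two accesses to a materialized one share a single run.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Toy model only: these are NOT Flink classes or Flink APIs.
final class MaterializeSketch {

    /** A "virtual view": it stores a recipe (plan) that is re-run on every access. */
    static final class VirtualTable<T> {
        private final Supplier<List<T>> plan;

        VirtualTable(Supplier<List<T>> plan) {
            this.plan = plan;
        }

        /** Stands in for emitting to a sink or converting to a DataSet/DataStream. */
        List<T> collect() {
            return plan.get();
        }

        /** Hypothetical materialize(): the plan runs at most once, later reads share the result. */
        VirtualTable<T> materialize() {
            final Supplier<List<T>> original = plan;
            return new VirtualTable<>(new Supplier<List<T>>() {
                private List<T> cached;

                @Override
                public synchronized List<T> get() {
                    if (cached == null) {
                        cached = original.get();
                    }
                    return cached;
                }
            });
        }
    }

    /** Returns {runs when accessed twice as a virtual view, runs when accessed twice after materialize()}. */
    static int[] demo() {
        AtomicInteger runs = new AtomicInteger();

        // The "expensive" sub-query; the counter records every execution.
        VirtualTable<Integer> evens = new VirtualTable<>(() -> {
            runs.incrementAndGet();
            return Arrays.asList(1, 2, 3, 4, 5, 6).stream()
                    .filter(x -> x % 2 == 0)
                    .collect(Collectors.toList());
        });

        // Two downstream accesses to the virtual view: the sub-query runs for each one.
        evens.collect();
        evens.collect();
        int virtualRuns = runs.get();

        // Two downstream accesses to the materialized table: the sub-query runs once.
        runs.set(0);
        VirtualTable<Integer> materialized = evens.materialize();
        materialized.collect();
        materialized.collect();
        int materializedRuns = runs.get();

        return new int[] {virtualRuns, materializedRuns};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("virtual view runs: " + r[0] + ", materialized runs: " + r[1]);
    }
}
```

In this model the caching is hidden behind the same `VirtualTable` type, which mirrors the proposal that a materialized Table should still behave like an ordinary Table for all subsequent queries.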

          People

            Assignee: Unassigned
            Reporter: Fabian Hueske (fhueske)
            Votes: 1
            Watchers: 8
