FLINK-1444: Add data properties for data sources


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Implemented
    • 0.9
    • None
    • None

    Description

      This issue proposes to add support for attaching data properties to data sources. These data properties are defined with respect to input splits.
      Possible properties are:

      • partitioning across splits: all elements with the same key (or key combination) are contained in a single split
      • sorting / grouping within splits: elements are sorted or grouped on certain keys within each split
      • key uniqueness: a certain key (or key combination) is unique across all elements of the data source. Unlike the other two, this property is not defined with respect to input splits.

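      The optimizer can leverage this information to generate more efficient execution plans, for example by skipping a repartitioning or sorting step when the data is already partitioned or sorted on the required keys.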

      The InputFormat is responsible for generating input splits such that the promised data properties actually hold. If they do not, the program will produce invalid results.
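To make the three properties concrete, here is a minimal, self-contained sketch of how one might check that a set of splits honors them. It is an illustration only: `SplitPropertyCheck` and its methods are hypothetical names, not part of Flink, and each split is modeled simply as a list of integer keys.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a Flink class): each split is modeled as a list of keys.
final class SplitPropertyCheck {

    // Partitioning across splits: every key occurs in at most one split.
    static boolean isPartitioned(List<List<Integer>> splits) {
        Map<Integer, Integer> owner = new HashMap<>();
        for (int i = 0; i < splits.size(); i++) {
            for (Integer key : splits.get(i)) {
                Integer previous = owner.putIfAbsent(key, i);
                if (previous != null && previous != i) {
                    return false; // key already seen in a different split
                }
            }
        }
        return true;
    }

    // Sorting within splits: keys are in ascending order inside each split.
    static boolean isSortedWithinSplits(List<List<Integer>> splits) {
        for (List<Integer> split : splits) {
            for (int i = 1; i < split.size(); i++) {
                if (split.get(i - 1) > split.get(i)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Grouping within splits: equal keys are adjacent inside each split.
    static boolean isGroupedWithinSplits(List<List<Integer>> splits) {
        for (List<Integer> split : splits) {
            Set<Integer> closed = new HashSet<>(); // keys whose group has ended
            Integer current = null;
            for (Integer key : split) {
                if (!key.equals(current)) {
                    if (current != null) {
                        closed.add(current);
                    }
                    if (closed.contains(key)) {
                        return false; // key reappears after its group ended
                    }
                    current = key;
                }
            }
        }
        return true;
    }

    // Key uniqueness: no key occurs twice anywhere in the data source.
    static boolean isUnique(List<List<Integer>> splits) {
        Set<Integer> seen = new HashSet<>();
        for (List<Integer> split : splits) {
            for (Integer key : split) {
                if (!seen.add(key)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Integer>> splits = Arrays.asList(
                Arrays.asList(1, 1, 3),
                Arrays.asList(2, 4, 4));
        System.out.println("partitioned: " + isPartitioned(splits));
        System.out.println("sorted:      " + isSortedWithinSplits(splits));
        System.out.println("grouped:     " + isGroupedWithinSplits(splits));
        System.out.println("unique:      " + isUnique(splits));
    }
}
```

In this example the splits are partitioned, sorted, and grouped on the key, but the key is not unique, since 1 and 4 each occur twice. The uniqueness check ignores split boundaries, in line with the note that this property is not defined with respect to input splits.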

          People

            Assignee: Fabian Hueske (fhueske)
            Reporter: Fabian Hueske (fhueske)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: