Flink / FLINK-9586

Make collection sources parallelisable


Details

    Description

      The note at https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/datastream_api.html#collection-data-sources states that collection data sources are mainly intended for testing and do not support parallelism. I believe this is an unnecessary restriction: there are surely plenty of use cases where the data to be distributed is already at hand. It seems strange that Flink cannot parallelise a fixed collection of inputs, which forces users to write their collections to a text file and read them back just to get parallelism.
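
      To illustrate the request, here is a minimal sketch in plain Java (no Flink dependency) of how a fixed collection could be divided among parallel subtasks. The class name `CollectionSplitter` and the method `split` are hypothetical and are not part of the Flink API; this only shows the kind of partitioning a parallel collection source would need, not how Flink would implement it.

      ```java
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.List;

      // Hypothetical helper, not part of Flink: shows one way to divide a
      // fixed in-memory collection among a number of parallel subtasks.
      class CollectionSplitter {

          // Splits `items` into `parallelism` contiguous partitions whose
          // sizes differ by at most one element.
          static <T> List<List<T>> split(List<T> items, int parallelism) {
              if (parallelism < 1) {
                  throw new IllegalArgumentException("parallelism must be >= 1");
              }
              List<List<T>> partitions = new ArrayList<>(parallelism);
              int base = items.size() / parallelism;      // minimum size of each partition
              int remainder = items.size() % parallelism; // first `remainder` partitions get one extra
              int start = 0;
              for (int i = 0; i < parallelism; i++) {
                  int size = base + (i < remainder ? 1 : 0);
                  partitions.add(new ArrayList<>(items.subList(start, start + size)));
                  start += size;
              }
              return partitions;
          }

          public static void main(String[] args) {
              List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
              List<List<Integer>> parts = split(data, 3);
              for (int i = 0; i < parts.size(); i++) {
                  System.out.println("subtask " + i + " -> " + parts.get(i));
              }
          }
      }
      ```

      Each subtask would then emit only its own partition, so no intermediate text file would be needed to distribute the data.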

       


          People

            Assignee: Unassigned
            Reporter: Sina Madani (sinadoom)

