Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-32598

Spill data from feedback edge to disk to avoid possible OOM

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • ml-2.4.0
    • None

    Description

      In Flink ML, we use feedback edge to implement the iteration module. Suppose the job topology is like `OpA -> HeadOperator -> OpB -> TailOperator`, then the basic process of each iteration is as follows:

      • At the first iteration, HeadOperator takes the input from OpA and forward it to OpB.
      • Later, OpB consumes the input from HeadOperator and forward the output to TailOperator.
      • Finally, TailOperator puts the records into a memory message queue and HeadOperator consumes the message queue.

      When the output from OpB contains many records and these records cannot be consumed soon, the message queue would grow big and finally lead to OOM.

      Attachments

        Activity

          People

            Unassigned Unassigned
            zhangzp Zhipeng Zhang
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: