  1. Flink
  2. FLINK-17820

Memory threshold is ignored for channel state


Details

    Description

      The config parameter state.backend.fs.memory-threshold is ignored for channel state, so each subtask writes one file per checkpoint regardless of the size of its channel state.

      This also causes slow cleanup and delays the next checkpoint.

       

      The problem is that ChannelStateCheckpointWriter.finishWriteAndResult calls flush(), which actually flushes the data to disk.
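The effect can be reproduced with a simplified model. This is a sketch, not Flink's actual code: ThresholdStream, ThresholdDemo, and the 1024-byte threshold are hypothetical names and values chosen for illustration. The stream keeps data in memory until the threshold is exceeded, but its flush() spills eagerly, so a single flush() call forces a file even for a few bytes of state.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical model of a threshold-based checkpoint stream (not Flink's class). */
class ThresholdStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int memoryThreshold;
    private final Path target;
    private OutputStream file; // opened lazily, only when data is spilled

    ThresholdStream(int memoryThreshold, Path target) {
        this.memoryThreshold = memoryThreshold;
        this.target = target;
    }

    @Override
    public void write(int b) throws IOException {
        buffer.write(b);
        if (buffer.size() > memoryThreshold) {
            spill(); // state is too large to keep in memory
        }
    }

    /** Eager flush: always spills, regardless of how little data is buffered. */
    @Override
    public void flush() throws IOException {
        spill();
    }

    private void spill() throws IOException {
        if (file == null) {
            file = Files.newOutputStream(target); // creates the file
        }
        buffer.writeTo(file);
        buffer.reset();
        file.flush();
    }

    /** Returns true if the state ended up in a file, false if it stayed in memory. */
    boolean finish() throws IOException {
        if (file == null) {
            return false;
        }
        spill();
        file.close();
        return true;
    }
}

class ThresholdDemo {
    /** Writes 16 bytes against a 1024-byte threshold and reports whether a file was created. */
    static boolean writeSmallState(boolean flushBeforeFinish) {
        try {
            Path target = Files.createTempFile("channel-state", ".bin");
            Files.delete(target); // keep only the unique name
            ThresholdStream out = new ThresholdStream(1024, target);
            out.write(new byte[16]);
            if (flushBeforeFinish) {
                out.flush();
            }
            boolean spilled = out.finish();
            Files.deleteIfExists(target);
            return spilled;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this model, writeSmallState(false) keeps the state in memory, while writeSmallState(true) produces a file even though the state is far below the threshold.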

       

      From FSDataOutputStream.flush Javadoc:

      A completed flush does not mean that the data is necessarily persistent. Data persistence can only be assumed after calls to close() or sync().
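The same distinction exists in plain java.io, which may help as an analogy. The sketch below uses only standard JDK classes, not Flink's FSDataOutputStream; the class and method names are made up for illustration. flush() moves buffered bytes out of the stream, and a separate sync call asks the OS to persist them.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FlushVsSync {
    /** Writes data to a temp file, flushes, syncs, and returns the resulting file size. */
    static long writeAndSync(byte[] data) {
        try {
            Path path = Files.createTempFile("flush-vs-sync", ".bin");
            try (FileOutputStream file = new FileOutputStream(path.toFile());
                 BufferedOutputStream out = new BufferedOutputStream(file)) {
                out.write(data);
                // Empties the Java-side buffer into the OS; makes no durability promise.
                out.flush();
                // Asks the OS to force the written bytes to the storage device.
                file.getFD().sync();
            }
            long size = Files.size(path);
            Files.delete(path);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```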

       

      Possible solutions:

      1. Do not flush in ChannelStateCheckpointWriter.finishWriteAndResult (which can lead to data loss in a wrapping stream).

      2. Change the behavior of FsCheckpointStateOutputStream.flush.

      3. Wrap FsCheckpointStateOutputStream to prevent the flush.
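A minimal sketch of the third option, using plain java.io rather than Flink classes. NonFlushingStream, CountingStream, and NonFlushingDemo are hypothetical names; CountingStream merely stands in for the checkpoint stream so that flush calls can be counted. The wrapper forwards writes but swallows flush(), so a buffering stream on top can still be flushed (avoiding the data loss mentioned in option 1) without the flush reaching the underlying stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical wrapper: forwards writes, but never forwards flush() to the delegate. */
class NonFlushingStream extends FilterOutputStream {
    NonFlushingStream(OutputStream delegate) {
        super(delegate);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len); // bypass FilterOutputStream's byte-by-byte default
    }

    @Override
    public void flush() {
        // Intentionally a no-op: the delegate decides when to spill.
    }

    @Override
    public void close() throws IOException {
        out.close(); // close the delegate without flushing it first
    }
}

/** Stand-in for the checkpoint stream; counts how often flush() reaches it. */
class CountingStream extends ByteArrayOutputStream {
    int flushCalls;

    @Override
    public void flush() {
        flushCalls++;
    }
}

class NonFlushingDemo {
    static CountingStream run() {
        CountingStream delegate = new CountingStream();
        try (OutputStream out = new BufferedOutputStream(new NonFlushingStream(delegate))) {
            out.write(new byte[] {1, 2, 3});
            out.flush(); // empties the BufferedOutputStream, stops at the wrapper
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return delegate;
    }
}
```

After run(), the delegate holds all three bytes and has seen no flush() calls. Whether this or changing flush() itself (option 2) is preferable depends on whether other callers rely on the current flush behavior.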

            People

              Roman Khachatryan