  1. Flink
  2. FLINK-8413

Snapshot state of aggregated data is not maintained in Flink's checkpointing

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 1.3.2
    • None
    • None

    Description

      We have a project that consumes events from Kafka, performs a groupBy within a 5-minute time window, and, once the window elapses, pushes the events downstream for merging. The project is deployed on Flink, and we have enabled checkpointing to recover from failures.

      (window size: 5 min, checkpointing interval: 5 min, state.backend: filesystem)

      Kafka offsets are checkpointed every 5 minutes (the checkpointing interval). The event offsets are checkpointed before the entire DAG (groupBy and merge) has finished. So when a task manager restarts, the new task starts from the last successful checkpoint, but we are unable to get the aggregated snapshot data (the data from the groupBy task) from the persisted checkpoint.

      We can retrieve the last successfully checkpointed Kafka offset, but we cannot get the aggregated data accumulated up to that checkpoint.
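
      The expectation described above is that a single checkpoint captures the source offset together with the partial window aggregate, so that a restart resumes from both. A minimal plain-Java sketch of that idea follows. It is a toy model, not Flink API: the class CheckpointSketch, its methods, and the in-memory "log" standing in for a Kafka partition are all hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (NOT Flink API): one checkpoint = source offset + in-flight aggregate. */
class CheckpointSketch {

    /** Immutable snapshot holding the next offset to read and a copy of the counts so far. */
    static final class Snapshot {
        final int nextOffset;
        final Map<String, Integer> counts;

        Snapshot(int nextOffset, Map<String, Integer> counts) {
            this.nextOffset = nextOffset;
            this.counts = new TreeMap<>(counts); // defensive copy, isolated from later updates
        }
    }

    private int nextOffset = 0;                                  // stands in for the Kafka offset
    private final Map<String, Integer> counts = new TreeMap<>(); // stands in for the groupBy state

    /** Consume events in [nextOffset, upTo) and fold them into the per-key counts. */
    void consume(List<String> log, int upTo) {
        while (nextOffset < upTo) {
            counts.merge(log.get(nextOffset), 1, Integer::sum);
            nextOffset++;
        }
    }

    /** Capture offset and aggregate together, as one consistent unit. */
    Snapshot snapshot() {
        return new Snapshot(nextOffset, counts);
    }

    /** Rebuild a job from a snapshot: both the offset and the aggregate are restored. */
    static CheckpointSketch restore(Snapshot s) {
        CheckpointSketch job = new CheckpointSketch();
        job.nextOffset = s.nextOffset;
        job.counts.putAll(s.counts);
        return job;
    }

    Map<String, Integer> result() {
        return new TreeMap<>(counts);
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "b", "a", "c", "a");

        CheckpointSketch job = new CheckpointSketch();
        job.consume(log, 3);              // window is only partly filled
        Snapshot cp = job.snapshot();     // checkpoint taken mid-window
        job.consume(log, 4);              // progress after the checkpoint, lost on failure

        CheckpointSketch recovered = restore(cp); // simulated restart
        recovered.consume(log, log.size());       // replay from the checkpointed offset
        System.out.println(recovered.result());
    }
}
```

      In this model the events read after the checkpoint are replayed from the restored offset, and the counts resume from the restored partial aggregate rather than from zero, so nothing is lost or counted twice.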

            People

              Assignee: Unassigned
              Reporter: suganya (suganyap)
              Votes: 0
              Watchers: 4
