Flink / FLINK-13034

Improve the performance when checking whether mapstate is empty for RocksDBStateBackend


Details

    • We have added a new method MapState#isEmpty() which enables users to check whether a map state is empty. The new method is 40% faster than mapState.keys().iterator().hasNext() when using the RocksDB state backend.

    Description

      Currently, several places in the Flink source code check whether a map state is empty, e.g. TemporalRowTimeJoinOperator and AbstractRowTimeUnboundedPrecedingOver.
      Developers typically use the following statement for this check:

      boolean noRecordsToProcess = !inputState.keys().iterator().hasNext();
      

      However, with RocksDBStateBackend, inputState.keys().iterator().hasNext() actually triggers 1 seek and 128 next calls in RocksDBMapState. Only the seek is needed to decide whether the state is empty; the 128 next calls are redundant.
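
      To make the cost difference concrete, here is a minimal, self-contained sketch. It does not use RocksDB or Flink: CountingStore is a hypothetical stand-in for a sorted key-value store that counts seek and next calls, and the batch size of 128 is taken from the description above. The two helper methods are illustrative names, not the actual RocksDBMapState code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefetchCostSketch {
    // Batch size quoted in the issue description.
    static final int CACHE_SIZE = 128;

    /** Hypothetical stand-in for a sorted store; it only counts cursor operations. */
    static class CountingStore {
        private final NavigableMap<String, String> data = new TreeMap<>();
        private Iterator<String> cursor;
        private String current;
        int seeks;
        int nexts;

        void put(String key, String value) { data.put(key, value); }

        // Position the cursor on the first key >= from.
        void seek(String from) {
            seeks++;
            cursor = data.tailMap(from, true).keySet().iterator();
            current = cursor.hasNext() ? cursor.next() : null;
        }

        boolean isValid() { return current != null; }

        String key() { return current; }

        // Advance the cursor by one entry.
        void next() {
            nexts++;
            current = cursor.hasNext() ? cursor.next() : null;
        }
    }

    /** Emptiness check via an iterator that eagerly loads a whole batch of entries. */
    static boolean hasNextWithPrefetch(CountingStore store, String prefix) {
        List<String> cache = new ArrayList<>();
        store.seek(prefix);
        while (store.isValid() && store.key().startsWith(prefix) && cache.size() < CACHE_SIZE) {
            cache.add(store.key());
            store.next();
        }
        return !cache.isEmpty();
    }

    /** Emptiness check that looks only at the first entry after the seek. */
    static boolean isEmptyWithSingleSeek(CountingStore store, String prefix) {
        store.seek(prefix);
        return !(store.isValid() && store.key().startsWith(prefix));
    }

    static CountingStore filledStore(int entries) {
        CountingStore store = new CountingStore();
        for (int i = 0; i < entries; i++) {
            store.put(String.format("state#%04d", i), "v" + i);
        }
        return store;
    }

    public static void main(String[] args) {
        CountingStore a = filledStore(1000);
        hasNextWithPrefetch(a, "state#");
        System.out.println("prefetch:    seeks=" + a.seeks + " nexts=" + a.nexts);

        CountingStore b = filledStore(1000);
        isEmptyWithSingleSeek(b, "state#");
        System.out.println("single seek: seeks=" + b.seeks + " nexts=" + b.nexts);
    }
}
```

      In this toy model the prefetching check pays for every entry it loads into the batch, while the dedicated check stops right after the seek.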

      I see two options to improve this:

      • Revert RocksDBMapState to its previous design, which first loads one element and then loads more elements in follow-up queries. However, this would affect the performance of other map state methods.
      • Add an isEmpty() method to the public evolving interface MapState, so that we can check whether the map state is empty without any redundant RocksDB actions.

      I prefer the 2nd option.
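
      As a rough sketch of what the 2nd option means for calling code: SimpleMapState below is a hypothetical, trimmed-down stand-in for Flink's MapState (it omits the `throws Exception` clauses that the real interface methods declare, to keep the example short), and HeapMapState is an illustrative in-memory implementation. Neither is the actual Flink code.

```java
import java.util.HashMap;
import java.util.Map;

public class IsEmptySketch {
    /** Hypothetical stand-in for MapState, reduced to the methods used here. */
    interface SimpleMapState<K, V> {
        void put(K key, V value);
        Iterable<K> keys();
        boolean isEmpty(); // the method proposed in this issue
    }

    /** Illustrative in-memory implementation backed by a HashMap. */
    static class HeapMapState<K, V> implements SimpleMapState<K, V> {
        private final Map<K, V> map = new HashMap<>();

        @Override
        public void put(K key, V value) { map.put(key, value); }

        @Override
        public Iterable<K> keys() { return map.keySet(); }

        @Override
        public boolean isEmpty() { return map.isEmpty(); }
    }

    // Existing pattern: build a key iterator just to ask whether anything is there.
    static <K, V> boolean noRecordsOld(SimpleMapState<K, V> inputState) {
        return !inputState.keys().iterator().hasNext();
    }

    // Proposed pattern: let the state backend answer the question directly.
    static <K, V> boolean noRecordsNew(SimpleMapState<K, V> inputState) {
        return inputState.isEmpty();
    }

    public static void main(String[] args) {
        HeapMapState<Long, String> inputState = new HeapMapState<>();
        System.out.println(noRecordsOld(inputState) + " " + noRecordsNew(inputState));
        inputState.put(1L, "row");
        System.out.println(noRecordsOld(inputState) + " " + noRecordsNew(inputState));
    }
}
```

      Both helpers return the same answer; the point of the dedicated method is that each state backend can implement it in whatever way is cheapest for that backend.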

       

            People

              yunta Yun Tang