  1. Flink
  2. FLINK-15979

Fix the inaccurate merged count in CountDistinctWithMerge


Details

    • Type: New Feature
    • Status: Reopened
    • Priority: Not a Priority
    • Resolution: Unresolved

    Description

      As discussed on the user mailing list: https://lists.apache.org/thread.html/rc4b06c9931656c94dc993b124da3ff00f04099e41201c64788936c24%40%3Cuser.flink.apache.org%3E

      The current implementation of org.apache.flink.table.runtime.utils.JavaUserDefinedAggFunctions.CountDistinctWithMerge#merge in the old planner is incorrect and produces a wrong merged count.

      The test that uses this UDAF (org.apache.flink.table.runtime.stream.table.GroupWindowITCase#testEventTimeSessionGroupWindowOverTime) does not expose the bug, because there are no distinct values in the test data.

      CountDistinctWithMerge is a test-only implementation, so this is not a critical problem. The Blink planner has a correct implementation: https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/plan/utils/JavaUserDefinedAggFunctions.java#L369
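
      For illustration only, here is a minimal sketch of a count-distinct merge that keeps the merged count accurate. It is not the Flink code: the class CountDistinctSketch, its Accum type, and the plain HashMap (standing in for Flink's map state) are all assumptions made for this example. The idea is that merge increments the distinct count only for keys the target accumulator has not seen yet, instead of adding the other accumulator's count wholesale.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a count-distinct UDAF; not the Flink class.
final class CountDistinctSketch {

    /** Accumulator: occurrences per value plus the number of distinct values. */
    static final class Accum {
        final Map<String, Integer> seen = new HashMap<>();
        long count = 0L;
    }

    /** Adds one value; the distinct count grows only for unseen values. */
    static void accumulate(Accum acc, String value) {
        if (value == null) {
            return; // ignore nulls in this sketch
        }
        Integer prev = acc.seen.get(value);
        if (prev == null) {
            acc.seen.put(value, 1);
            acc.count += 1;
        } else {
            acc.seen.put(value, prev + 1);
        }
    }

    /**
     * Merges other accumulators into acc. A key present on both sides
     * must not be counted twice, so count is incremented per new key
     * rather than by other.count.
     */
    static void merge(Accum acc, Iterable<Accum> others) {
        for (Accum other : others) {
            for (Map.Entry<String, Integer> e : other.seen.entrySet()) {
                Integer prev = acc.seen.get(e.getKey());
                if (prev == null) {
                    acc.seen.put(e.getKey(), e.getValue());
                    acc.count += 1;
                } else {
                    acc.seen.put(e.getKey(), prev + e.getValue());
                }
            }
        }
    }
}
```

      With one accumulator holding the values a and b and another holding b and c, this merge yields a distinct count of 3, whereas simply summing the two counts would give 4. A test data set needs such overlapping values across merged accumulators to tell the two behaviours apart.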

          People

            Assignee: Unassigned
            Reporter: Jark Wu (jark)
            Votes: 0
            Watchers: 4
