Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-6656

Migrate CEP PriorityQueue to MapState

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: CEP
    • Labels:
      None

      Issue Links

        Activity

        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user kl0u opened a pull request:

        https://github.com/apache/flink/pull/3961

        FLINK-6656 [cep] Change element PriorityQueue to MapState.

        This is to leverage the fact that RocksDB already returns the
        keys sorted. So now elements, instead of being stores in a PQ
        and all of them being deserialized and serialized at each incoming
        element, the are stored in a MapState with the key being the
        timestamp and the value, a List of elements that refer to the
        same timestamp.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/kl0u/flink cep-map-state-pr

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/flink/pull/3961.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #3961



        Show
        githubbot ASF GitHub Bot added a comment - GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/3961 FLINK-6656 [cep] Change element PriorityQueue to MapState. This is to leverage the fact that RocksDB already returns the keys sorted. So now elements, instead of being stores in a PQ and all of them being deserialized and serialized at each incoming element, the are stored in a MapState with the key being the timestamp and the value, a List of elements that refer to the same timestamp. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink cep-map-state-pr Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3961.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3961
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user aljoscha commented on the issue:

        https://github.com/apache/flink/pull/3961

        This looks good! Why is the element processing reordered in places in the tests, though?

        Show
        githubbot ASF GitHub Bot added a comment - Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/3961 This looks good! Why is the element processing reordered in places in the tests, though?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user kl0u commented on the issue:

        https://github.com/apache/flink/pull/3961

        The re-ordering is just to verify that at watermark time, we actually order
        by event-time.

        On May 22, 2017 2:57 PM, "Aljoscha Krettek" <notifications@github.com>
        wrote:

        > This looks good! Why is the element processing reordered in places in the
        > tests, though?
        >
        > —
        > You are receiving this because you authored the thread.
        > Reply to this email directly, view it on GitHub
        > <https://github.com/apache/flink/pull/3961#issuecomment-303092277>, or mute
        > the thread
        > <https://github.com/notifications/unsubscribe-auth/ACS1qPxn5NjNu9Z9HWNxNf5klLZau-fdks5r8YY8gaJpZM4NiQwf>
        > .
        >

        Show
        githubbot ASF GitHub Bot added a comment - Github user kl0u commented on the issue: https://github.com/apache/flink/pull/3961 The re-ordering is just to verify that at watermark time, we actually order by event-time. On May 22, 2017 2:57 PM, "Aljoscha Krettek" <notifications@github.com> wrote: > This looks good! Why is the element processing reordered in places in the > tests, though? > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub > < https://github.com/apache/flink/pull/3961#issuecomment-303092277 >, or mute > the thread > < https://github.com/notifications/unsubscribe-auth/ACS1qPxn5NjNu9Z9HWNxNf5klLZau-fdks5r8YY8gaJpZM4NiQwf > > . >
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dawidwys commented on the issue:

        https://github.com/apache/flink/pull/3961

        Just a question from my side to understand this change better. Does it really leverage the fact of keys being returned in order? To evaluate `getSortedTimestamps` you need to iterate whole keySet anyway.

        I see though it benefits from laziness of value deserialization and partitioning of the incoming events (by timestamp).

        Is my understanding correct?

        Show
        githubbot ASF GitHub Bot added a comment - Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/3961 Just a question from my side to understand this change better. Does it really leverage the fact of keys being returned in order? To evaluate `getSortedTimestamps` you need to iterate whole keySet anyway. I see though it benefits from laziness of value deserialization and partitioning of the incoming events (by timestamp). Is my understanding correct?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user kl0u commented on the issue:

        https://github.com/apache/flink/pull/3961

        Hi @dawidwys ! Your understanding is correct in the sense that it does not leverage to the fullest the fact of keys being returned in order. The reason is that this is valid only in RocksDB state backend.

        But when it comes to insertion to the PriorityQueue, this can come in handy as inserting will be fast. In addition, as you said, we only deserialize the values we need, when we need them, instead of everything all the time.

        Show
        githubbot ASF GitHub Bot added a comment - Github user kl0u commented on the issue: https://github.com/apache/flink/pull/3961 Hi @dawidwys ! Your understanding is correct in the sense that it does not leverage to the fullest the fact of keys being returned in order. The reason is that this is valid only in RocksDB state backend. But when it comes to insertion to the PriorityQueue, this can come in handy as inserting will be fast. In addition, as you said, we only deserialize the values we need, when we need them, instead of everything all the time.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dawidwys commented on the issue:

        https://github.com/apache/flink/pull/3961

        Thanks for clarification!

        Z pozdrowieniami! / Cheers!

        Dawid Wysakowicz

        Data/Software Engineer

        Skype: dawid_wys | Twitter: @OneMoreCoder

        <http://getindata.com/>

        2017-05-22 18:01 GMT+02:00 Kostas Kloudas <notifications@github.com>:

        > Hi @dawidwys <https://github.com/dawidwys> ! Your understanding is
        > correct in the sense that it does not leverage to the fullest the fact of
        > keys being returned in order. The reason is that this is valid only in
        > RocksDB state backend.
        >
        > But when it comes to insertion to the PriorityQueue, this can come in
        > handy as inserting will be fast. In addition, as you said, we only
        > deserialize the values we need, when we need them, instead of everything
        > all the time.
        >
        > —
        > You are receiving this because you were mentioned.
        > Reply to this email directly, view it on GitHub
        > <https://github.com/apache/flink/pull/3961#issuecomment-303143858>, or mute
        > the thread
        > <https://github.com/notifications/unsubscribe-auth/AF8_0_gLnOlm1xuG-VHVJTnEJx9KxVpSks5r8bFogaJpZM4NiQwf>
        > .
        >

        Show
        githubbot ASF GitHub Bot added a comment - Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/3961 Thanks for clarification! Z pozdrowieniami! / Cheers! Dawid Wysakowicz Data/Software Engineer Skype: dawid_wys | Twitter: @OneMoreCoder < http://getindata.com/ > 2017-05-22 18:01 GMT+02:00 Kostas Kloudas <notifications@github.com>: > Hi @dawidwys < https://github.com/dawidwys > ! Your understanding is > correct in the sense that it does not leverage to the fullest the fact of > keys being returned in order. The reason is that this is valid only in > RocksDB state backend. > > But when it comes to insertion to the PriorityQueue, this can come in > handy as inserting will be fast. In addition, as you said, we only > deserialize the values we need, when we need them, instead of everything > all the time. > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > < https://github.com/apache/flink/pull/3961#issuecomment-303143858 >, or mute > the thread > < https://github.com/notifications/unsubscribe-auth/AF8_0_gLnOlm1xuG-VHVJTnEJx9KxVpSks5r8bFogaJpZM4NiQwf > > . >
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user rmetzger commented on the issue:

        https://github.com/apache/flink/pull/3961

        @kl0u asked me to merge this PR to the release-1.3 branch. It seems the change is good to merge.

        Show
        githubbot ASF GitHub Bot added a comment - Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/3961 @kl0u asked me to merge this PR to the release-1.3 branch. It seems the change is good to merge.
        Show
        rmetzger Robert Metzger added a comment - Merged to 1.3 in http://git-wip-us.apache.org/repos/asf/flink/commit/2bd082f8
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/flink/pull/3961

        Show
        githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/3961
        Hide
        kkl0u Kostas Kloudas added a comment -

        Merged with 546e2ad73165b00d93c3c460372b9d49a4b5d8b7

        Show
        kkl0u Kostas Kloudas added a comment - Merged with 546e2ad73165b00d93c3c460372b9d49a4b5d8b7

          People

          • Assignee:
            kkl0u Kostas Kloudas
            Reporter:
            kkl0u Kostas Kloudas
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development