Uploaded image for project: 'Samza'
  1. Samza
  2. SAMZA-658

Iterator.remove breaks caching layer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.0, 0.9.1
    • Component/s: kv
    • Labels:

      Description

      If you get an iterator from the underlying layer from any store (InMemory), we just return the underlying iterator. Upon invoking remove operation on the iterator, we just delete the value from the underlying store.

      InMemory:

            override def remove(): Unit = iter.remove()
      

      This removes it from the store, but the cache still has the value and returns a value when requested for the key in the subsequent calls for the key. When the iterator.remove() is invoked, we need make sure that all the layers remove the key before we remove it from the underlying store.

      RocksDB doesn't support remove on iterator, so it's probably only for inmemory store we have this issue.

          def remove() = throw new UnsupportedOperationException("RocksDB iterator doesn't support remove")
      
      1. samza-658.0.9.1.patch
        19 kB
        Yi Pan (Data Infrastructure)
      2. SAMZA-658.v3.patch
        21 kB
        Guozhang Wang
      3. SAMZA-658.v4.patch
        21 kB
        Guozhang Wang

        Activity

        Hide
        guozhang Guozhang Wang added a comment -

        Naveen which caching layer are you referring to? I would like to take on this JIRA to better understand the storage design.

        Show
        guozhang Guozhang Wang added a comment - Naveen which caching layer are you referring to? I would like to take on this JIRA to better understand the storage design.
        Hide
        naveenatceg Naveen Somasundaram added a comment -

        Hey Guozhang Wang, sure, refer to CachedStore.scala. Every access to the underlying store will go through this Cache.

        Show
        naveenatceg Naveen Somasundaram added a comment - Hey Guozhang Wang , sure, refer to CachedStore.scala. Every access to the underlying store will go through this Cache.
        Hide
        guozhang Guozhang Wang added a comment -
        Show
        guozhang Guozhang Wang added a comment - Created RB: https://reviews.apache.org/r/33761/
        Hide
        closeuris Yan Fang added a comment -

        Any way to test this fix?

        Show
        closeuris Yan Fang added a comment - Any way to test this fix?
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        Overall looks good. I had some comments to RB. Thanks!

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - Overall looks good. I had some comments to RB. Thanks!
        Hide
        guozhang Guozhang Wang added a comment -

        Yan Fang This bug was reported by some users at LI using iterator.remove, we can verify with him by extracting his program logic.

        Show
        guozhang Guozhang Wang added a comment - Yan Fang This bug was reported by some users at LI using iterator.remove, we can verify with him by extracting his program logic.
        Hide
        closeuris Yan Fang added a comment -

        Yes, that works. Also, I think, as Yi Pan mentioned, a unit test helps as well. Thanks.

        Show
        closeuris Yan Fang added a comment - Yes, that works. Also, I think, as Yi Pan mentioned, a unit test helps as well. Thanks.
        Hide
        guozhang Guozhang Wang added a comment -

        Updated https://reviews.apache.org/r/33761/ addressing Yi's comments.

        Show
        guozhang Guozhang Wang added a comment - Updated https://reviews.apache.org/r/33761/ addressing Yi's comments.
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        Guozhang Wang, could you upload your patch to the JIRA? Thanks!

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - Guozhang Wang , could you upload your patch to the JIRA? Thanks!
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        Also, I tried to apply the latest patch on top of master and it seems that we need to re-base the patch. Thanks!

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - Also, I tried to apply the latest patch on top of master and it seems that we need to re-base the patch. Thanks!
        Hide
        guozhang Guozhang Wang added a comment -

        Updated https://reviews.apache.org/r/33761/ for some minor fixes.

        Show
        guozhang Guozhang Wang added a comment - Updated https://reviews.apache.org/r/33761/ for some minor fixes.
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        +1 on RB. Thanks!

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - +1 on RB. Thanks!
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        Merged and committed.

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - Merged and committed.
        Hide
        guozhang Guozhang Wang added a comment -
        Show
        guozhang Guozhang Wang added a comment - Thanks Yi Pan (Data Infrastructure) !
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        Re-opening for 0.9.1 release

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - Re-opening for 0.9.1 release
        Hide
        jghoman Jakob Homan added a comment -

        +1 on backport.

        Show
        jghoman Jakob Homan added a comment - +1 on backport.
        Hide
        nickpan47 Yi Pan (Data Infrastructure) added a comment -

        merged and committed to 0.9.1.

        Thanks!

        Show
        nickpan47 Yi Pan (Data Infrastructure) added a comment - merged and committed to 0.9.1. Thanks!

          People

          • Assignee:
            guozhang Guozhang Wang
            Reporter:
            nsomasun Naveen
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development