Directory ApacheDS / DIRSERVER-1655

Possible incorrect insertion of modifications in the consumer log

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 2.0.0-M2
    • Fix Version/s: 2.0.0-M7, 2.0.0-M8
    • Component/s: None
    • Labels: None

      Description

      The way we process modifications in the EventInterceptor, creating a new thread, makes it possible for modifications to be inserted into a consumer log in the wrong order.

      A possible solution could be to use the same thread to insert modifications into the log.

        Activity

        Emmanuel Lecharny added a comment -

        Not an issue in trunk

        Emmanuel Lecharny added a comment -

        In fact, the key is to know whether we accept the idea that modifications are done sequentially, not in parallel. IMO, that is a very strong constraint.

        Emmanuel Lecharny added a comment -

        CSNs aren't consecutive: you may insert a CSN between two existing ones.

        By consecutive, I mean that you can immediately tell whether a CSN directly follows the preceding one or not.

        Selcuk Aya added a comment - edited

        CSNs are consecutive and the journal keeps the entries sorted, but the log entries are not inserted in sorted order.

        I would still go with something like this:

        ////////////////////////////////
        or maybe do CSN creation under a lock and notify all consumer logs under this lock
        ///////////////////////
        notify replication event interceptor of CSN creation -> get log locks
        generate CSN
        insert an entry with key CSN and value null
        notify replication event interceptor of CSN creation -> release log locks

        do modification

        notify replication event interceptor of modification
        update the entry with key CSN with the real log entry

        When the consumer log thread reads off this log in increasing CSN order, it will wait for a null entry to become non-null.
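
The placeholder scheme described in this comment could be sketched roughly as follows. This is only an illustration, not ApacheDS code: `PlaceholderLog`, `reserve`, `fill` and `drain` are hypothetical names, the CSN is reduced to a plain counter, and an incomplete `CompletableFuture` stands in for the "null" value (a `ConcurrentSkipListMap` does not accept null values).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical consumer log that reserves a slot per CSN before the modification runs. */
class PlaceholderLog {
    // Stand-in for a real CSN: a simple increasing counter (assumption).
    private final AtomicLong csnSource = new AtomicLong();
    // Sorted by CSN; an incomplete future plays the role of the "null" value.
    private final ConcurrentSkipListMap<Long, CompletableFuture<String>> slots =
            new ConcurrentSkipListMap<>();
    private final Object csnLock = new Object();

    /** Generate the CSN and insert its placeholder under one lock. */
    long reserve() {
        synchronized (csnLock) {
            long csn = csnSource.incrementAndGet();
            slots.put(csn, new CompletableFuture<>());
            return csn;
        }
    }

    /** Replace the placeholder with the real log entry once the modification is done. */
    void fill(long csn, String entry) {
        slots.get(csn).complete(entry);
    }

    /** Read entries in increasing CSN order, blocking on any slot that is still empty. */
    List<String> drain() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, CompletableFuture<String>> e : slots.entrySet()) {
            out.add(e.getValue().join());
        }
        return out;
    }
}
```

The lock is held only while the CSN is generated and the placeholder is inserted, not while the modification runs. One open point in such a design is what to do with a slot whose modification fails: in this sketch the reader would block on it forever.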

        Kiran Ayyagari added a comment -

        CSNs should be consecutive, and I am sure they are, because we include the timestamp; on top of that, we support the changeCount in case we create another CSN at the same time.
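
To make the timestamp/changeCount point concrete, here is a minimal sketch of a CSN reduced to those two fields. `SketchCsn` and `SketchCsnFactory` are hypothetical names, the real CSN carries more information than shown here, and the timestamp is passed in explicitly so the behaviour is deterministic.

```java
/** Simplified, hypothetical CSN: only a timestamp and a change count. */
final class SketchCsn implements Comparable<SketchCsn> {
    final long timestamp;
    final int changeCount;

    SketchCsn(long timestamp, int changeCount) {
        this.timestamp = timestamp;
        this.changeCount = changeCount;
    }

    /** Order by timestamp first, then by changeCount. */
    @Override
    public int compareTo(SketchCsn other) {
        int byTime = Long.compare(timestamp, other.timestamp);
        return byTime != 0 ? byTime : Integer.compare(changeCount, other.changeCount);
    }
}

/** Hypothetical factory: bumps changeCount when two CSNs share a timestamp. */
final class SketchCsnFactory {
    private long lastTimestamp = -1;
    private int changeCount = 0;

    synchronized SketchCsn next(long now) {
        if (now == lastTimestamp) {
            changeCount++;          // same instant: disambiguate with the counter
        } else {
            lastTimestamp = now;    // new instant: counter starts over
            changeCount = 0;
        }
        return new SketchCsn(now, changeCount);
    }
}
```

Under these assumptions any two CSNs are strictly ordered, which is what a sorted journal needs. What the ordering alone does not tell a reader is whether another CSN exists between two it has already seen, for example a second CSN created at the same timestamp.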

        Emmanuel Lecharny added a comment -

        IMO, it's not enough to rely on the fact that the journal is sorted, because CSNs are not consecutive. We really need to guarantee that the modifications stored in the log are consecutive.

        One solution could be to add a transient field to the CSN class, containing the order of creation (incremented by the CsnFactory).
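
A rough sketch of that idea, assuming a wrapper around an opaque CSN string. `OrderedCsn`, `OrderedCsnFactory` and `directlyFollows` are hypothetical names, not the actual ApacheDS classes.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical CSN wrapper carrying a non-serialized creation order. */
final class OrderedCsn {
    final String csn;                    // serialized CSN value, opaque in this sketch
    final transient long creationOrder;  // local bookkeeping only, never sent to a peer

    OrderedCsn(String csn, long creationOrder) {
        this.csn = csn;
        this.creationOrder = creationOrder;
    }

    /** True when 'next' was created immediately after 'previous'. */
    static boolean directlyFollows(OrderedCsn previous, OrderedCsn next) {
        return next.creationOrder == previous.creationOrder + 1;
    }
}

/** Hypothetical factory that numbers CSNs in the order it creates them. */
final class OrderedCsnFactory {
    private final AtomicLong counter = new AtomicLong();

    // synchronized so that, in a fuller version where the CSN value is also
    // generated here, the value and its order number are assigned together.
    synchronized OrderedCsn newCsn(String csnValue) {
        return new OrderedCsn(csnValue, counter.incrementAndGet());
    }
}
```

With such a field, a log can tell whether it is missing an entry between two CSNs, which the CSN ordering alone cannot reveal.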

        Kiran Ayyagari added a comment -

        This delay of events can happen at any level before reaching the replication interceptor. IMO the best option is to let the journal keep them in sorted order (this CSN-based sort order is currently maintained in the existing JDBM-based journal implementation).

        Emmanuel Lecharny added a comment -

        The problem is that the CSN is created way before the modification is inserted. Locking the log could last for a (relatively) long time. The other problem is that we have no clue about which log we should lock before processing the replication filter, so we may have to blindly lock all the consumer logs.

        Can't we use a mechanism where each thread acquires a unique number, which will be used by the consumer log when it processes the modifications? Something like:

        get an order number from the log
        do the modification
        post modification, push the modification to the log system
        the modification is inserted into the log if all the previous numbers have been processed

        i.e. if a mod has a number N, then it can only be inserted into the log if the N-1, N-2, ... mods have already been processed. The log will keep the latest N it has processed.
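
The order-number mechanism could look something like the sketch below. `SequencedLog`, `acquireTicket` and `push` are hypothetical names, and modifications are plain strings. A modification that arrives early is parked until every lower number has been appended.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical log that appends modifications strictly in order-number sequence. */
class SequencedLog {
    private long nextTicket = 1;     // next order number to hand out
    private long lastProcessed = 0;  // latest N that has been appended to the log
    private final Map<Long, String> pending = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    /** Step 1: get an order number before doing the modification. */
    synchronized long acquireTicket() {
        return nextTicket++;
    }

    /** Step 3: push the finished modification; it is appended once N-1, N-2, ... are in. */
    synchronized void push(long ticket, String modification) {
        pending.put(ticket, modification);
        String next;
        // Append every parked modification whose predecessors have all been processed.
        while ((next = pending.remove(lastProcessed + 1)) != null) {
            log.add(next);
            lastProcessed++;
        }
    }

    synchronized List<String> snapshot() {
        return new ArrayList<>(log);
    }
}
```

The lock is held only for the short bookkeeping steps, never for the duration of the modification. A number that is acquired but never pushed (failed modification, or one dropped by the replication filter) would stall everything behind it in this sketch, so a real implementation would need a way to release it.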

        Selcuk Aya added a comment -

        I have thought about this issue a little more after discussing it. The gist of it is that we do not insert the updates into the consumer log in CSN order, and this might cause updates to be skipped when sending them to the consumer. Inserting the modification into the consumer log using the same thread is not enough to solve this problem, because the CSN for a modification seems to be obtained without any synchronization with regard to insertion into the consumer log. So if two threads update two different entries at CSN 9 and CSN 10, for example, these two modifications can be inserted into the consumer log in any order.
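
The skipped-update scenario can be reproduced deterministically with a small simulation. It assumes, purely for illustration, that the sender keeps a cursor holding the last CSN it sent and only looks at strictly greater CSNs afterwards. `SkippedUpdateDemo` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

/** Simulates a sorted consumer log that receives CSN 10 before CSN 9. */
class SkippedUpdateDemo {
    private final ConcurrentSkipListMap<Long, String> consumerLog = new ConcurrentSkipListMap<>();
    private final List<Long> sent = new ArrayList<>();
    private long lastSent = 0; // cursor: highest CSN already sent (assumed design)

    /** Send everything strictly newer than the cursor, in CSN order. */
    private void sendNewer() {
        for (Long csn : consumerLog.tailMap(lastSent, false).keySet()) {
            sent.add(csn);
            lastSent = csn;
        }
    }

    static List<Long> run() {
        SkippedUpdateDemo demo = new SkippedUpdateDemo();
        demo.consumerLog.put(10L, "mod at CSN 10"); // second thread wins the race
        demo.sendNewer();                           // cursor moves to 10
        demo.consumerLog.put(9L, "mod at CSN 9");   // first thread inserts late
        demo.sendNewer();                           // nothing newer than 10 is found
        return demo.sent;
    }
}
```

Even though the log itself ends up sorted, the modification at CSN 9 lands behind the cursor and is never sent.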

        I think, for now, the solution would be to guarantee an execution order like this, using a log lock:
        notify replicationeventlistener of the modification before the CSN is obtained and the modification is done -> replicationeventlistener gets a lock for the log
        get CSN
        do the modification
        notify replicationeventlistener after the modification -> replicationeventlistener inserts into the log and releases the log lock.
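
As a minimal sketch of this execution order, assuming a single log, a counter in place of a real CSN, and a `Runnable` in place of the modification (`LockedReplicationLog` and `apply` are hypothetical names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical log where CSN creation, modification and log insert share one lock. */
class LockedReplicationLog {
    private final ReentrantLock logLock = new ReentrantLock();
    private final AtomicLong csnSource = new AtomicLong(); // stand-in for a CSN factory
    private final List<Long> log = new ArrayList<>();

    void apply(Runnable modification) {
        logLock.lock();                              // "before" notification: take the log lock
        try {
            long csn = csnSource.incrementAndGet();  // get CSN
            modification.run();                      // do the modification
            log.add(csn);                            // "after" notification: insert into the log
        } finally {
            logLock.unlock();                        // release even if the modification fails
        }
    }

    List<Long> snapshot() {
        logLock.lock();
        try {
            return new ArrayList<>(log);
        } finally {
            logLock.unlock();
        }
    }
}
```

Because the lock spans the whole modification, entries reach the log in CSN order, at the cost of serializing all modifications that go through this log.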


          People

          • Assignee: Unassigned
          • Reporter: Emmanuel Lecharny
          • Votes: 0
          • Watchers: 1
