  1. Kafka
  2. KAFKA-382

Write ordering guarantee violated



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: core
    • Labels:


      The guarantee is that if the producer sends X and then Y, the client sees X first and Y second, but this may not actually happen in 0.8. The cause is the combination of parallel I/O threads and the single queue in the network server. The current model is a single shared work queue plus one response queue per selector. The single work queue is great from a parallelism point of view (if one thread is blocked, another can do the work), but it actually breaks the ordering guarantee. Not sure how I missed this in the initial work.
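
To make the failure mode concrete, here is a minimal, deterministic sketch of two I/O threads draining one shared work queue. It is an illustration only, not Kafka code: the class name `SharedQueueReordering`, the method `completionOrder`, and the latches that force request X to stall until Y has finished are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration: one connection sends X then Y into a single
// shared work queue that is drained by two I/O threads.
final class SharedQueueReordering {

    /** Returns the order in which X and Y finish processing. */
    static List<String> completionOrder() {
        BlockingQueue<String> workQueue = new LinkedBlockingQueue<>();
        List<String> completed = new CopyOnWriteArrayList<>();
        CountDownLatch xTaken = new CountDownLatch(1);
        CountDownLatch yDone = new CountDownLatch(1);

        Runnable ioThread = () -> {
            try {
                String request = workQueue.take();
                if (request.equals("X")) {
                    xTaken.countDown();
                    yDone.await(); // stand-in for a slow step such as a blocking flush
                }
                completed.add(request);
                if (request.equals("Y")) {
                    yDone.countDown();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        Thread t1 = new Thread(ioThread);
        Thread t2 = new Thread(ioThread);
        try {
            workQueue.put("X");
            t1.start();
            t2.start();
            xTaken.await();     // X is dequeued before Y is even sent
            workQueue.put("Y"); // the other thread picks Y up and finishes first
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        System.out.println(completionOrder());
    }
}
```

Although X was dequeued first, the thread handling it stalls while the other thread completes Y, so Y is applied before X.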

      The reason for the single work queue was to avoid blocking a whole selector when one thread does a flush. But I wonder how relevant that still is. If the durability guarantee comes from replication, I think there is not much reason to have a blocking flush; we can rely on pdflush to do it in the background, so doing the write synchronously may be fine.

      I think the solution is to modify RequestChannel to have one work queue per I/O thread and hash into the work queue by connection id. In this solution a blocked I/O thread only blocks clients that hash onto it. This retains the current async model but gives up the property that a blocked thread blocks no one. (At first I thought we didn't need a RequestChannel at all any more and could just synchronously return zero or more requests from KafkaApis, but in reality, because of the possibility of request timeout from a background thread, this won't work.)
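
A rough sketch of that proposal, assuming integer connection ids and string payloads: every request from a given connection lands on the same per-thread queue, so a single I/O thread sees that connection's requests in arrival order. The class `PartitionedRequestChannel` and its methods are hypothetical simplifications and do not mirror the real RequestChannel signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one work queue per I/O thread, chosen by hashing
// the connection id. Not the actual Kafka RequestChannel.
final class PartitionedRequestChannel {
    private final List<BlockingQueue<String>> workQueues = new ArrayList<>();

    PartitionedRequestChannel(int numIoThreads) {
        for (int i = 0; i < numIoThreads; i++) {
            workQueues.add(new LinkedBlockingQueue<>());
        }
    }

    /** Maps a connection to the index of the I/O thread that owns it. */
    int queueIndexFor(int connectionId) {
        return Math.floorMod(Integer.hashCode(connectionId), workQueues.size());
    }

    /** Called by the network layer; preserves per-connection FIFO order. */
    void sendRequest(int connectionId, String request) {
        workQueues.get(queueIndexFor(connectionId)).add(request);
    }

    /** Called by I/O thread ioThreadId; returns null if its queue is empty. */
    String pollRequest(int ioThreadId) {
        return workQueues.get(ioThreadId).poll();
    }

    public static void main(String[] args) {
        PartitionedRequestChannel channel = new PartitionedRequestChannel(4);
        channel.sendRequest(7, "X");
        channel.sendRequest(7, "Y");
        int owner = channel.queueIndexFor(7);
        System.out.println(channel.pollRequest(owner)); // X
        System.out.println(channel.pollRequest(owner)); // Y
    }
}
```

The trade-off is the one described above: if the thread that owns a queue stalls, every connection hashed to that queue waits, while connections on the other queues are unaffected.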

      It would also be possible to be smarter still and attempt a non-blocking solution that only preserves the write-ordering guarantees. One solution would be as follows. Each request from a given connection would be assigned an increasing number, starting with 0, by the network layer. KafkaApis would keep a "last processed" number for each connection. Any request whose number is greater than the last processed number for its connection + 1 would be re-enqueued. I don't like this solution because it is more complex and because I don't think blocking flushes are needed now that we have replication (e.g. you can just turn on replication and rely on pdflush, which is async), so optimizing this case is not useful imo.
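
For comparison, the sequence-number alternative might look roughly like the single-threaded sketch below. All names (`SequencedDispatch`, `tag`, `tryProcess`, `drain`) are hypothetical, and the sketch assumes "last processed" starts at -1 so that request 0 is immediately eligible. A real version would need thread-safe state and would update "last processed" only once the write has actually completed.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-threaded sketch of the re-enqueue scheme.
final class SequencedDispatch {
    static final class Request {
        final int connectionId;
        final long seq;
        final String payload;

        Request(int connectionId, long seq, String payload) {
            this.connectionId = connectionId;
            this.seq = seq;
            this.payload = payload;
        }
    }

    private final Map<Integer, Long> nextSeq = new HashMap<>();       // network layer state
    private final Map<Integer, Long> lastProcessed = new HashMap<>(); // handler state

    /** Network layer: number each connection's requests 0, 1, 2, ... */
    Request tag(int connectionId, String payload) {
        long seq = nextSeq.merge(connectionId, 1L, Long::sum) - 1;
        return new Request(connectionId, seq, payload);
    }

    /** Handler: process only the next expected request; otherwise report false. */
    boolean tryProcess(Request r, List<String> applied) {
        long last = lastProcessed.getOrDefault(r.connectionId, -1L);
        if (r.seq > last + 1) {
            return false; // arrived too early; the caller re-enqueues it
        }
        applied.add(r.payload);
        lastProcessed.put(r.connectionId, r.seq);
        return true;
    }

    /** Drains the queue, re-enqueuing early requests. Assumes no sequence gaps. */
    List<String> drain(Deque<Request> queue) {
        List<String> applied = new ArrayList<>();
        while (!queue.isEmpty()) {
            Request r = queue.pollFirst();
            if (!tryProcess(r, applied)) {
                queue.addLast(r);
            }
        }
        return applied;
    }
}
```

If requests tagged 0, 1, 2 are picked up in the order 2, 1, 0, the first two are pushed back until request 0 has been applied, after which they go through in sequence. The extra bookkeeping and repeated re-enqueuing are part of the added complexity this approach carries.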




            • Assignee:
              jkreps Jay Kreps


              • Created: