Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      PostgreSQL uses fixed-size commitlog files that it pre-allocates (filling them with zeros), so "appending" to the log can use the cheaper fsync-without-metadata (a length change counts as "metadata"). Then, when a commitlog file is no longer needed, it "recycles" the file by renaming it to a higher number. Commitlog entries have an increasing id, so if you come to an out-of-sequence (earlier) id, you must have reached the end of the commitlog and are reading from the "recycled" part.
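
The scheme described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not PostgreSQL's or Cassandra's actual code: the class name, the 4096-byte segment size, and the 16-byte entry layout (8-byte id plus 8-byte payload) are all hypothetical. It uses FileChannel.force(false), which asks for file content to be synced without requiring a metadata update, and it stops reading at the first id that is not greater than the previous one.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class RecycledSegmentSketch {
    static final int SEGMENT_SIZE = 4096; // hypothetical fixed segment size
    static final int ENTRY_SIZE = 16;     // hypothetical layout: 8-byte id + 8-byte payload

    // Pre-allocate by writing zeros, so later writes never change the file length.
    static void preallocate(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer zeros = ByteBuffer.allocate(SEGMENT_SIZE);
            while (zeros.hasRemaining()) {
                ch.write(zeros);
            }
            ch.force(true); // pay for the metadata sync once, at allocation time
        }
    }

    // Overwrite entries from the start of the segment with ids firstId, firstId + 1, ...
    // Ids are assumed to start at 1, so 0 marks zero-filled, never-written space.
    static void writeEntries(Path path, long firstId, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.allocate(ENTRY_SIZE);
            for (int i = 0; i < count; i++) {
                buf.clear();
                buf.putLong(firstId + i).putLong(i);
                buf.flip();
                long pos = (long) i * ENTRY_SIZE;
                while (buf.hasRemaining()) {
                    pos += ch.write(buf, pos);
                }
            }
            ch.force(false); // the length did not change, so a data-only sync is enough
        }
    }

    // Count entries until the id stops increasing, which marks the end of the live log.
    static int countLiveEntries(Path path) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(path));
        long lastId = 0;
        int live = 0;
        while (buf.remaining() >= ENTRY_SIZE) {
            long id = buf.getLong();
            buf.getLong(); // skip the payload
            if (id <= lastId) {
                break; // zero-filled space or leftovers from before recycling
            }
            lastId = id;
            live++;
        }
        return live;
    }
}
```

If a segment holding ids 1 to 5 is reused for ids 100 to 102, the reader sees 100, 101, 102 and then the stale id 4, and stops there. The rename to a higher number is not shown; in Java it could be done with Files.move.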

        Activity

        Transition: Open to Resolved
        Time In Source Status: 725d 23h 7m
        Execution Times: 1
        Last Executer: Jonathan Ellis
        Last Execution Date: 06/Dec/11 04:43
        Gavin made changes -
        Workflow: patch-available, re-open possible [ 12749030 ] -> reopen-resolved, no closed status, patch-avail, testing [ 12754006 ]
        Gavin made changes -
        Workflow: no-reopen-closed, patch-avail [ 12484312 ] -> patch-available, re-open possible [ 12749030 ]
        Jonathan Ellis made changes -
        Status: Open [ 1 ] -> Resolved [ 5 ]
        Resolution Duplicate [ 3 ]
        Jonathan Ellis added a comment -

        moved the still-relevant multithreading part to CASSANDRA-3578

        Jonathan Ellis added a comment -

        CASSANDRA-3411 did the segment recycling described here.

        Jonathan Ellis added a comment -

        reserving space for each with a [AtomicInteger] first

        For an example of something similar, look at how SlabAllocator.Region.allocate uses this approach to reserve parts of a region for the ByteBuffers it allocates.
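
A minimal sketch of the reservation step, assuming a fixed-capacity segment. This is not the SlabAllocator code itself; the class and method names are hypothetical. Each writer claims a byte range with a compare-and-set loop on an AtomicInteger, so no two writers receive overlapping ranges and no lock is held.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SegmentReservation {
    private final int capacity;
    private final AtomicInteger nextFree = new AtomicInteger(0);

    SegmentReservation(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the reserved range, or -1 if it does not fit. */
    int reserve(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        while (true) {
            int oldOffset = nextFree.get();
            if (size > capacity - oldOffset) {
                return -1; // segment is full; the caller would switch to a new segment
            }
            if (nextFree.compareAndSet(oldOffset, oldOffset + size)) {
                return oldOffset; // this thread now owns [oldOffset, oldOffset + size)
            }
            // Another thread won the race; retry with the updated offset.
        }
    }
}
```

With a capacity of 32, successive calls reserve(10) and reserve(20) return offsets 0 and 10; a following reserve(5) returns -1 because only 2 bytes remain.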

        Jonathan Ellis added a comment -

        I think we can reasonably move "recycling" out of scope here and start with multithreaded CL append.

        And a correction to my earlier statement: sun.misc.Unsafe seems to be part of the JRE spec, despite the package name. The only difference I see between NBHM's usage and the "safe" Atomic* classes is that NBHM's target is not volatile. For our purposes here AtomicInteger should be fine.

        Jonathan Ellis made changes -
        Labels: gsoc -> gsoc gsoc2010
        Jonathan Ellis made changes -
        Summary: better commitlog performance -> Improve commitlog performance
        Labels gsoc
        Jonathan Ellis added a comment -

        Brian Aker adds that allowing multiple threads to modify the commitlog simultaneously (reserving space for each with a CAS first) can also improve performance. Presumably mmap-ing the commitlog segment is required for this [another thing that fixed-length segments would allow].

        (CAS isn't part of the JDK spec, but it is available for public use in the Sun jars. Cliff Click's high-scale-lib uses it extensively, so using it wouldn't be a new dependency for us.)

        http://wiki.apache.org/cassandra/ArchitectureCommitLog describes the current CommitLog design.
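
A rough sketch of how concurrent appends into an mmap-ed, fixed-length segment might look. It is an illustration under assumptions rather than the CommitLog implementation: the class name and the length-prefixed record format are hypothetical. Each thread reserves its range with a CAS and then copies its record into the mapping through its own duplicate of the buffer, so threads never share a buffer position.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

class MappedSegmentSketch {
    private final MappedByteBuffer buffer;
    private final AtomicInteger nextFree = new AtomicInteger(0);

    MappedSegmentSketch(Path path, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE grows the file to the fixed size if it is shorter.
            // The mapping stays valid after the channel is closed.
            buffer = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    /** Appends a length-prefixed record; returns its offset, or -1 if it does not fit. */
    int append(byte[] record) {
        int needed = Integer.BYTES + record.length;
        int offset;
        do {
            offset = nextFree.get();
            if (needed > buffer.capacity() - offset) {
                return -1; // caller would move on to the next segment
            }
        } while (!nextFree.compareAndSet(offset, offset + needed));

        // A duplicate shares the mapped content but has its own position.
        ByteBuffer view = buffer.duplicate();
        view.position(offset);
        view.putInt(record.length).put(record);
        return offset;
    }

    int used() {
        return nextFree.get();
    }

    void sync() {
        buffer.force(); // flush the mapped pages to disk
    }
}
```

One thing this sketch does not address is how a reader or a sync decides that a reserved range has been completely written; a real design would need some marker or checksum for that.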

        Jonathan Ellis made changes -
        Field Original Value New Value
        Fix Version/s 0.6 [ 12314361 ]
        Jonathan Ellis created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Jonathan Ellis
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved:
