Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      PostgreSQL uses fixed-size commitlog files that it pre-allocates (filling them with zeros) so that "appending" to the log can use the cheaper fsync-without-metadata (a change in file length counts as "metadata"). Then, when a commitlog file is no longer needed, it "recycles" the file by renaming it to a higher number. Commitlog entries have an increasing id, so if you come to an out-of-sequence (earlier) id, you must have reached the end of the commitlog and are reading from the "recycled" part.
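The end-of-log check described above could be sketched roughly as follows. This is only an illustration, not PostgreSQL's or Cassandra's actual format: the class name `RecycledSegmentScan`, the entry layout (8-byte id, 4-byte length, payload), and the rule that ids start at 1 (so zero fill reads as "no entry") are all assumptions made for the example. A real format would also need a checksum per entry, since new entries of a different size can leave the scan positioned in the middle of a stale one.

```java
import java.nio.ByteBuffer;

final class RecycledSegmentScan {
    // Hypothetical entry layout: 8-byte id, 4-byte payload length, then the payload.
    static final int HEADER_BYTES = Long.BYTES + Integer.BYTES;

    /** Writes one entry at the buffer's current position. Ids are assumed to be >= 1. */
    static void append(ByteBuffer segment, long id, byte[] payload) {
        segment.putLong(id).putInt(payload.length).put(payload);
    }

    /**
     * Counts entries from the start of the segment, stopping at the first id
     * that is not greater than the previous one. That id is either zero fill
     * from pre-allocation or a stale entry left over from before recycling.
     */
    static int countLiveEntries(ByteBuffer segment) {
        ByteBuffer in = segment.duplicate(); // independent position, shared content
        in.clear();                          // position = 0, limit = capacity
        long previousId = 0;
        int count = 0;
        while (in.remaining() >= HEADER_BYTES) {
            long id = in.getLong();
            int length = in.getInt();
            if (id <= previousId) {
                break; // out-of-sequence id: end of the live log
            }
            if (length < 0 || length > in.remaining()) {
                break; // header does not describe a complete entry
            }
            in.position(in.position() + length); // skip the payload
            previousId = id;
            count++;
        }
        return count;
    }
}
```

Because the segment has a fixed size and is never truncated, the scan needs no separate "end of log" marker; the id ordering alone tells the reader where to stop.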

        Activity

        Jonathan Ellis added a comment -

        moved the still-relevant multithreading part to CASSANDRA-3578

        Jonathan Ellis added a comment -

        CASSANDRA-3411 did the segment recycling described here.

        Jonathan Ellis added a comment -

        reserving space for each with a [AtomicInteger] first

        For an example of something similar, look at how SlabAllocator.Region.allocate uses this approach to reserve parts of a region for the ByteBuffers it allocates.
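A minimal sketch of that reservation idea, assuming a single fixed-size in-memory segment. The class `SegmentReservation` and its methods are hypothetical names for illustration and are not taken from SlabAllocator or the Cassandra commitlog code.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

final class SegmentReservation {
    private final ByteBuffer segment;
    // Offset of the first byte that no writer has reserved yet.
    private final AtomicInteger nextFree = new AtomicInteger(0);

    SegmentReservation(int capacity) {
        this.segment = ByteBuffer.allocate(capacity);
    }

    /** Reserves size bytes; returns the start offset, or -1 if the segment is full. */
    int reserve(int size) {
        while (true) {
            int start = nextFree.get();
            if (size > segment.capacity() - start) {
                return -1; // not enough room left; the caller would switch segments
            }
            if (nextFree.compareAndSet(start, start + size)) {
                return start; // this thread now owns [start, start + size)
            }
            // Another thread won the race; re-read the offset and retry.
        }
    }

    /** Copies data into a region previously returned by reserve. */
    void writeAt(int offset, byte[] data) {
        ByteBuffer view = segment.duplicate(); // private position, shared bytes
        view.position(offset);
        view.put(data);
    }

    int bytesReserved() {
        return nextFree.get();
    }

    byte byteAt(int index) {
        return segment.get(index);
    }
}
```

Only the offset bump is contended. Once a thread holds its range, it can copy its bytes without coordinating with other writers, because the ranges never overlap.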

        Jonathan Ellis added a comment -

        I think we can reasonably move "recycling" out of scope here and start with multithreaded CL append.

        And a correction to my earlier statement: sun.misc.Unsafe seems to be part of the JRE spec, despite the package name. The only difference I see between NBHM's usage and the "safe" Atomic* classes is that NBHM's target is not volatile. For our purposes here AtomicInteger should be fine.

        Jonathan Ellis added a comment -

        Brian Aker adds that allowing multiple threads to modify the commitlog simultaneously (reserving space for each with a CAS first) can also improve performance. Presumably mmap-ing the commitlog segment is required for this [another thing that fixed-length segments would allow].
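As a rough sketch of how mmap-ing a fixed-length segment could combine with per-writer reservation: the class `MappedSegment` and its methods are hypothetical, and `AtomicInteger.compareAndSet` is used here as the CAS primitive. Only standard `java.nio` calls are used; nothing here reflects the actual CommitLog implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

final class MappedSegment {
    private final MappedByteBuffer buffer;
    private final AtomicInteger nextFree = new AtomicInteger(0);

    private MappedSegment(MappedByteBuffer buffer) {
        this.buffer = buffer;
    }

    /** Maps a fixed-length segment; mapping READ_WRITE extends the file to sizeBytes if needed. */
    static MappedSegment map(Path path, int sizeBytes) {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return new MappedSegment(channel.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reserves space with a CAS, then copies data in; returns the offset or -1 if full. */
    int append(byte[] data) {
        int start;
        do {
            start = nextFree.get();
            if (data.length > buffer.capacity() - start) {
                return -1; // segment full
            }
        } while (!nextFree.compareAndSet(start, start + data.length));
        ByteBuffer view = buffer.duplicate(); // per-call position over the shared mapping
        view.position(start);
        view.put(data);
        return start;
    }

    /** Flushes the mapped region to the storage device. */
    void sync() {
        buffer.force();
    }
}
```

The fixed length matters because the mapping is created once, at a known size, so writers never have to remap or grow the file while appending.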

        (CAS isn't part of the jdk spec but it is available for public use in the sun jars. Cliff Click's high-scale-lib uses it extensively, so using it wouldn't be a new dependency for us.)

        http://wiki.apache.org/cassandra/ArchitectureCommitLog describes the current CommitLog design.


          People

          • Assignee:
            Unassigned
            Reporter:
            Jonathan Ellis
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved: