Details

    • Type: New Feature New Feature
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels:
      None

      Description

      currently if a row contains > 1000 columns, the run time becomes considerably slow (my test of
      a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 40 bytes in value, is about 16ms.
      this is all running in memory, no disk read is involved.

      through debugging we can find
      most of this time is spent on
      [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
      [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
      [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
      [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
      [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
      [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
      [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)

      ColumnFamily.addColumn() is slow because it inserts into an internal concurrentSkipListMap() that maps column names to values.
      this structure is slow for two reasons: it needs to do synchronization; it needs to maintain a more complex structure of map.

      but if we look at the whole read path, thrift already defines the read output to be List<ColumnOrSuperColumn> so it does not make sense to use a luxury map data structure in the interium and finally convert it to a list. on the synchronization side, since the return CF is never going to be shared/modified by other threads, we know the access is always single thread, so no synchronization is needed.

      but these 2 features are indeed needed for ColumnFamily in other cases, particularly write. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but take a provided returnCF, whose cost is much cheaper.

      the provided patch is for demonstration now, will work further once we agree on the general direction.
      CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() has an invariant that the inserted columns come in sorted order (I still have an issue to resolve descending or ascending now, but ascending works). so the current logic is simply to compare the new column against the end column in the array, if names not equal, append, if equal, reconcile.

      slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about what is the better way to provide this returnCF.

      this patch compiles fine, no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: it offers about 50% reduction in read time in the 3000-column case.

      thanks
      Yang

      1. 2843_h.patch
        69 kB
        Sylvain Lebresne
      2. 2843_g.patch
        71 kB
        Sylvain Lebresne
      3. fix.diff
        1 kB
        Yang Yang
      4. 2843_d.patch
        53 kB
        Yang Yang
      5. patch_timing
        8 kB
        Yang Yang
      6. std_timing
        8 kB
        Yang Yang
      7. microBenchmark.patch
        9 kB
        Sylvain Lebresne
      8. 2843.patch
        47 kB
        Sylvain Lebresne

        Activity

        Hide
        Yang Yang added a comment -

        diff file

        Show
        Yang Yang added a comment - diff file
        Hide
        Yang Yang added a comment -

        just untar this file into the 0.8.0-rc1 source tree, then compile

        Show
        Yang Yang added a comment - just untar this file into the 0.8.0-rc1 source tree, then compile
        Hide
        Sylvain Lebresne added a comment -

        The usual way to do thinks is to attach a patch. But I see that your diff don't include FastColumnFamily. The patch also include instrumentation and a few unrelated chances (a commented method, a change from SortedSet to Set in an unrelated method signature) that would ideally be removed. It would be great to have this rebase to the current 0.8 branch too.

        Show
        Sylvain Lebresne added a comment - The usual way to do thinks is to attach a patch. But I see that your diff don't include FastColumnFamily. The patch also include instrumentation and a few unrelated chances (a commented method, a change from SortedSet to Set in an unrelated method signature) that would ideally be removed. It would be great to have this rebase to the current 0.8 branch too.
        Hide
        Yang Yang added a comment -

        the provided patch is for demonstration now, will work further once we agree on the general direction.

        CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() has an invariant that the inserted columns come in sorted order (I still have an issue to resolve descending or ascending now, but ascending works). so the current logic is simply to compare the new column against the end column in the array, if names not equal, append, if equal, reconcile.

        slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about what is the better way to provide this returnCF.

        this patch compiles fine, no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: it offers about 50% reduction in read time in the 3000-column case.

        Show
        Yang Yang added a comment - the provided patch is for demonstration now, will work further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() has an invariant that the inserted columns come in sorted order (I still have an issue to resolve descending or ascending now, but ascending works). so the current logic is simply to compare the new column against the end column in the array, if names not equal, append, if equal, reconcile. slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about what is the better way to provide this returnCF. this patch compiles fine, no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: it offers about 50% reduction in read time in the 3000-column case.
        Hide
        Yang Yang added a comment -

        thanks Sylvain.

        I changed the patch to be based on current svn trunk.

        (sorry the last attempt was based on the 080-rc1 tar ball, I did not know how to include a new file in "diff -uw -r ", so had to include the FastColumnFamily.java in a tar ball)

        sorry the last SortedSet change was a typo... I once changed SortedSet to Set when I tried to use the cheaper HashMap, but later removed it when I used array.

        a lot of the FastColumnFamily methods are not implemented now, but the basic functionality is there for demonstration of the idea

        Show
        Yang Yang added a comment - thanks Sylvain. I changed the patch to be based on current svn trunk. (sorry the last attempt was based on the 080-rc1 tar ball, I did not know how to include a new file in "diff -uw -r ", so had to include the FastColumnFamily.java in a tar ball) sorry the last SortedSet change was a typo... I once changed SortedSet to Set when I tried to use the cheaper HashMap, but later removed it when I used array. a lot of the FastColumnFamily methods are not implemented now, but the basic functionality is there for demonstration of the idea
        Hide
        Yang Yang added a comment -

        right now design wise, the thing I'm most not sure about is where to properly inject the returnCF.

        also on a bigger scale, the multiple levels of
        collateIterator
        reducingIterator
        ColumnFamily
        Table.getRow()

        could probably be looked at from a more wholistic view, so that less internal conversions are done. my patch makes a small try in this step, but probably more can be done: for example getRow() converts the CFS.getSortedColumns() into another List by thriftifyColumns(). instead of list, we may just let FastColumnFamily pass the original iterators, and thriftify directly uses the iterator, instead of through the FastColumnFamily.columns_array. this time saving could be small though, since array is already very cheap.

        Show
        Yang Yang added a comment - right now design wise, the thing I'm most not sure about is where to properly inject the returnCF. also on a bigger scale, the multiple levels of collateIterator reducingIterator ColumnFamily Table.getRow() could probably be looked at from a more wholistic view, so that less internal conversions are done. my patch makes a small try in this step, but probably more can be done: for example getRow() converts the CFS.getSortedColumns() into another List by thriftifyColumns(). instead of list, we may just let FastColumnFamily pass the original iterators, and thriftify directly uses the iterator, instead of through the FastColumnFamily.columns_array. this time saving could be small though, since array is already very cheap.
        Hide
        Sylvain Lebresne added a comment -

        I do think that using a specific implementation for the backing map of a ColumnFamily during is a good idea. It is clear that avoiding synchronization will be faster, and given the type of operations we do during reads (insertion in sorted order and iteration), an ArrayList backed solution is sure to be faster too. I will also be much gentle on the GC that the linked list ConcurrentSkipListMap uses. I think that all those will help even with relatively small reads. So let's focus on that for this ticket and let other potential improvement to other ticket, especially if it is unclear they bear any noticeable speedup.

        So focusing on the patch itself:

        • We really shouldn't "simply" extend ColumnFamily as the patch does. This is quite frankly ugly and will be a maintenance nightmare (you'll have to check you did overwrite every function that touch the map (which is not the case in the patch) and every update to ColumnFamily have to be aware that it should update FastColumnFamily as well).
        • The implementation of FastColumnFamily should really be a fully functionnal ColumnFamily implementation (albeit not synchronized). That is, we can't assume that addition will always be in strict increasing order, otherwise again this will be too hard to use.
        • The addAll function can be optimized given that both input are sorted. Granted, I don't think it is used in the read path, but I think that the new ColumnFamily implementation could advantageously be used during compaction (by preCompactedRow typically, and possibly other places where concurrent access is not an issue) where this would matter.

        Attaching a version of the patch (2843.patch) that tries to address all the remarks above. The patch is against trunk (not 0.8 branch), because it build on the recently committed refactor of ColumnFamily. It refactors ColumnFamily (AbstractColumnContainer actually) to allow for a pluggable backing column map. The ConcurrentSkipListMap implemn is name ThreadSafeColumnMap and the new one is called ArrayBackedColumnMap (which I prefer to FastSomething since it's not a very helpful name).

        On the ColumnFamilyStore side, instead of feeding the returnCF to getTopLevelColumns, I pass along a factory (that each backing implementation provides). The main goal was to avoid creating a columnFamily when it's useless (if row cache is enabled on the CF – btw, this ticket only improve on read for column family with no cache).

        Micro benchmarks does show that on the operation involved during a read (addition of column + iteration), the ArrayBacked implementation is faster than the ConcurrentSkipListMap based one. Interestingly though, this is mainly true when some reconciliation of columns happens. That is, if you only add columns with different names, the ArrayBacked implementation is faster, but not dramatically so. If you start adding column that have to be resolved, the ArrayBacked implementation becomes much faster, even with a reasonably small number of columns (inserting 100 columns with only 10 unique column names, the ArrayBacked is already >30% faster). And this mostly due to the overhead of synchronization (of replace()): a TreeMap based implementation is slightly slower than the ArrayBacked one but not by a lot and thus is much faster than the ConcurrentSkipListMap implementation.

        The attached patch should be ready for review (though it could probably use a few unit test for the new ArrayBacked implementation).

        Show
        Sylvain Lebresne added a comment - I do think that using a specific implementation for the backing map of a ColumnFamily during is a good idea. It is clear that avoiding synchronization will be faster, and given the type of operations we do during reads (insertion in sorted order and iteration), an ArrayList backed solution is sure to be faster too. I will also be much gentle on the GC that the linked list ConcurrentSkipListMap uses. I think that all those will help even with relatively small reads. So let's focus on that for this ticket and let other potential improvement to other ticket, especially if it is unclear they bear any noticeable speedup. So focusing on the patch itself: We really shouldn't "simply" extend ColumnFamily as the patch does. This is quite frankly ugly and will be a maintenance nightmare (you'll have to check you did overwrite every function that touch the map (which is not the case in the patch) and every update to ColumnFamily have to be aware that it should update FastColumnFamily as well). The implementation of FastColumnFamily should really be a fully functionnal ColumnFamily implementation (albeit not synchronized). That is, we can't assume that addition will always be in strict increasing order, otherwise again this will be too hard to use. The addAll function can be optimized given that both input are sorted. Granted, I don't think it is used in the read path, but I think that the new ColumnFamily implementation could advantageously be used during compaction (by preCompactedRow typically, and possibly other places where concurrent access is not an issue) where this would matter. Attaching a version of the patch (2843.patch) that tries to address all the remarks above. The patch is against trunk (not 0.8 branch), because it build on the recently committed refactor of ColumnFamily. It refactors ColumnFamily (AbstractColumnContainer actually) to allow for a pluggable backing column map. The ConcurrentSkipListMap implemn is name ThreadSafeColumnMap and the new one is called ArrayBackedColumnMap (which I prefer to FastSomething since it's not a very helpful name). On the ColumnFamilyStore side, instead of feeding the returnCF to getTopLevelColumns, I pass along a factory (that each backing implementation provides). The main goal was to avoid creating a columnFamily when it's useless (if row cache is enabled on the CF – btw, this ticket only improve on read for column family with no cache). Micro benchmarks does show that on the operation involved during a read (addition of column + iteration), the ArrayBacked implementation is faster than the ConcurrentSkipListMap based one. Interestingly though, this is mainly true when some reconciliation of columns happens. That is, if you only add columns with different names, the ArrayBacked implementation is faster, but not dramatically so. If you start adding column that have to be resolved, the ArrayBacked implementation becomes much faster, even with a reasonably small number of columns (inserting 100 columns with only 10 unique column names, the ArrayBacked is already >30% faster). And this mostly due to the overhead of synchronization (of replace()): a TreeMap based implementation is slightly slower than the ArrayBacked one but not by a lot and thus is much faster than the ConcurrentSkipListMap implementation. The attached patch should be ready for review (though it could probably use a few unit test for the new ArrayBacked implementation).
        Hide
        Jonathan Ellis added a comment -

        a TreeMap based implementation is slightly slower than the ArrayBacked one but not by a lot

        If the main benefit is avoiding synchronization, shouldn't we just stick w/ TreeMap for simplicity?

        That is, if you only add columns with different names, the ArrayBacked implementation is faster, but not dramatically so

        That's odd, because adding in sorted order is actually worst-case for CSLM, and that's what we do on reads. I would expect we could do better than just the synchronization difference. Did you try wide slices too?

        FWIW, back in CASSANDRA-633 I tried something similar on the write path, with the additional difference that it would only sort lazily (i.e. on flush, for an insert-mostly workload). That didn't seem to be much improvement either.

        Show
        Jonathan Ellis added a comment - a TreeMap based implementation is slightly slower than the ArrayBacked one but not by a lot If the main benefit is avoiding synchronization, shouldn't we just stick w/ TreeMap for simplicity? That is, if you only add columns with different names, the ArrayBacked implementation is faster, but not dramatically so That's odd, because adding in sorted order is actually worst-case for CSLM, and that's what we do on reads. I would expect we could do better than just the synchronization difference. Did you try wide slices too? FWIW, back in CASSANDRA-633 I tried something similar on the write path, with the additional difference that it would only sort lazily (i.e. on flush, for an insert-mostly workload). That didn't seem to be much improvement either.
        Hide
        Yang Yang added a comment - - edited

        Actually my tests did not use reconcile. All col names are uniq

        Show
        Yang Yang added a comment - - edited Actually my tests did not use reconcile. All col names are uniq
        Hide
        Sylvain Lebresne added a comment -

        "this implementation should not expect the input always come in already sorted order"

        Yeah, that's not really what I said. I said that it should not assume it. Which doesn't mean it cannot optimize for it. If you look at the version I attached, at least as far a addColumn is concerned, it does the exact same thing as your version, with the only difference that I first check if adding at the tail is legit and fail back to a binary search if that is not the case. That is, as long as the input is in sorted order, it will be as fast as your implementation (there is one more bytebuffer comparison but I'm willing to bet that it has no noticeable impact on performance). But it won't create unsorted CF if the input is not in sorted order.

        Btw, Yang, can you try to fix you 3 last comments ?

        If the main benefit is avoiding synchronization, shouldn't we just stick w/ TreeMap for simplicity?

        I'm attaching a second patch with the micro-benchmark that I've used and the TreeMap implementation so that people can look for themselves. The test simply creates a CF, add columns to it (in sorted order) and do a simple iteration at the end. I've also add a delete at the end because at least in the case of super columns, we do call removeDeleted so the goal was to see if this has a significant impact (the deletes are made at the beginning of the CF, which is the worst case for the ArrayBacked solution). The test also allow to have some column overwrap (to exercise reconciliation). Not that when that happens, the input is not in strict sorted order anymore, but it's mostly at the disadvantage of the ArrayBack implementation there too. Playing with the parameters (number of columns added, number that overlaps, number of deletes) the results seems to always be the same. The ArrayBacked is consistently faster than the TreeMap one that is itself consistently faster than the CSLM one. Now what I meant is that the difference between ArrayBacked and TreeMap is generally not as big as the one with CSLM, but it is still often very noticeable.

        This is no surprise in the end, the ArrayBacked solution is optimized for insertion in sorted order: the insertion is then O(1) and with a small constant factor because we're using ArrayList. TreeMap can't beat that. Given this, and given that ColumnFamily is one of our core data structure, I think we should choose the more efficient implementation for each use case. And truth is, the ArrayBacked implementation is really not very complicated, that's basic stuff.

        That's odd, because adding in sorted order is actually worst-case for CSLM, and that's what we do on reads.

        Yeah, I was a bit quick on that statement. Rerunning my micro-benchmarks does show that we're much much faster even without reconciliation happening.

        Show
        Sylvain Lebresne added a comment - "this implementation should not expect the input always come in already sorted order" Yeah, that's not really what I said. I said that it should not assume it. Which doesn't mean it cannot optimize for it. If you look at the version I attached, at least as far a addColumn is concerned, it does the exact same thing as your version, with the only difference that I first check if adding at the tail is legit and fail back to a binary search if that is not the case. That is, as long as the input is in sorted order, it will be as fast as your implementation (there is one more bytebuffer comparison but I'm willing to bet that it has no noticeable impact on performance). But it won't create unsorted CF if the input is not in sorted order. Btw, Yang, can you try to fix you 3 last comments ? If the main benefit is avoiding synchronization, shouldn't we just stick w/ TreeMap for simplicity? I'm attaching a second patch with the micro-benchmark that I've used and the TreeMap implementation so that people can look for themselves. The test simply creates a CF, add columns to it (in sorted order) and do a simple iteration at the end. I've also add a delete at the end because at least in the case of super columns, we do call removeDeleted so the goal was to see if this has a significant impact (the deletes are made at the beginning of the CF, which is the worst case for the ArrayBacked solution). The test also allow to have some column overwrap (to exercise reconciliation). Not that when that happens, the input is not in strict sorted order anymore, but it's mostly at the disadvantage of the ArrayBack implementation there too. Playing with the parameters (number of columns added, number that overlaps, number of deletes) the results seems to always be the same. The ArrayBacked is consistently faster than the TreeMap one that is itself consistently faster than the CSLM one. Now what I meant is that the difference between ArrayBacked and TreeMap is generally not as big as the one with CSLM, but it is still often very noticeable. This is no surprise in the end, the ArrayBacked solution is optimized for insertion in sorted order: the insertion is then O(1) and with a small constant factor because we're using ArrayList. TreeMap can't beat that. Given this, and given that ColumnFamily is one of our core data structure, I think we should choose the more efficient implementation for each use case. And truth is, the ArrayBacked implementation is really not very complicated, that's basic stuff. That's odd, because adding in sorted order is actually worst-case for CSLM, and that's what we do on reads. Yeah, I was a bit quick on that statement. Rerunning my micro-benchmarks does show that we're much much faster even without reconciliation happening.
        Hide
        Jonathan Ellis added a comment -

        I first check if adding at the tail is legit and fail back to a binary search if that is not the case ... we should choose the more efficient implementation for each use case

        I agree with the second part, but doesn't that imply that if we're doing non-sorted inserts that we should be using the CSLM or TM version instead of trying to support everything from the AL version?

        Show
        Jonathan Ellis added a comment - I first check if adding at the tail is legit and fail back to a binary search if that is not the case ... we should choose the more efficient implementation for each use case I agree with the second part, but doesn't that imply that if we're doing non-sorted inserts that we should be using the CSLM or TM version instead of trying to support everything from the AL version?
        Hide
        Sylvain Lebresne added a comment -

        but doesn't that imply that if we're doing non-sorted inserts that we should be using the CSLM or TM version instead of trying to support everything from the AL version

        Totally agree. I was going to say that following this, we should look at using a non-synchronized implementation in other parts like compaction, but there it may be better to use the TM one. But for reads, I think the AL one should be the fastest.

        Show
        Sylvain Lebresne added a comment - but doesn't that imply that if we're doing non-sorted inserts that we should be using the CSLM or TM version instead of trying to support everything from the AL version Totally agree. I was going to say that following this, we should look at using a non-synchronized implementation in other parts like compaction, but there it may be better to use the TM one. But for reads, I think the AL one should be the fastest.
        Hide
        Jonathan Ellis added a comment -

        Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure?

        Show
        Jonathan Ellis added a comment - Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure?
        Hide
        Yang Yang added a comment -

        I see a typo in Sylvain's patch, and added a comment on ArrayBackedColumnMap: its addAll() does assume sorted input

        added a simple test

        the incremental change I made is in "incremental.diff",

        the diff against 81f1e56062a51e67ebe5a657ba94d3f37a1903e6 in git is in 2843_b.patch

        Show
        Yang Yang added a comment - I see a typo in Sylvain's patch, and added a comment on ArrayBackedColumnMap: its addAll() does assume sorted input added a simple test the incremental change I made is in "incremental.diff", the diff against 81f1e56062a51e67ebe5a657ba94d3f37a1903e6 in git is in 2843_b.patch
        Hide
        Yang Yang added a comment -

        patch against the same base as Sylvain's patch, fixing one typo

        Show
        Yang Yang added a comment - patch against the same base as Sylvain's patch, fixing one typo
        Hide
        Yang Yang added a comment -

        the minor typo fix on top of* Sylvain's patch, for easier reading

        Show
        Yang Yang added a comment - the minor typo fix on top of * Sylvain's patch, for easier reading
        Hide
        Yang Yang added a comment -

        also I see that the getSortedColumns() and getSortedColumnNames() returns an iterator whose remove() operates on the outer array, aren't these 2 methods supposed to return a new collection obj, so that the outer array is immutable ? right now we don't seem to have real usages for the remove() so it could be better to just let them throw UnsupportedOperationException??

        Show
        Yang Yang added a comment - also I see that the getSortedColumns() and getSortedColumnNames() returns an iterator whose remove() operates on the outer array, aren't these 2 methods supposed to return a new collection obj, so that the outer array is immutable ? right now we don't seem to have real usages for the remove() so it could be better to just let them throw UnsupportedOperationException??
        Hide
        Brandon Williams added a comment -

        2843_b.patch fails to apply a chunk to trunk in src/java/org/apache/cassandra/db/ColumnFamily, can you rebase?

        Show
        Brandon Williams added a comment - 2843_b.patch fails to apply a chunk to trunk in src/java/org/apache/cassandra/db/ColumnFamily, can you rebase?
        Hide
        Yang Yang added a comment -

        rebased against 4629648899e637e8e03938935f126689cce5ad48

        also fixed a bug in my test,
        the AbstractColumnContainer.DeletionInfo has to be protected, otherwise eclipse gives a compile error

        Show
        Yang Yang added a comment - rebased against 4629648899e637e8e03938935f126689cce5ad48 also fixed a bug in my test, the AbstractColumnContainer.DeletionInfo has to be protected, otherwise eclipse gives a compile error
        Hide
        Brandon Williams added a comment -

        I did some performance testing using Aaron's script here: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and overall in the 95th percentile there was an approximate 10% gain across the board.

        Show
        Brandon Williams added a comment - I did some performance testing using Aaron's script here: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and overall in the 95th percentile there was an approximate 10% gain across the board.
        Hide
        Yang Yang added a comment - - edited

        let me go through that benchmark and will comment later.
        Yang

        Show
        Yang Yang added a comment - - edited let me go through that benchmark and will comment later. Yang
        Hide
        Yang Yang added a comment -

        Brandon:

        where did you download the actual benchmark scripts? the link above does not have them.

        thanks
        yang

        Show
        Yang Yang added a comment - Brandon: where did you download the actual benchmark scripts? the link above does not have them. thanks yang
        Hide
        Brandon Williams added a comment -

        The link is there, but it's buried down the page. Here it is: https://gist.github.com/1074715

        Show
        Brandon Williams added a comment - The link is there, but it's buried down the page. Here it is: https://gist.github.com/1074715
        Hide
        Sylvain Lebresne added a comment -

        Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure?

        I kind of have a small preference for not turning it into a failure, but if we don't see a use case where we would want to use AL even though not all input are sorted in the near future, I could be convinced to doing it for now. There is one such use case though that I think could be reasonable, and that's the new SSTableSimple*Writer classes introduced by CASSANDRA-2911. The thing is, you need to have quite a bit of input out of order for AL to start being slower than the alternatives (even a TreeMap one). I tried a modified version of my micro-benchmark adding columns in reverse sorted order (worst case for AL), and on my machine, you need to add 1000 columns to start seeing AL slower than the others. Which leads me to believe that AL could be a good default for the CF used in SSTableSimple*Writer.

        I see a typo in Sylvain's patch, and added a comment on ArrayBackedColumnMap: its addAll() does assume sorted input

        AddAll does assume sorted input, but that's because IColumnMap is meant to be a sorted map (IColumnMap doesn't extend SortedMap for technical reason, but it is really meant as a SortedMap). That is, I agree that comments could be improved, but I think that it is the IColumnMap one that needs to be fixed by specifying that the iterator should iter in sorted order. Anyway, that cosmetic and could be changed during commit if not before.

        AbstractColumnContainer.DeletionInfo has to be protected, otherwise eclipse gives a compile error

        Is there another reason than "otherwise eclipse gives a compile error". Because it doesn't seem very related to this patch and I'm keen on not changing stuff randomly without understanding why.

        in the 95th percentile there was an approximate 10% gain across the board

        Great, thanks Brandon for the testing.

        Show
        Sylvain Lebresne added a comment - Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure? I kind of have a small preference for not turning it into a failure, but if we don't see a use case where we would want to use AL even though not all input are sorted in the near future, I could be convinced to doing it for now. There is one such use case though that I think could be reasonable, and that's the new SSTableSimple*Writer classes introduced by CASSANDRA-2911 . The thing is, you need to have quite a bit of input out of order for AL to start being slower than the alternatives (even a TreeMap one). I tried a modified version of my micro-benchmark adding columns in reverse sorted order (worst case for AL), and on my machine, you need to add 1000 columns to start seeing AL slower than the others. Which leads me to believe that AL could be a good default for the CF used in SSTableSimple*Writer. I see a typo in Sylvain's patch, and added a comment on ArrayBackedColumnMap: its addAll() does assume sorted input AddAll does assume sorted input, but that's because IColumnMap is meant to be a sorted map (IColumnMap doesn't extend SortedMap for technical reason, but it is really meant as a SortedMap). That is, I agree that comments could be improved, but I think that it is the IColumnMap one that needs to be fixed by specifying that the iterator should iter in sorted order. Anyway, that cosmetic and could be changed during commit if not before. AbstractColumnContainer.DeletionInfo has to be protected, otherwise eclipse gives a compile error Is there another reason than "otherwise eclipse gives a compile error". Because it doesn't seem very related to this patch and I'm keen on not changing stuff randomly without understanding why. in the 95th percentile there was an approximate 10% gain across the board Great, thanks Brandon for the testing.
        Hide
        Jonathan Ellis added a comment -

        I think that it is the IColumnMap one that needs to be fixed by specifying that the iterator should iter in sorted order

        You mean just as a javadoc comment?

        Show
        Jonathan Ellis added a comment - I think that it is the IColumnMap one that needs to be fixed by specifying that the iterator should iter in sorted order You mean just as a javadoc comment?
        Hide
        Yang Yang added a comment -

        Brandon:

        I used commit 4629648899e637e8e03938935f126689cce5ad48 and applied the 2843_c.patch, and also tried the head, but got the following error with the benchmark pycassa script. how did you succeed with it?

        [default@query] create column family NoCache
        ... with comparator = AsciiType
        ... and default_validation_class = AsciiType
        ... and key_validation_class = AsciiType
        ... and keys_cached = 0
        ... and rows_cached = 0;
        Unable to set Compaction Strategy Class of AsciiType

        thanks
        Yang

        Show
        Yang Yang added a comment - Brandon: I used commit 4629648899e637e8e03938935f126689cce5ad48 and applied the 2843_c.patch, and also tried the head, but got the following error with the benchmark pycassa script. how did you succeed with it? [default@query] create column family NoCache ... with comparator = AsciiType ... and default_validation_class = AsciiType ... and key_validation_class = AsciiType ... and keys_cached = 0 ... and rows_cached = 0; Unable to set Compaction Strategy Class of AsciiType thanks Yang
        Hide
        Sylvain Lebresne added a comment -

        You mean just as a javadoc comment?

        Yes.

        Show
        Sylvain Lebresne added a comment - You mean just as a javadoc comment? Yes.
        Hide
        Brandon Williams added a comment -

        I used commit 4629648899e637e8e03938935f126689cce5ad48 and applied the 2843_c.patch, and also tried the head, but got the following error with the benchmark pycassa script. how did you succeed with it?

        [default@query] create column family NoCache
        ... with comparator = AsciiType
        ... and default_validation_class = AsciiType
        ... and key_validation_class = AsciiType
        ... and keys_cached = 0
        ... and rows_cached = 0;
        Unable to set Compaction Strategy Class of AsciiType

        Add "and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'"

        Show
        Brandon Williams added a comment - I used commit 4629648899e637e8e03938935f126689cce5ad48 and applied the 2843_c.patch, and also tried the head, but got the following error with the benchmark pycassa script. how did you succeed with it? [default@query] create column family NoCache ... with comparator = AsciiType ... and default_validation_class = AsciiType ... and key_validation_class = AsciiType ... and keys_cached = 0 ... and rows_cached = 0; Unable to set Compaction Strategy Class of AsciiType Add "and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'"
        Hide
        Sylvain Lebresne added a comment -

        Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure?

        Actually I just realized that there is one place where we do add a column after a read not at the end of the CF. That's for counters, after the read for replication and in the case where we can shrink the context because the node has renewed its NodeId multiple times and we can merge the old ones together. In that case, we end up updating some of the columns of the column family we've just read. Note that this code won't even be executed 99.999% of the time, and even then only a handful of columns are likely to be updated, so using the AL implementation really is the best choice.
        We could, if we really want to, add special code for that specific case (copying the CF read into a CLSM backed one typically before updating it). That would be less efficient but that probably doesn't matter in that specific case. But more importantly, that exemplify why I think using an assertion is more dangerous than it needs to be. Imho, the bug we would have had is the kind that could likely made it into a release.

        Show
        Sylvain Lebresne added a comment - Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure? Actually I just realized that there is one place where we do add a column after a read not at the end of the CF. That's for counters, after the read for replication and in the case where we can shrink the context because the node has renewed its NodeId multiple times and we can merge the old ones together. In that case, we end up updating some of the columns of the column family we've just read. Note that this code won't even be executed 99.999% of the time, and even then only a handful of columns are likely to be updated, so using the AL implementation really is the best choice. We could, if we really want to, add special code for that specific case (copying the CF read into a CLSM backed one typically before updating it). That would be less efficient but that probably doesn't matter in that specific case. But more importantly, that exemplify why I think using an assertion is more dangerous than it needs to be. Imho, the bug we would have had is the kind that could likely made it into a release.
        Hide
        Yang Yang added a comment -

        I did some performance testing using Aaron's script here: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and overall in the 95th percentile there was an approximate 10% gain across the board.

        I looked at Aaron's script, it actually returns 100 columns on each get. since the column count filtering happens in the memtable iterator before the collating iterator and ColumnMap.add(), the advantage of this patch is not fully shown (only 10% ). I added a simple test case to the script to return all columns , in the 10,000 columns case, the time reduction is about 50%. I'm still running the full test, will upload the data later

        Show
        Yang Yang added a comment - I did some performance testing using Aaron's script here: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and overall in the 95th percentile there was an approximate 10% gain across the board. I looked at Aaron's script, it actually returns 100 columns on each get. since the column count filtering happens in the memtable iterator before the collating iterator and ColumnMap.add(), the advantage of this patch is not fully shown (only 10% ). I added a simple test case to the script to return all columns , in the 10,000 columns case, the time reduction is about 50%. I'm still running the full test, will upload the data later
        Hide
        Yang Yang added a comment -

        standard timing generated by Aaron's scripts. note the last test case in the end

        for returning whole row.

        Row ten-thousand latency in ms 13.22 23.41 23.98 24.23

        Show
        Yang Yang added a comment - standard timing generated by Aaron's scripts. note the last test case in the end for returning whole row. Row ten-thousand latency in ms 13.22 23.41 23.98 24.23
        Hide
        Yang Yang added a comment -

        timing data generated using Aaron's scripts, and patched cassandra server.

        note the test case at the end, for returning whole row

        Row ten-thousand latency in ms 7.325 14.5 14.59 14.67

        Show
        Yang Yang added a comment - timing data generated using Aaron's scripts, and patched cassandra server. note the test case at the end, for returning whole row Row ten-thousand latency in ms 7.325 14.5 14.59 14.67
        Hide
        Jonathan Ellis added a comment -

        Note that this code won't even be executed 99.999% of the time

        Is there a reason we can't a test for that path?

        that exemplify why I think using an assertion is more dangerous than it needs to be

        I take the opposite lesson away – what if something (say HH for example) added thousands of unsorted columns for a reason we hadn't thought of? We'd probably go months before someone realized why we had this performance regression. But the assert means we'd know the cause immediately, the user who hit it can disable -ea until we have a fix shortly afterwards.

        I'd be okay with an addUnsorted method or something if we want to say "let's use the AL even though it's not sorted, I know what I'm doing."

        Show
        Jonathan Ellis added a comment - Note that this code won't even be executed 99.999% of the time Is there a reason we can't a test for that path? that exemplify why I think using an assertion is more dangerous than it needs to be I take the opposite lesson away – what if something (say HH for example) added thousands of unsorted columns for a reason we hadn't thought of? We'd probably go months before someone realized why we had this performance regression. But the assert means we'd know the cause immediately, the user who hit it can disable -ea until we have a fix shortly afterwards. I'd be okay with an addUnsorted method or something if we want to say "let's use the AL even though it's not sorted, I know what I'm doing."
        Hide
        Yang Yang added a comment -

        the DeletionInfo private=>protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937

        new patch uploaded here

        Show
        Yang Yang added a comment - the DeletionInfo private=>protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here
        Hide
        Sylvain Lebresne added a comment -

        Attaching rebased and update patch.

        This patch fixes reversed slices. Turns out previous patch wasn't handling them correctly. This is also a little bit more annoying to do that one would hope because the code assumes that the ColumnFamily object is itself sorted in forward sorted order but the insertion are made in reverse sorted order (changing that is much more involved than it appears unfortunately), so it's not as simple as feeding a reverse comparator to the map. So the patch takes an "hint" at the construction of the ColumnFamily, indicating how the insert will be done, but it does not influence the sorting of the cf itself. Internally, AL keeps the elements in reverse sorted order as one could expect, but its iterator still have to return elements in sorted order, so there is a bit of boilerplate involved.

        Actually I just realized that there is one place where we do add a column after a read not at the end of the CF.

        I was a little bit quick on that, the aforementioned place actually uses addAll(). There is no real efficiency problem with addAll() however (it does a merge). So the patch actually adds an assert in add() that the added column is in sorted order.
        I found another place though where we did do a add() not in sorted order. This is due to a case where we actually want to replace a column, and we end up removing then adding. I've added a replace method to handle that case instead since anyway it's "the right thing to do". Note that this example was actually on the compaction path (for counters), not the read one, but it feels better to fix it now as it is related.

        Is there a reason we can't a test for that path?

        Not really. I've attached such a test to CASSANDRA-2945.

        The patch also fix the code style in the added test, add a few more ones and add a bunch of comments.

        All the tests (unit and system) are passing (that is, excluding the unrelated ones that happens to fail in trunk right now).

        Show
        Sylvain Lebresne added a comment - Attaching rebased and update patch. This patch fixes reversed slices. Turns out previous patch wasn't handling them correctly. This is also a little bit more annoying to do that one would hope because the code assumes that the ColumnFamily object is itself sorted in forward sorted order but the insertion are made in reverse sorted order (changing that is much more involved than it appears unfortunately), so it's not as simple as feeding a reverse comparator to the map. So the patch takes an "hint" at the construction of the ColumnFamily, indicating how the insert will be done, but it does not influence the sorting of the cf itself. Internally, AL keeps the elements in reverse sorted order as one could expect, but its iterator still have to return elements in sorted order, so there is a bit of boilerplate involved. Actually I just realized that there is one place where we do add a column after a read not at the end of the CF. I was a little bit quick on that, the aforementioned place actually uses addAll(). There is no real efficiency problem with addAll() however (it does a merge). So the patch actually adds an assert in add() that the added column is in sorted order. I found another place though where we did do a add() not in sorted order. This is due to a case where we actually want to replace a column, and we end up removing then adding. I've added a replace method to handle that case instead since anyway it's "the right thing to do". Note that this example was actually on the compaction path (for counters), not the read one, but it feels better to fix it now as it is related. Is there a reason we can't a test for that path? Not really. I've attached such a test to CASSANDRA-2945 . The patch also fix the code style in the added test, add a few more ones and add a bunch of comments. All the tests (unit and system) are passing (that is, excluding the unrelated ones that happens to fail in trunk right now).
        Hide
        Sylvain Lebresne added a comment -

        Forgot a few goodies in my previous attach. Previous patch was only using AL for local read, new patch use it when deserializing cf in ReadResponse. This should amount for some more speedup on read with more than one node. There is probably a few other places where AL makes sense (compaction clearly comes to mind), but it's a little more involved so let's focus on the read path for this one.

        Show
        Sylvain Lebresne added a comment - Forgot a few goodies in my previous attach. Previous patch was only using AL for local read, new patch use it when deserializing cf in ReadResponse. This should amount for some more speedup on read with more than one node. There is probably a few other places where AL makes sense (compaction clearly comes to mind), but it's a little more involved so let's focus on the read path for this one.
        Hide
        Yang Yang added a comment -

        there seems to be a bug somewhere regarding deleting:

        the tests in cassandra src work fine, but when I compiled hector using the newly-built cassandra, hector tests show such errors:

        somehow in the read path 2 threads are accessing the same CF (seems only related to SuperColumn though)

        Fatal exception in thread Thread[ReadStage:1,5,main]
        java.lang.RuntimeException: java.util.ConcurrentModificationException
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
        Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:782)
        at java.util.ArrayList$Itr.next(ArrayList.java:754)
        at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedSuper(ColumnFamilyStore.java:864)
        at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:838)
        at org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:831)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1190)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1139)
        at org.apache.cassandra.db.Table.getRow(Table.java:377)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58)
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:641)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more

        Show
        Yang Yang added a comment - there seems to be a bug somewhere regarding deleting: the tests in cassandra src work fine, but when I compiled hector using the newly-built cassandra, hector tests show such errors: somehow in the read path 2 threads are accessing the same CF (seems only related to SuperColumn though) Fatal exception in thread Thread [ReadStage:1,5,main] java.lang.RuntimeException: java.util.ConcurrentModificationException at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Caused by: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:782) at java.util.ArrayList$Itr.next(ArrayList.java:754) at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedSuper(ColumnFamilyStore.java:864) at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:838) at org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:831) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1190) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1139) at org.apache.cassandra.db.Table.getRow(Table.java:377) at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:641) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 3 more
        Hide
        Yang Yang added a comment - - edited

        ok, the last-reported error is slightly related to this JIRA: it has been there all along, but previously the CSLM map works fine for concurrent iteration+delete, not with AL it's exposed

        a simple fix

        Show
        Yang Yang added a comment - - edited ok, the last-reported error is slightly related to this JIRA: it has been there all along, but previously the CSLM map works fine for concurrent iteration+delete, not with AL it's exposed a simple fix
        Hide
        Sylvain Lebresne added a comment -

        Attaching 2843_g.patch. It is just a rebase with Yang last small fix added.

        Show
        Sylvain Lebresne added a comment - Attaching 2843_g.patch. It is just a rebase with Yang last small fix added.
        Hide
        Jonathan Ellis added a comment -

        Some things I'm not thrilled with:

        • the IColumnMap name when it does not implement Map interface, and some things it has in common with Map (iteration) it changes semantics of (iterating values instead of keys). not sure what to use instead though, since we already have an IColumnContainer. Maybe ISortedColumns?
        • TSCM and ALCM extending instead of wrapping CSLM/AL, respectively
        • unrelated reformatting

        Otherwise, lgtm

        Show
        Jonathan Ellis added a comment - Some things I'm not thrilled with: the IColumnMap name when it does not implement Map interface, and some things it has in common with Map (iteration) it changes semantics of (iterating values instead of keys). not sure what to use instead though, since we already have an IColumnContainer. Maybe ISortedColumns? TSCM and ALCM extending instead of wrapping CSLM/AL, respectively unrelated reformatting Otherwise, lgtm
        Hide
        Sylvain Lebresne added a comment -

        the IColumnMap name when it does not implement Map interface, and some things it has in common with Map (iteration) it changes semantics of (iterating values instead of keys). not sure what to use instead though, since we already have an IColumnContainer. Maybe ISortedColumns?

        Yeah, I'm not sure I have a better name either, maybe ISortedColumnHolder, but not sure it's better than ISortedColumns so attached rebased patch simply rename ColumnMap -> SortedColumns

        TSCM and ALCM extending instead of wrapping CSLM/AL, respectively

        The idea was to save one object creation. I admit this is probably not a huge deal, but it felt that in this case it was no big deal to extend instead of wrapping either, so felt like worth "optimizing". I still stand by that choice but I have no good argument against the criticism that it is possibly premature.

        unrelated reformatting

        If we're talking about the ones in SuperColumn.java, sorry, I mistakenly forced re-indentation on the file which rewrote the tab to spaces. New patch keeps the old formatting. I'd mention that there is also a few places where I've rewrote cf.getSortedColumns().iterator() to cf.iterator(), which is arguably a bit gratuitous for this patch, but I figured this avoids creating a new Collection in the case of CLSM and there's not so many occurrences.

        Show
        Sylvain Lebresne added a comment - the IColumnMap name when it does not implement Map interface, and some things it has in common with Map (iteration) it changes semantics of (iterating values instead of keys). not sure what to use instead though, since we already have an IColumnContainer. Maybe ISortedColumns? Yeah, I'm not sure I have a better name either, maybe ISortedColumnHolder, but not sure it's better than ISortedColumns so attached rebased patch simply rename ColumnMap -> SortedColumns TSCM and ALCM extending instead of wrapping CSLM/AL, respectively The idea was to save one object creation. I admit this is probably not a huge deal, but it felt that in this case it was no big deal to extend instead of wrapping either, so felt like worth "optimizing". I still stand by that choice but I have no good argument against the criticism that it is possibly premature. unrelated reformatting If we're talking about the ones in SuperColumn.java, sorry, I mistakenly forced re-indentation on the file which rewrote the tab to spaces. New patch keeps the old formatting. I'd mention that there is also a few places where I've rewrote cf.getSortedColumns().iterator() to cf.iterator(), which is arguably a bit gratuitous for this patch, but I figured this avoids creating a new Collection in the case of CLSM and there's not so many occurrences.
        Hide
        Jonathan Ellis added a comment -

        +1

        Show
        Jonathan Ellis added a comment - +1
        Hide
        Sylvain Lebresne added a comment -

        Committed, thanks

        Show
        Sylvain Lebresne added a comment - Committed, thanks
        Hide
        Hudson added a comment -

        Integrated in Cassandra #1010 (See https://builds.apache.org/job/Cassandra/1010/)
        Make ColumnFamily backing column map pluggable and introduce unsynchronized ArrayList backed map for reads
        patch by slebresne; reviewed by jbellis for CASSANDRA-2843

        slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155426
        Files :

        • /cassandra/trunk/src/java/org/apache/cassandra/db/filter/IFilter.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/filter/QueryFilter.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/SuperColumn.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/filter/NamesQueryFilter.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java
        • /cassandra/trunk/src/java/org/apache/cassandra/service/RowRepairResolver.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/IColumnContainer.java
        • /cassandra/trunk/test/unit/org/apache/cassandra/db/ArrayBackedSortedColumnsTest.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ArrayBackedSortedColumns.java
        • /cassandra/trunk/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ThreadSafeSortedColumns.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
        • /cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/filter/SliceQueryFilter.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ISortedColumns.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ReadResponse.java
        • /cassandra/trunk/CHANGES.txt
        • /cassandra/trunk/test/unit/org/apache/cassandra/db/RowTest.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/Row.java
        • /cassandra/trunk/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java
        • /cassandra/trunk/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java
        • /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java
        Show
        Hudson added a comment - Integrated in Cassandra #1010 (See https://builds.apache.org/job/Cassandra/1010/ ) Make ColumnFamily backing column map pluggable and introduce unsynchronized ArrayList backed map for reads patch by slebresne; reviewed by jbellis for CASSANDRA-2843 slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155426 Files : /cassandra/trunk/src/java/org/apache/cassandra/db/filter/IFilter.java /cassandra/trunk/src/java/org/apache/cassandra/db/filter/QueryFilter.java /cassandra/trunk/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java /cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java /cassandra/trunk/src/java/org/apache/cassandra/db/SuperColumn.java /cassandra/trunk/src/java/org/apache/cassandra/db/filter/NamesQueryFilter.java /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java /cassandra/trunk/src/java/org/apache/cassandra/service/RowRepairResolver.java /cassandra/trunk/src/java/org/apache/cassandra/db/IColumnContainer.java /cassandra/trunk/test/unit/org/apache/cassandra/db/ArrayBackedSortedColumnsTest.java /cassandra/trunk/src/java/org/apache/cassandra/db/ArrayBackedSortedColumns.java /cassandra/trunk/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java /cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java /cassandra/trunk/src/java/org/apache/cassandra/db/ThreadSafeSortedColumns.java /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java /cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java /cassandra/trunk/src/java/org/apache/cassandra/db/filter/SliceQueryFilter.java /cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java /cassandra/trunk/src/java/org/apache/cassandra/db/ISortedColumns.java /cassandra/trunk/src/java/org/apache/cassandra/db/ReadResponse.java /cassandra/trunk/CHANGES.txt /cassandra/trunk/test/unit/org/apache/cassandra/db/RowTest.java /cassandra/trunk/src/java/org/apache/cassandra/db/Row.java /cassandra/trunk/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java /cassandra/trunk/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java

          People

          • Assignee:
            Sylvain Lebresne
            Reporter:
            Yang Yang
            Reviewer:
            Jonathan Ellis
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development