Cassandra
  1. Cassandra
  2. CASSANDRA-5078

save compaction merge counts in a system table

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Fix Version/s: 2.0.2
    • Component/s: None
    • Labels:

      Description

      we should save the compaction merge stats from CASSANDRA-4894 in the system table and probably expose them via JMX (and nodetool)

      1. 5078-v3.txt
        19 kB
        Jonathan Ellis
      2. 5078-v4.txt
        21 kB
        lantao yan
      3. 5078-v5.patch
        19 kB
        lantao yan
      4. patch1.patch
        29 kB
        lantao yan

        Activity

        Hide
        lantao yan added a comment -

        which system table should the compaction data save to? is there any sample code showing saving stuff to system table?

        Show
        lantao yan added a comment - which system table should the compaction data save to? is there any sample code showing saving stuff to system table?
        Hide
        Jonathan Ellis added a comment -

        I think you'd want a new table, something like

        CREATE TABLE compaction_history (
            text keyspace,
            text columnfamily,
            compacted_at timestamp,
            bytes_in long,
            bytes_out long,
            rows_merged Map<int, long>,
            PRIMARY KEY (keyspace, columnfamily, compacted_at)
        );
        

        Look at CFMetadata for where the system tables are created, and SystemKeyspace for loading/storing examples.

        Show
        Jonathan Ellis added a comment - I think you'd want a new table, something like CREATE TABLE compaction_history ( text keyspace, text columnfamily, compacted_at timestamp, bytes_in long , bytes_out long , rows_merged Map< int , long >, PRIMARY KEY (keyspace, columnfamily, compacted_at) ); Look at CFMetadata for where the system tables are created, and SystemKeyspace for loading/storing examples.
        Hide
        Jonathan Ellis added a comment -

        Are you planning to take a stab at this, lantao yan?

        Show
        Jonathan Ellis added a comment - Are you planning to take a stab at this, lantao yan ?
        Hide
        lantao yan added a comment - - edited

        yes. I have finish the coding to support the JMX and nodetool. everything looks fine, except when running three tests, the JVM crashes. I am looking into the tests for now. Is it an urgent task? cause this is my first time to touch Cassandra, I think it will take me time to fix the JVM crash bugs. I am guessing it has something to do with the temp file "system-compaction_history-tmp-jb-14-Statistics.db". if you have encountered same issue, please let me know.

        Show
        lantao yan added a comment - - edited yes. I have finish the coding to support the JMX and nodetool. everything looks fine, except when running three tests, the JVM crashes. I am looking into the tests for now. Is it an urgent task? cause this is my first time to touch Cassandra, I think it will take me time to fix the JVM crash bugs. I am guessing it has something to do with the temp file "system-compaction_history-tmp-jb-14-Statistics.db". if you have encountered same issue, please let me know.
        Hide
        Jonathan Ellis added a comment -

        Are you running on Windows? There are known problems w/ the test suite and tmp files there.

        Show
        Jonathan Ellis added a comment - Are you running on Windows? There are known problems w/ the test suite and tmp files there.
        Hide
        lantao yan added a comment -

        unfortunately, I am using MAC. I run all tests on trunk, it is working fine; but after running against my code, JVM cashes. I am guessing, there is something wrong when some thread try to delete the temp file.

        Show
        lantao yan added a comment - unfortunately, I am using MAC. I run all tests on trunk, it is working fine; but after running against my code, JVM cashes. I am guessing, there is something wrong when some thread try to delete the temp file.
        Hide
        Brandon Williams added a comment -

        which JVM and version are you using? I would say just post the patch and we'll give it a try.

        Show
        Brandon Williams added a comment - which JVM and version are you using? I would say just post the patch and we'll give it a try.
        Hide
        lantao yan added a comment -

        i am using oracle JVM 7. How do I post the patch? push to a new branch?

        Show
        lantao yan added a comment - i am using oracle JVM 7. How do I post the patch? push to a new branch?
        Hide
        Brandon Williams added a comment -

        You can attach a diff in jira (more actions->attach files) or push a branch somewhere like your github, then post a comment with the link here.

        Show
        Brandon Williams added a comment - You can attach a diff in jira (more actions->attach files) or push a branch somewhere like your github, then post a comment with the link here.
        Hide
        lantao yan added a comment -

        done. please check the attachment. one failing test is RecoveryManager3Test.

        Show
        lantao yan added a comment - done. please check the attachment. one failing test is RecoveryManager3Test.
        Hide
        Brandon Williams added a comment -

        This diff doesn't apply cleanly for me.

        Show
        Brandon Williams added a comment - This diff doesn't apply cleanly for me.
        Hide
        lantao yan added a comment - - edited

        i attached a new patch file between my branch and the latest trunk. this one should be working. i have tried on my local machine.

        Show
        lantao yan added a comment - - edited i attached a new patch file between my branch and the latest trunk. this one should be working. i have tried on my local machine.
        Hide
        lantao yan added a comment - - edited

        @Jonathan Ellis, do we save the compaction history of column families under system keyspace? or just customer keyspaces?

        Show
        lantao yan added a comment - - edited @Jonathan Ellis, do we save the compaction history of column families under system keyspace? or just customer keyspaces?
        Hide
        Jonathan Ellis added a comment -

        We should probably exclude compaction_history itself, but including the other system tables is a good idea.

        Other comments:

        Show
        Jonathan Ellis added a comment - We should probably exclude compaction_history itself, but including the other system tables is a good idea. Other comments: Use CompositeData to expose this to JMX instead of formatted strings ( https://github.com/apache/cassandra/tree/cassandra-2.0/src/java/org/apache/cassandra/streaming/management ) Please follow the style guide at http://wiki.apache.org/cassandra/CodeStyle In general, don't extract blocks of code that are only called once into separate methods
        Hide
        Jonathan Ellis added a comment -
        Show
        Jonathan Ellis added a comment - Use CompositeData Yuki suggests http://docs.oracle.com/javase/7/docs/api/javax/management/openmbean/TabularData.html may be a better fit.
        Hide
        Jonathan Ellis added a comment -

        Fixed formatting and re-allowed compaction of _history table, while avoiding writing stats for it.

        Rest LGTM, over to Yuki on the TabularDataSupport business.

        Show
        Jonathan Ellis added a comment - Fixed formatting and re-allowed compaction of _history table, while avoiding writing stats for it. Rest LGTM, over to Yuki on the TabularDataSupport business.
        Hide
        Yuki Morishita added a comment -

        Couple of comments around JMX:

        BTW, what is the step to remove unnecessary compaction history? Is `truncate system.compaction_history` the right way to do?

        Show
        Yuki Morishita added a comment - Couple of comments around JMX: Methods should return TabularData interface instead of TabularDataSupport I'd rather create class to hold CompositeType/CompositeData like streaming does( https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/streaming/management/StreamStateCompositeData.java ). That way, we don't need to construct type object every time we invoke. Cql rows to tabular data seems to be able to generalize for future use, but I don't think we need that at this time. BTW, what is the step to remove unnecessary compaction history? Is `truncate system.compaction_history` the right way to do?
        Hide
        Jonathan Ellis added a comment -

        Should we also TTL it?

        Show
        Jonathan Ellis added a comment - Should we also TTL it?
        Hide
        lantao yan added a comment -

        very good point for the second one. do i update it or somebody else will do it?

        Show
        lantao yan added a comment - very good point for the second one. do i update it or somebody else will do it?
        Hide
        Jonathan Ellis added a comment -

        back to you

        Show
        Jonathan Ellis added a comment - back to you
        Hide
        lantao yan added a comment -

        I have moved the code to a separate class to make sure type information only be construct once.
        any example to ttl a system cf?

        Show
        lantao yan added a comment - I have moved the code to a separate class to make sure type information only be construct once. any example to ttl a system cf?
        Hide
        Jonathan Ellis added a comment -

        suggest setting a default ttl (CASSANDRA-3974) in cfmetadata creation

        Show
        Jonathan Ellis added a comment - suggest setting a default ttl ( CASSANDRA-3974 ) in cfmetadata creation
        Hide
        lantao yan added a comment - - edited

        I went through the feature 3974, I don't quite understand what I should do to add the ttl property.
        is the change by updating the cql to "UPDATE system.transaction USING TTL DEFAULT_VALUE" in class SystemKeyspace a correct fix?

        Show
        lantao yan added a comment - - edited I went through the feature 3974, I don't quite understand what I should do to add the ttl property. is the change by updating the cql to "UPDATE system.transaction USING TTL DEFAULT_VALUE" in class SystemKeyspace a correct fix?
        Hide
        Jonathan Ellis added a comment -

        CREATE TABLE ... WITH default_time_to_live=604800;

        Show
        Jonathan Ellis added a comment - CREATE TABLE ... WITH default_time_to_live=604800;
        Hide
        lantao yan added a comment -

        i attached a new patch. Please review it. Thanks.

        Show
        lantao yan added a comment - i attached a new patch. Please review it. Thanks.
        Hide
        Yuki Morishita added a comment -

        Thanks for the update.
        One more thing that worries me is we have a chance to have conflict on the entry, since we can have same timestamp among compaction threads.
        I suggest using just one timeuuid as primary key. Thoughts?

        Show
        Yuki Morishita added a comment - Thanks for the update. One more thing that worries me is we have a chance to have conflict on the entry, since we can have same timestamp among compaction threads. I suggest using just one timeuuid as primary key. Thoughts?
        Hide
        lantao yan added a comment -

        sorry for the late update. pretty busy recently. even missed Jonathan's London meetup.
        For the conflict on the entry, I don't quite get it. I read the code, it seems like one directory (one column family) will be owned by one thread? so there won't be any chances, two threads are working on the same column family with the same timestamp? I could be wrong. If I miss something, please let me know.

        Show
        lantao yan added a comment - sorry for the late update. pretty busy recently. even missed Jonathan's London meetup. For the conflict on the entry, I don't quite get it. I read the code, it seems like one directory (one column family) will be owned by one thread? so there won't be any chances, two threads are working on the same column family with the same timestamp? I could be wrong. If I miss something, please let me know.
        Hide
        Yuki Morishita added a comment -

        two threads are working on the same column family with the same timestamp?

        Compactions can run in parallel. See CompactionExecutor in CompactionManager.
        (Number of concurrency is set to number of cpu cores by default.)

        Show
        Yuki Morishita added a comment - two threads are working on the same column family with the same timestamp? Compactions can run in parallel. See CompactionExecutor in CompactionManager. (Number of concurrency is set to number of cpu cores by default.)
        Hide
        lantao yan added a comment -

        yeah, you are right. I thought it was a single thread executor.

        Show
        lantao yan added a comment - yeah, you are right. I thought it was a single thread executor.
        Hide
        Jonathan Ellis added a comment -

        lantao yan, are you still working on this?

        Show
        Jonathan Ellis added a comment - lantao yan , are you still working on this?
        Hide
        lantao yan added a comment - - edited

        last time i checked the trunk, my code is already there. I am wondering what has happened. i will double check that.

        Show
        lantao yan added a comment - - edited last time i checked the trunk, my code is already there. I am wondering what has happened. i will double check that.
        Hide
        Jonathan Ellis added a comment -

        Hmm, if so this ticket did not get mentioned in a commit message or in CHANGES.

        Show
        Jonathan Ellis added a comment - Hmm, if so this ticket did not get mentioned in a commit message or in CHANGES.
        Hide
        lantao yan added a comment - - edited

        sorry, i think i was wrong. I will read the code and consider Yuki's suggestion. it is still a little bit weird to me if two threads can work on the same column family, which is only concern left until now.

        Show
        lantao yan added a comment - - edited sorry, i think i was wrong. I will read the code and consider Yuki's suggestion. it is still a little bit weird to me if two threads can work on the same column family, which is only concern left until now.
        Hide
        lantao yan added a comment -

        Yuki Morishita, Jonathan Ellis, I checked that code, although two threads can work one the same column family. but in case of one thread is compacting, another thread would fail? I was worrying if two threads compacting the same set of files, we could have risk condition? given that, this problem is not actually changing the primary key.

        ColumnFamilyStore.java
        assert data.getCompacting().isEmpty() : data.getCompacting();

        correct me if I am wrong. I will keep reading the code. thanks.

        Show
        lantao yan added a comment - Yuki Morishita , Jonathan Ellis , I checked that code, although two threads can work one the same column family. but in case of one thread is compacting, another thread would fail? I was worrying if two threads compacting the same set of files, we could have risk condition? given that, this problem is not actually changing the primary key. ColumnFamilyStore.java assert data.getCompacting().isEmpty() : data.getCompacting(); correct me if I am wrong. I will keep reading the code. thanks.
        Hide
        Yuki Morishita added a comment -

        Size-tiered/leveled compaction choose files that aren't compacting when determining candidates, and try marking acquired candidates as compacting. If marking fails, then compaction would be skipped(and go on to find next candidates).
        So two or more threads can grab different set of SSTables inside the same ColumnFamily.

        See, for example https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L57 and https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/DataTracker.java#L188.

        Show
        Yuki Morishita added a comment - Size-tiered/leveled compaction choose files that aren't compacting when determining candidates, and try marking acquired candidates as compacting. If marking fails, then compaction would be skipped(and go on to find next candidates). So two or more threads can grab different set of SSTables inside the same ColumnFamily. See, for example https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L57 and https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/DataTracker.java#L188 .
        Hide
        lantao yan added a comment -

        Many thanks for the sharing! yes, I was wrong. Now I have better understanding. as you said, two threads can work on collections of sstables inside the same ks.cf.
        I have no doubt now. One more question, is UUIDGen.getTimeUUID().toString() a correct way to generate the timeuuid as you suggest?

        Show
        lantao yan added a comment - Many thanks for the sharing! yes, I was wrong. Now I have better understanding. as you said, two threads can work on collections of sstables inside the same ks.cf. I have no doubt now. One more question, is UUIDGen.getTimeUUID().toString() a correct way to generate the timeuuid as you suggest?
        Hide
        Jonathan Ellis added a comment -

        Yes, but why turn it into a String?

        Show
        Jonathan Ellis added a comment - Yes, but why turn it into a String?
        Hide
        lantao yan added a comment -

        new patch submitted.

        Show
        lantao yan added a comment - new patch submitted.
        Hide
        Yuki Morishita added a comment -

        Thanks lantao!
        Committed for next 2.0.2 release.

        Show
        Yuki Morishita added a comment - Thanks lantao! Committed for next 2.0.2 release.
        Hide
        lantao yan added a comment -

        Many thanks! So glad to see my contribution!

        Show
        lantao yan added a comment - Many thanks! So glad to see my contribution!

          People

          • Assignee:
            lantao yan
            Reporter:
            Matthew F. Dennis
            Reviewer:
            Yuki Morishita
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development