Apache Gora / GORA-22

Upgrade cassandra backend to cassandra 0.7

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2
    • Component/s: storage
    • Labels:
      None
    1. cassandraHashmap.patch
      2 kB
      Alexis
    2. gora-cassandra-mapping.xml
      2 kB
      Alexis
    3. goraCassandra.patch
      107 kB
      Alexis
    4. gora-cassandra-mapping.xml
      3 kB
      Alexis
    5. gora.patch
      99 kB
      Alexis

        Activity

        Alexis added a comment -

        The patch ports gora-cassandra to Cassandra 0.7. It uses Hector as the Cassandra client.

        The necessary dependencies are:

        • apache-cassandra 0.7
        • libthrift 0.5
        • hector 0.7.0-23
        • perf4j 0.9.14
        • high-scale-lib 1.0

        These can be found in the default Maven repository, but I'm not sure how to include them in Ivy.

        Enis Soztutar added a comment -

        You can add the dependencies to ivy.xml just like the other dependencies. mvnrepository.org also lists the Ivy snippet for each dependency. Ivy checks the Maven Central repository.

        Julien Nioche added a comment -

        I think Cassandra 0.7 allows the schema to be defined via the API, whereas it had to be done manually before. Alexis, does your patch support this? It would be a nice feature.
        Enis: can you describe what the problem is with the current implementation of the Cassandra backend? I remember it was related to multi-threading, but it would be great if you could give more details so that we can be sure that the new implementation does not suffer from the same issues.

        Alexis added a comment -

        Dear Julien,

        Currently, you have to manually create the schema via the Cassandra shell. See http://techvineyard.blogspot.com/2010/12/build-nutch-20.html#Cassandra

        It might be feasible though: https://github.com/zznate/hector-examples/blob/master/src/main/java/com/riptano/cassandra/hector/example/SchemaManipulation.java

        Dear Enis,

        I'm struggling with Ivy! This is the Ivy tag suggested by http://mvnrepository.com/artifact/com.github.stephenc.hector/hector/0.7.0-rc4-1

        <dependency org="com.github.stephenc.hector" name="hector" rev="0.7.0-rc4-1" />

        When compiling the gora-cassandra module with ant+ivy, the jar does not seem to be copied to the gora-cassandra/lib directory:

        compile:
        [javac] Compiling 10 source files to /home/alex/java/workspace/Gora/gora-cassandra/build/classes
        [javac] /home/alex/java/workspace/Gora/gora-cassandra/src/main/java/org/apache/gora/cassandra/query/CassandraSubColumn.java:9: package me.prettyprint.hector.api.beans does not exist
        [javac] import me.prettyprint.hector.api.beans.HColumn;

        The Maven dependency works fine in Eclipse:

        <dependency>
        <groupId>me.prettyprint</groupId>
        <artifactId>hector</artifactId>
        <version>0.7.0-23</version>
        <type>pom</type>
        <scope>compile</scope>
        </dependency>

        and the jar is downloaded to ~/.m2/repository/me/prettyprint/hector/0.7.0-20

        Enis Soztutar added a comment -

        Alexis, I think the issue is that you use different versions of the jar in Maven and Ivy. As you can see, even the groupIds are different. The corresponding attributes in Ivy are groupId => org, artifactId => name, version => rev, and scope => conf. You can use every Maven dependency in exactly the same configuration in Ivy.

        So if the above Maven dependency works, you can try:
        <dependency org="me.prettyprint" name="hector" rev="0.7.0-23" />

        Julien, the issue with current Cassandra client is that multithreaded execution is causing some weird exceptions, but we weren't able to dig down much further. Maybe with this patch we can test more.
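The groupId => org, artifactId => name, version => rev mapping described above can be sketched as a small helper. This is only an illustration: the class and method names are made up for this example and are not part of Gora, Ivy, or Maven. The scope => conf part is deliberately left out, since conf names depend on the configurations declared in the project's own ivy.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper (hypothetical, not part of Gora or Ivy). */
final class MavenToIvy {

    /**
     * Renders Maven coordinates as an Ivy dependency tag using the mapping
     * groupId -> org, artifactId -> name, version -> rev.
     * Values are assumed to need no XML escaping.
     */
    static String toIvyTag(String groupId, String artifactId, String version) {
        // LinkedHashMap keeps the attributes in insertion order.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("org", groupId);
        attrs.put("name", artifactId);
        attrs.put("rev", version);

        StringBuilder tag = new StringBuilder("<dependency");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            tag.append(' ').append(e.getKey())
               .append("=\"").append(e.getValue()).append('"');
        }
        return tag.append(" />").toString();
    }

    public static void main(String[] args) {
        // Coordinates taken from the Maven snippet quoted earlier in the thread.
        System.out.println(toIvyTag("me.prettyprint", "hector", "0.7.0-23"));
    }
}
```

Run with the coordinates from the Maven snippet, this prints the same tag suggested in the comment above.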

        Alexis added a comment -

        Please see change set here: http://svn.apache.org/viewvc?view=revision&revision=1149420

        • The Gora backend is compatible with Cassandra 0.8. It uses Hector 0.8 client to commit changes to the NoSQL database.
        • The schema is automatically created when the datastore object is loaded, if the keyspace does not already exist.

        See the new patch and the new Gora mapping configuration. The XML configuration shows the supported format for defining your schema, i.e. how your Avro objects are persisted into Cassandra columns and super columns.
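The "create the schema on load unless the keyspace already exists" behaviour described above follows a check-then-create pattern. The sketch below shows only that pattern. The SchemaAdmin interface, the in-memory implementation, and the keyspace and column family names are all hypothetical stand-ins, not the Hector or Gora API and not the code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a cluster's schema API (NOT Hector's or Gora's). */
interface SchemaAdmin {
    boolean keyspaceExists(String keyspace);
    void createKeyspace(String keyspace, List<String> columnFamilies);
}

/** In-memory implementation so the sketch runs without a Cassandra node. */
final class InMemorySchemaAdmin implements SchemaAdmin {
    final Map<String, List<String>> keyspaces = new HashMap<>();

    public boolean keyspaceExists(String keyspace) {
        return keyspaces.containsKey(keyspace);
    }

    public void createKeyspace(String keyspace, List<String> columnFamilies) {
        keyspaces.put(keyspace, new ArrayList<>(columnFamilies));
    }
}

final class SchemaBootstrap {

    /** Creates the keyspace only if it is missing; returns true if it was created. */
    static boolean ensureKeyspace(SchemaAdmin admin, String keyspace,
                                  List<String> columnFamilies) {
        if (admin.keyspaceExists(keyspace)) {
            return false; // leave an existing schema untouched
        }
        admin.createKeyspace(keyspace, columnFamilies);
        return true;
    }

    public static void main(String[] args) {
        InMemorySchemaAdmin admin = new InMemorySchemaAdmin();
        // "webpage", "f" and "sc" are made-up names for illustration.
        List<String> families = Arrays.asList("f", "sc");
        System.out.println(ensureKeyspace(admin, "webpage", families)); // first load
        System.out.println(ensureKeyspace(admin, "webpage", families)); // later load
    }
}
```

A real datastore would run the equivalent check against the cluster when it is initialised, and an existing keyspace is left as it is rather than being altered.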

        Julien Nioche added a comment -

        Alexis, it is a commonly accepted practice at Apache that patches are peer reviewed before being committed. I don't think anyone has had the time to look at your code before you replaced the existing version with yours and I think it is definitely wrong. Please refrain from doing so in the future, thanks.

        Chris A. Mattmann added a comment -

        "Alexis, it is a commonly accepted practice at Apache that patches are peer reviewed before being committed"

        That's not strictly true. It depends on what policy the project is using. Apache supports both Commit-then-Review (CTR) as well as Review-then-Commit (RTC) models. Each has their own advantages and outcomes:

        http://www.apache.org/foundation/glossary.html#CommitThenReview
        http://www.apache.org/foundation/glossary.html#ReviewThenCommit

        My personal rule of thumb is that if I'm doing some quick, easy bug fix, or some new feature, etc., then simply committing it and then asking for feedback is fine. As devs/committers, we have been granted the commit bit, and we're using SVN, so things can change as needed. On the other hand, if I'm doing something larger, and I want feedback from folks before I put it in SVN (maybe I anticipate not having time to modify it much after that point, or I just feel that getting it right first really matters), then I'll enter into RTC mode and ask for feedback via patches.

        Either way is fine, really.

        Chris A. Mattmann added a comment -

        One note though: Alexis, it seems like you've replaced or extended the Cassandra backend. My impression was that we already had one and that it was functioning. What does your patch improve or add? It would be great to explain for the benefit of others watching.

        Alexis added a comment -

        "Solves" the multithreaded issue with Gora Cassandra.

        Lewis John McGibbney added a comment -

        Hoping to come round to this in due course. I will attempt to test and will post the results. Thank you.

        Chris A. Mattmann added a comment -
        • push to 0.3
        Chris A. Mattmann added a comment -
        • schedule
        Lewis John McGibbney added a comment -

        As this is still open, I can't help but think that we should be somewhere closer to the 1.0 release of Cassandra, which was recently announced. Does anyone have any opinions on where this ticket is going? Also, as far as I can see, the multithreaded issue should now be resolved, as Alexis' commit still stands strong.

        Lewis John McGibbney added a comment -

        Hi. I have set this issue to block further progression on the gora-cassandra module, e.g. upgrading to 1.0.2, writing tests for the full module, documentation etc. It would be great if we could bottom this one out. Any comments?

        Lewis John McGibbney added a comment -

        Hi guys, feels a bit like Groundhog Day here :0) It would be great if someone who has been working on this issue could review it and close it if it has been sorted out. Thank you.

        Henry Saputra added a comment -

        The code was checked in by Alexis, AFAIK. Need to check the svn history to confirm this.

        Lewis John McGibbney added a comment -

        Hi Henry. Yeah, if you have a look here [1] you will see that he definitely checked it in. I just didn't think that it was my duty to close this one off before we had agreed that the initial issue highlighted by Julien had been addressed.

        [1] http://svn.apache.org/viewvc/incubator/gora/trunk/gora-cassandra/

        Lewis John McGibbney added a comment -

        We've been aware for some time that this code has been checked in, and there have been no objections to closing the issue off. I'll add Julien's and Alexis' credits to CHANGES.txt and duly close this one out.

        Lewis John McGibbney added a comment -

        I am against closing issues that are not mine (sorry Julien), but I'm firmly of the opinion that this old dog needs to be put to bed.


          People

          • Assignee:
            Unassigned
            Reporter:
            Julien Nioche