Tuscany / TUSCANY-3522

[GSoC 2011] Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future Ideas
    • Component/s: None
    • Labels:

      Description

      NoSQL Datastore component
      =========================

      Write a portable data store component over a number of 'NoSQL' databases (Apache Cassandra, CouchDB, Hadoop/HBase, and the App Engine Datastore).

      This could be one component (written in Python or Java) or a set of components (one per database) all implementing the same REST data store interface, allowing applications to store data in different NoSQL databases without having to worry about the details and API differences between the databases.

      The project could start with just one or two databases and add more databases as we go. This should be a really good opportunity for students to experiment with these new NoSQL databases.
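
The idea of several components sharing one data store interface can be sketched roughly as follows. This is only an illustration, not code from the project: the `DataStore` interface, its method names, the `InMemoryStore` stand-in backend, and `save_twit` are all assumptions made for the example. A real backend would wrap a Cassandra, CouchDB, or HBase client behind the same methods.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DataStore(ABC):
    """Hypothetical database-neutral interface; the names are illustrative only."""

    @abstractmethod
    def put(self, group: str, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, group: str, key: str) -> Optional[str]: ...

    @abstractmethod
    def delete(self, group: str, key: str) -> None: ...


class InMemoryStore(DataStore):
    """Stand-in backend that keeps everything in a dict (assumption for the sketch)."""

    def __init__(self) -> None:
        # group name -> key -> value
        self._groups: Dict[str, Dict[str, str]] = {}

    def put(self, group: str, key: str, value: str) -> None:
        self._groups.setdefault(group, {})[key] = value

    def get(self, group: str, key: str) -> Optional[str]:
        return self._groups.get(group, {}).get(key)

    def delete(self, group: str, key: str) -> None:
        # Deleting a missing key is treated as a no-op in this sketch.
        self._groups.get(group, {}).pop(key, None)


def save_twit(store: DataStore, user: str, text: str) -> None:
    # Application code depends only on the interface, not on a specific database.
    store.put("twits", user, text)
```

Swapping databases would then mean passing a different `DataStore` implementation to `save_twit`, with no change to the application code.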

      Resources:

      Tuscany
      http://tuscany.apache.org/

      Cassandra
      http://cassandra.apache.org/

      CouchDB
      http://couchdb.apache.org/

      Hadoop/HBase
      http://hadoop.apache.org/hbase/

      App Engine Datastore
      http://code.google.com/appengine/docs/python/datastore/

      1. rest.patch
        28 kB
        Eranda Sooriyabandara
      2. cassandra.zip
        86 kB
        Eranda Sooriyabandara
      3. cassandra.zip
        83 kB
        Eranda Sooriyabandara
      4. cassandra.zip
        79 kB
        Eranda Sooriyabandara
      5. hbase.fix.patch
        31 kB
        Eranda Sooriyabandara
      6. couchdb.fix.patch
        30 kB
        Eranda Sooriyabandara
      7. cassandra.fix.patch
        27 kB
        Eranda Sooriyabandara
      8. cassandra.fix.patch
        27 kB
        Eranda Sooriyabandara
      9. alheaders.patch
        12 kB
        Eranda Sooriyabandara
      10. HBase-API.patch
        19 kB
        Eranda Sooriyabandara
      11. fix.patch
        15 kB
        Eranda Sooriyabandara
      12. couchdb-api.path
        17 kB
        Eranda Sooriyabandara
      13. rest-api-1.2.patch
        18 kB
        Eranda Sooriyabandara
      14. rest-api.patch
        10 kB
        Eranda Sooriyabandara
      15. twitapp.tar.gz
        5 kB
        Eranda Sooriyabandara

        Activity

        Jean-Sebastien Delfino created issue -
        Jean-Sebastien Delfino made changes -
        Field: Labels
        Original Value: gsoc gsoc2010 mentor
        New Value: gsoc gsoc2011 mentor
        Jean-Sebastien Delfino made changes -
        Field: Summary
        Original Value: Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase
        New Value: [GSoC 2011] Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase
        Eranda Sooriyabandara added a comment -

        I am interested in this issue and have started writing my project proposal here: https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component

        kinjal brahmbhatt added a comment -

        I am interested in this issue; please allow me to contribute to it. Thanks.

        Eranda Sooriyabandara added a comment -

        Here is the work I have done so far:
        1. Read about SCA.
        2. Set up and checked out samples for some NoSQL databases. I used the Hector client for Apache Cassandra and the couchdb4j client (which uses JSON) for Apache CouchDB. I think we can use some of these clients for our implementations; we need to decide which ones to use.

        There are some topics we still need to discuss:
        1. The sample scenario we are going to implement over the various databases. To what extent are we going to implement its functionality?
        2. The REST API. We can discuss this after finalizing the scenario.

        Once those are decided, we can build the database-independent parts (the REST API) as an SCA component and mock the database access.

        Jean-Sebastien Delfino made changes -
        Assignee Eranda Sooriyabandara [ eranda ]
        Eranda Sooriyabandara added a comment -

        I am attaching my basic twitapp program.
        TODO list:
        1. Create an SCA component for the twitapp
        2. Create sample code to save, retrieve, and modify data in NoSQL databases

        Eranda Sooriyabandara made changes -
        Attachment twitapp.tar.gz [ 12479163 ]
        ant elder added a comment -

        Hi Eranda, I've created a folder for you in the Tuscany sandbox collaboration area as a place for you to put anything related to your GSoC project and I've committed the twitapp code there - https://svn.apache.org/repos/asf/tuscany/collaboration/GSoC-2011-Eranda/

        Eranda Sooriyabandara added a comment -

        Hi Ant,
        Thanks for creating a workspace for me. I'll commit all my work there.

        Eranda Sooriyabandara added a comment - - edited

        I am attaching a patch of the sample program we are going to use in the REST API SCA component. So far I have implemented it for Apache Cassandra.
        Please let me know what you think.
        Thanks

        Eranda Sooriyabandara made changes -
        Attachment rest-api.patch [ 12480203 ]
        Eranda Sooriyabandara added a comment -

        This patch represents the basic functionality of the REST API:
        Create and manage sessions
        Add database
        Add group
        (A group is a common structure that represents a column family in Apache Cassandra or a document in Apache CouchDB.)
        Add record
        Modify record
        Delete record
        Delete group
        Delete database
        Please let me know what you think.
        Thanks
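
The operations listed in this comment suggest a database, group, record hierarchy. A minimal in-memory sketch of that shape is shown below. It is not the code from the attached patch: the class name, method names, and error behavior are assumptions for illustration, session handling is left out, and `get_record` is added only so the example can be checked.

```python
class DatastoreModel:
    """In-memory stand-in for the listed operations (hypothetical names)."""

    def __init__(self):
        # database name -> group name -> record key -> value
        self._dbs = {}

    def add_database(self, db):
        self._dbs.setdefault(db, {})

    def add_group(self, db, group):
        # A group stands for a column family (Cassandra) or a document (CouchDB).
        self._dbs[db].setdefault(group, {})

    def add_record(self, db, group, key, value):
        records = self._dbs[db][group]
        if key in records:
            raise KeyError(f"record {key!r} already exists")
        records[key] = value

    def modify_record(self, db, group, key, value):
        records = self._dbs[db][group]
        if key not in records:
            raise KeyError(f"record {key!r} does not exist")
        records[key] = value

    def get_record(self, db, group, key):
        # Read helper, not part of the listed operations.
        return self._dbs[db][group].get(key)

    def delete_record(self, db, group, key):
        del self._dbs[db][group][key]

    def delete_group(self, db, group):
        del self._dbs[db][group]

    def delete_database(self, db):
        del self._dbs[db]
```

Each method maps one-to-one onto an item in the list, so a REST layer could route one request type to each method.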

        Eranda Sooriyabandara made changes -
        Attachment rest-api-1.2.patch [ 12480630 ]
        ant elder added a comment -

        I've applied the rest-api-1.2.patch at https://svn.apache.org/repos/asf/tuscany/collaboration/GSoC-2011-Eranda/cassandra. Sorry it took so long to get applied.

        Eranda Sooriyabandara added a comment -

        I implemented the CouchDB API for the CouchDB SCA component. Please let me know what you think of this patch.
        Thanks

        Eranda Sooriyabandara made changes -
        Attachment couchdb-api.path [ 12482152 ]
        Florian Moga added a comment -

        I've applied the couchdb-api.path at https://svn.apache.org/repos/asf/tuscany/collaboration/GSoC-2011-Eranda/couchdb/.

        A few thoughts for future patches regarding our way of work:
        1. I've noticed you started adding license headers to files. Could you apply them to all the modules you've created?
        2. We usually don't commit Eclipse files (.project, .classpath, .settings), as they might contain absolute paths to files in your local environment and can easily be generated by Maven. (You might find it useful to add svn:ignore entries for them.)
        3. We usually don't use author/date/revision headers. SVN handles all this information automatically.

        Otherwise, things are looking good.

        Eranda Sooriyabandara added a comment -

        Hi Florian,
        Thanks for the suggestions. I made the changes you suggested and am attaching them as a patch.

        Eranda Sooriyabandara made changes -
        Attachment fix.patch [ 12482317 ]
        Florian Moga added a comment -

        Excellent! I've applied your patch in your working space.

        One more thing, could you provide a build script (Maven might be most convenient) for the twitapp module? It would help building and importing the project into IDEs.

        Eranda Sooriyabandara added a comment -

        I am attaching the HBase API patch. Please let me know what you think.

        Eranda Sooriyabandara made changes -
        Attachment HBase-API.patch [ 12482978 ]
        Florian Moga added a comment -

        Patch applied at https://svn.apache.org/repos/asf/tuscany/collaboration/GSoC-2011-Eranda/hbase/.

        Eranda Sooriyabandara added a comment -

        This patch contains the missing Apache License headers.

        Eranda Sooriyabandara made changes -
        Attachment alheaders.patch [ 12483142 ]
        Florian Moga added a comment -

        Patch applied at https://svn.apache.org/repos/asf/tuscany/collaboration/GSoC-2011-Eranda

        Eranda Sooriyabandara added a comment -

        I fixed the Cassandra datastore according to Jean-Sebastien's suggestion and am attaching it here. I will also commit it to the trunk.

        Eranda Sooriyabandara made changes -
        Attachment cassandra.fix.patch [ 12485638 ]
        Eranda Sooriyabandara added a comment -

        Here are the latest fixes for the three database API implementations. I will commit them to SVN soon. Please let me know what you think.

        Eranda Sooriyabandara made changes -
        Attachment cassandra.fix.patch [ 12485764 ]
        Attachment couchdb.fix.patch [ 12485765 ]
        Attachment hbase.fix.patch [ 12485766 ]
        Eranda Sooriyabandara added a comment -

        I have put together an implementation of the Cassandra SCA component, with some changes to the original code, but I have not yet set the property in the components. Please let me know whether I am heading in the right direction.
        Thanks

        Eranda Sooriyabandara made changes -
        Attachment cassandra.zip [ 12488078 ]
        Eranda Sooriyabandara added a comment -

        Hi Jean-Sebastien,
        Here is the code I have at the moment.

        Eranda Sooriyabandara made changes -
        Attachment cassandra.zip [ 12489614 ]
        Eranda Sooriyabandara added a comment -

        I implemented a sample datastore component for Apache Cassandra, adding a service interface and its implementation class. My source is attached.
        I tested it with a REST call (I used http://localhost:8085/DatastoreService/addEntry?database=TwitApp&group=twits&key=eranda&value=mahesh to call the service), but it does not seem to be working: when I checked the database, I found no changes. Please help me understand the reason for this behavior.
        Thanks
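
When testing a call like the one in this comment, it can help to build the URL programmatically so that every query parameter is encoded and can be checked before the request is sent. The sketch below does that with the Python standard library. The base URL and parameter names are taken from the comment; the helper name `build_add_entry_url` is an assumption. It does not diagnose the problem reported above. It only rules out a malformed query string, for example one cut short at an unquoted `&` when pasted into a shell.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as quoted in the comment; adjust host and port for your own setup.
BASE = "http://localhost:8085/DatastoreService/addEntry"


def build_add_entry_url(database, group, key, value):
    """Return the addEntry URL with all query parameters percent-encoded."""
    query = urlencode(
        {"database": database, "group": group, "key": key, "value": value}
    )
    return f"{BASE}?{query}"


url = build_add_entry_url("TwitApp", "twits", "eranda", "mahesh")

# Parse the query back to confirm which parameters the service would receive.
params = parse_qs(urlsplit(url).query)
```

With a service running locally, the URL could then be requested with `urllib.request.urlopen(url)` and the response status inspected.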

        Eranda Sooriyabandara made changes -
        Attachment cassandra.zip [ 12490215 ]
        Eranda Sooriyabandara added a comment -

        I am attaching my latest update as a patch, in which I create REST services from the datastores built on Apache Cassandra, Apache CouchDB, and Apache Hadoop/HBase. Please let me know what you think of the service implementation.

        Eranda Sooriyabandara made changes -
        Attachment rest.patch [ 12490994 ]
        ant elder made changes -
        Fix Version/s Future Ideas [ 12317619 ]

          People

          • Assignee: Eranda Sooriyabandara
          • Reporter: Jean-Sebastien Delfino
          • Votes: 0
          • Watchers: 0
