
Key: JCR-1509
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Jukka Zitting
Reporter: Alexander Klimetschek
Votes: 0
Watchers: 1
Jackrabbit Content Repository

[SUBMISSION] Amazon S3 Persistence Manager Project

Created: 31/Mar/08 09:06 AM   Updated: 29/Apr/09 10:21 AM
Component/s: sandbox
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
  jackrabbit-amazon.zip (Zip Archive, 49 kB), attached 2008-03-31 09:08 AM by Alexander Klimetschek. Licensed for inclusion in ASF works.
Issue Links:
Reference
 

Resolution Date: 03/Sep/08 03:32 PM


Description
As I noted previously on the dev list (http://markmail.org/search/?q=amazon+list%3Aorg.apache.jackrabbit.dev#query:amazon%20list%3Aorg.apache.jackrabbit.dev+page:1+mid:qw27gopsn4lnbde5+state:results), I have written an Amazon S3 bundle persistence manager for Jackrabbit. I would like to submit the code for the sandbox; the full source is included in the attached zip file, which is licensed for inclusion in ASF works.

The project also aims to implement a normal persistence manager (which I abandoned in favor of the more efficient bundle PM; the bundle PM is implemented but does not work 100%), a file system implementation for S3 (only a rough structure is present), and an SPI implementation that connects to S3 (dreaming ;-)). For more information, here is the project's README.txt:

===================================================================
Welcome to Jackrabbit persistence for Amazon Web Services (i.e. S3)
===================================================================

This module contains various persistence options for using
Amazon Web Services as a backend for Jackrabbit / JCR. Amazon has
two persistence services: S3 (public) and SimpleDB (still beta).
The following options are available or planned:

- (1) persistence managers that connect to S3
      (normal + bundle, in progress, probably not very efficient)
      
- (2) persistence manager that connects to SimpleDB
      (NOT feasible)
      
- (3) SPI implementation that connects to S3
      (not implemented, very complicated, probably more efficient)
      
See details below and also TODO.txt


Installing / Testing
====================

This needs a patched Jackrabbit 1.3.x version. The patches can
be found in the directory "patches-for-1.3". One patch modifies
the pom of jackrabbit-core to generate the jackrabbit test jar
for reuse in this project. To build that customized version, you need
to do the following steps:

1) svn co http://svn.apache.org/repos/asf/jackrabbit/branches/1.3 jackrabbit-1.3
2) cd jackrabbit-1.3
3) apply all patches from the "patches-for-1.3" directory:
   patch -p0 < %JR-AMAZON-PATH%/patches-for-1.3/%PATCH%.patch
4) mvn install
5) cd %JR-AMAZON-PATH%
6) change jackrabbit version number in pom.xml to the one you just built
   (eg. project/parent/version = 1.3.4)
7) cp aws.properties.template aws.properties
8) enter your credentials in aws.properties
9) mvn test

For debugging, you can change the logging in applications/test/log4j.properties
and set up proxying (for monitoring the traffic with e.g. tcpmon) in
applications/test/jets3t.properties.


Details about Implementations
=============================

(1) org.apache.jackrabbit.persistence.amazon.AmazonS3PersistenceManager

http://www.amazon.com/s3

Stores JCR Nodes and Properties inside S3 Objects. Uses UUID for Nodes and
UUID/Name for Properties as Object names. Node references are stored
via references/UUID.
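
The naming scheme above can be sketched roughly as follows. This is
only an illustration, not the submitted code: the class S3ObjectNames
and its method names are hypothetical, and how the real implementation
encodes property names (namespaces, special characters) is not stated
here, so the sketch uses the name as given.

```java
import java.util.UUID;

/**
 * Hypothetical helper (not part of the submission) that builds S3 object
 * names following the scheme described in the README.
 */
final class S3ObjectNames {

    private S3ObjectNames() {
    }

    /** A node is stored under its UUID. */
    static String nodeKey(UUID nodeId) {
        return nodeId.toString();
    }

    /** A property is stored under UUID/Name, using the UUID of its parent node. */
    static String propertyKey(UUID parentId, String propertyName) {
        // Assumption: the property name is used unencoded.
        return parentId + "/" + propertyName;
    }

    /** References to a node are stored under references/UUID. */
    static String referencesKey(UUID targetId) {
        return "references/" + targetId;
    }
}
```

With the objectPrefix parameter described below, each of these names
would presumably be prefixed once more.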

Configuration parameters:

accessKey
    Amazon AWS access key (aka account user id) [required]

secretKey
    Amazon AWS secret key (aka account password) [required]
    
bucket
    Name of the S3 bucket to use [optional, default uses accessKey]
    Note that bucket names are global, so using your accessKey is
    recommended to prevent conflicts with other AWS users.
    
objectPrefix
    Prefix used for all object names [optional, default is ""]
    Should include the workspace name ("${wsp.name}" or "version" for
    the versioning PM) to put multiple workspaces into one bucket.

Example XML Config:

<PersistenceManager class="org.apache.jackrabbit.persistence.amazon.AmazonS3PersistenceManager">
    <param name="accessKey" value="abcde01234"/>
    <param name="secretKey" value="topsecret"/>
    <param name="bucket" value="abcde01234.jcrstore"/>
    <param name="objectPrefix" value="${wsp.name}/"/>
</PersistenceManager>
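
As a rough sketch of how the defaults described above could be
resolved: the class S3PmSettings and the method expandWorkspace are
hypothetical names chosen for this example. In Jackrabbit the
"${wsp.name}" variable is substituted by the configuration parser, so
the replace call below only illustrates the effect.

```java
/**
 * Hypothetical settings holder (not part of the submission) applying the
 * required/optional rules listed for the four configuration parameters.
 */
final class S3PmSettings {

    final String accessKey;
    final String secretKey;
    final String bucket;
    final String objectPrefix;

    S3PmSettings(String accessKey, String secretKey, String bucket, String objectPrefix) {
        if (accessKey == null || accessKey.isEmpty()) {
            throw new IllegalArgumentException("accessKey is required");
        }
        if (secretKey == null || secretKey.isEmpty()) {
            throw new IllegalArgumentException("secretKey is required");
        }
        this.accessKey = accessKey;
        this.secretKey = secretKey;
        // bucket is optional and defaults to the access key
        this.bucket = (bucket == null || bucket.isEmpty()) ? accessKey : bucket;
        // objectPrefix is optional and defaults to ""
        this.objectPrefix = (objectPrefix == null) ? "" : objectPrefix;
    }

    /** Illustrates what "${wsp.name}/" becomes for a given workspace. */
    static String expandWorkspace(String prefix, String workspaceName) {
        return prefix.replace("${wsp.name}", workspaceName);
    }
}
```

For example, expandWorkspace("${wsp.name}/", "default") yields
"default/", so the objects of each workspace end up under their own
prefix within one bucket.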

-----

(2) AmazonSimpleDBPersistenceManager

This is *not* feasible because of the restrictions that SimpleDB
imposes. An item can have at most 256 attributes, and each attribute
can only hold a string value of at most 1024 characters.
See this link for more information:

http://docs.amazonwebservices.com/AmazonSimpleDB/2007-11-07/DeveloperGuide/SDB_API_PutAttributes.html
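
To make the restriction concrete, here is a small sketch that checks
whether a set of string values would fit into one SimpleDB item, using
only the two limits quoted above. The class SimpleDbLimits and the
assumption of one attribute per property are made up for this example;
real JCR content (binaries, long strings, many multi-valued properties)
would exceed these limits quickly.

```java
import java.util.Map;

/**
 * Hypothetical check (not part of the submission) against the two
 * SimpleDB limits quoted in the README.
 */
final class SimpleDbLimits {

    static final int MAX_ATTRIBUTES_PER_ITEM = 256;
    static final int MAX_VALUE_CHARS = 1024;

    private SimpleDbLimits() {
    }

    /**
     * Returns true if every value is short enough and the number of
     * attributes stays within the per-item limit.
     * Assumption: each property maps to exactly one attribute.
     */
    static boolean fitsInOneItem(Map<String, String> properties) {
        if (properties.size() > MAX_ATTRIBUTES_PER_ITEM) {
            return false;
        }
        for (String value : properties.values()) {
            if (value.length() > MAX_VALUE_CHARS) {
                return false;
            }
        }
        return true;
    }
}
```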

-----

(3) org.apache.jackrabbit.spi2s3

TODO

lots of work...


About
=====

This module was originally written by Alexander Klimetschek
(alexander.klimetschek at googlemail dot com) in 2008.

See the Apache Jackrabbit web site (http://jackrabbit.apache.org/)
for documentation and other information. You are welcome to join the
Jackrabbit mailing lists (http://jackrabbit.apache.org/mail-lists.html)
to discuss this component and to use the Jackrabbit issue tracker
(http://issues.apache.org/jira/browse/JCR) to report issues or request
new features.

Apache Jackrabbit is a project of the Apache Software Foundation
(http://www.apache.org).


Alexander Klimetschek added a comment - 31/Mar/08 09:08 AM
This is the current state. The major TODO:

- fix S3 PM/Bundle PM (pass all JCR API tests):
  Problem: HTTP calls to S3 occasionally hang forever. They need a timeout and a retry,
  but I couldn't quickly figure out how to add them (this requires tweaking the jets3t
  library and the Apache Commons HttpClient underneath it)

See also TODO.txt included in the zip.
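
One generic way to get a timeout plus retry around a blocking call is sketched below. It is not the fix that was applied and it does not use any jets3t API: the class TimedRetry and its parameters are hypothetical. It relies only on java.util.concurrent. Note that a thread blocked in socket I/O may ignore the interrupt, so a real fix would probably still need socket timeouts configured in the HTTP client.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a request with a per-attempt timeout and retries. */
final class TimedRetry {

    private TimedRetry() {
    }

    static <T> T call(Callable<T> request, int maxAttempts, long timeoutMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(request);
                try {
                    // Wait for the result, but only up to the timeout.
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the hung attempt
                    last = e;
                } catch (ExecutionException e) {
                    last = e; // the request itself failed; try again
                }
            }
            throw last; // all attempts timed out or failed
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap each S3 request in a Callable, for example TimedRetry.call(request, 3, 30000) for three attempts of 30 seconds each. Retrying is only safe for requests that are idempotent.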

Alexander Klimetschek made changes - 31/Mar/08 09:08 AM
Field Original Value New Value
Attachment jackrabbit-amazon.zip [ 12378934 ]
Marc made changes - 25/Aug/08 03:23 PM
Link This issue is related to JCR-1724 [ JCR-1724 ]
Repository Revision Date User Message
ASF #691635 Wed Sep 03 15:29:07 UTC 2008 jukka JCR-1509: [SUBMISSION] Amazon S3 Persistence Manager Project

Committed submission by Alexander Klimetschek.
Files Changed
ADD /jackrabbit/sandbox/jackrabbit-amazon/NOTICE.txt
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit/test/amazon/tck/TestAll.java
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repository/nodetypes/custom_nodetypes.xml.install
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/workspaces/default
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/persistence/amazon/AmazonS3BundlePersistenceManager.java
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repository/nodetypes
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repository.xml
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repository/namespaces/ns_reg.properties.install
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/workspaces/test
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/fs/amazon/AmazonS3FileSystem.java
ADD /jackrabbit/sandbox/jackrabbit-amazon/pom.xml
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit/test/amazon/init/TestAll.java
ADD /jackrabbit/sandbox/jackrabbit-amazon/patches-for-1.3/jackrabbit-core.create-test-jars.patch
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repositoryStubImpl.properties
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/fs
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/workspaces
ADD /jackrabbit/sandbox/jackrabbit-amazon/README.txt
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/persistence
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/persistence/util/AmazonS3Exception.java
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache
ADD /jackrabbit/sandbox/jackrabbit-amazon/patches-for-1.3/jackrabbit-jcr-commons.support-sysprops-in-config-files-dirty.patch
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit/test/amazon/init
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit/test/amazon
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repository
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/repository/namespaces
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java
ADD /jackrabbit/sandbox/jackrabbit-amazon
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/persistence/amazon/AmazonS3PersistenceManager.java
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit/test/amazon/tck
ADD /jackrabbit/sandbox/jackrabbit-amazon/TODO.txt
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/jets3t.properties
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/jaas.config
ADD /jackrabbit/sandbox/jackrabbit-amazon/build.xml
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/test/java/org/apache/jackrabbit/test
ADD /jackrabbit/sandbox/jackrabbit-amazon/patches-for-1.3
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/workspaces/default/workspace.xml
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/persistence/util
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/workspaces/test/workspace.xml
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/persistence/amazon
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org
ADD /jackrabbit/sandbox/jackrabbit-amazon/LICENSE.txt
ADD /jackrabbit/sandbox/jackrabbit-amazon/applications/test/log4j.properties
ADD /jackrabbit/sandbox/jackrabbit-amazon/src
ADD /jackrabbit/sandbox/jackrabbit-amazon/src/main/java/org/apache/jackrabbit/fs/amazon
ADD /jackrabbit/sandbox/jackrabbit-amazon/aws.properties.template

Jukka Zitting added a comment - 03/Sep/08 03:32 PM
Committed the submission to sandbox/jackrabbit-amazon in revision 691635.

Resolving as Fixed. Let's use a separate issue for promoting this from sandbox once the code is stable.

Jukka Zitting made changes - 03/Sep/08 03:32 PM
Resolution Fixed [ 1 ]
Fix Version/s none [ 12312448 ]
Assignee Jukka Zitting [ jukkaz ]
Status Open [ 1 ] Resolved [ 5 ]
Jukka Zitting made changes - 06/Jan/09 04:02 PM
Status Resolved [ 5 ] Closed [ 6 ]
Jukka Zitting made changes - 29/Apr/09 10:21 AM
Fix Version/s none [ 12312448 ]
Jukka Zitting made changes - 07/Jul/09 01:02 PM
Workflow jira [ 12427748 ] no-reopen-closed, patch-avail [ 12468690 ]