Index: src/main/java/org/apache/jackrabbit/api/JackrabbitValueFactory.java =================================================================== --- src/main/java/org/apache/jackrabbit/api/JackrabbitValueFactory.java (revision 1837264) +++ src/main/java/org/apache/jackrabbit/api/JackrabbitValueFactory.java (working copy) @@ -18,6 +18,7 @@ package org.apache.jackrabbit.api; +import java.io.InputStream; import javax.jcr.AccessDeniedException; import javax.jcr.Binary; import javax.jcr.RepositoryException; @@ -24,6 +25,7 @@ import javax.jcr.Session; import javax.jcr.ValueFactory; +import org.apache.jackrabbit.api.binary.BinaryDownloadOptions; import org.apache.jackrabbit.api.binary.BinaryUpload; import org.apache.jackrabbit.api.binary.BinaryDownload; import org.jetbrains.annotations.NotNull; @@ -36,17 +38,19 @@ * supporting all of the capabilities in this interface. Each method of the * interface describes the behavior of that method if the underlying capability * is not available. + * *
- * Currently this interface defines the following optional features: + * This interface defines the following optional features: *
* The features are described in more detail below. * *
+ * * The Direct Binary Access feature provides the capability for a client to * upload or download binaries directly to/from a storage location. For * example, this might be a cloud storage providing high-bandwidth direct @@ -53,21 +57,33 @@ * network access. This API allows for requests to be authenticated and for * access permission checks to take place within the repository, but for clients * to then access the storage location directly. + * *
- * The feature consists of two parts, direct binary upload and direct binary - * download. + * The feature consists of two parts, download and upload. + * + *
+ * For an existing {@link Binary} value that implements {@link BinaryDownload}, + * a {@link BinaryDownload#getURI(BinaryDownloadOptions) read-only URI} can be + * retrieved and passed to a remote client. * *
+ * * This feature enables remote clients to upload binaries directly to a storage * location. + * *
- * When adding binaries already present on the same JVM or server as Jackrabbit + * Note: When adding binaries already present on the same JVM or server as Jackrabbit * or Oak, for example because they were generated locally, please use the - * regular JCR API for {@link javax.jcr.Property#setValue(Binary) adding - * binaries through input streams} instead. This feature is solely designed for + * regular JCR API for adding {@link ValueFactory#createBinary(InputStream) binaries + * via input streams} instead. This feature is solely designed for * remote clients. + * *
* The direct binary upload process is split into 3 phases: *
- *
- * The direct binary download process is described in detail in {@link - * BinaryDownload}. */ @ProviderType public interface JackrabbitValueFactory extends ValueFactory { + /** - * Initiate a transaction to upload binary data directly to a storage - * location. {@link IllegalArgumentException} will be thrown if an upload + * Initiate a transaction to upload a binary directly to a storage + * location and return {@link BinaryUpload upload instructions} for a remote client. + * Returns null if the feature is not available. + * + *
+ * {@link IllegalArgumentException} will be thrown if an upload * cannot be supported for the required parameters, or if the parameters are - * otherwise invalid. For example, if the value of {@code maxSize} exceeds - * the size limits for a single binary upload for the implementation or the - * service provider, or if the value of {@code maxSize} divided by {@code - * maxParts} exceeds the size limit for an upload or upload part of the - * implementation or the service provider, {@link IllegalArgumentException} - * may be thrown. - *
- * Each service provider has specific limitations on upload sizes, - * multi-part upload support, part sizes, etc. which can result in {@link - * IllegalArgumentException} being thrown. You should consult the - * documentation for your underlying implementation and your service - * provider for details. - *
- * If this call is successful, a {@link BinaryUpload} is returned - * which contains the information a client needs to successfully complete - * a direct upload. + * otherwise invalid. Each service provider has specific limitations. * - * @param maxSize The expected maximum size of the binary to be uploaded by - * the client. If the actual size of the binary is known, this - * size should be used; otherwise, the client should make a best - * guess. If a client calls this method with one size and then - * later determines that the guess was too small, the transaction - * should be restarted by calling this method again with the correct - * size. + * @param maxSize The exact size of the binary to be uploaded or the + * estimated maximum size if the exact size is unknown. + * If the estimation was too small, the transaction + * should be restarted by invoking this method again + * using the correct size. * @param maxURIs The maximum number of upload URIs that the client can - * accept. The implementation will ensure that an upload of size - * {@code maxSize} can be completed by splitting the value of {@code - * maxSize} into parts, such that the size of the largest part does - * not exceed any known implementation or service provider - * limitations on upload part size and such that the number of parts - * does not exceed the value of {@code maxURIs}. If this is not - * possible, {@link IllegalArgumentException} will be thrown. A - * client may specify -1 for this value, indicating that any number - * of URIs may be returned. - * @return A {@link BinaryUpload} that can be used by the client to complete - * the upload via a call to {@link #completeBinaryUpload(String)}, + * accept, for example due to message size limitations. + * A value of -1 indicates no limit. 
+ * Upon a successful return, it is ensured that an upload + * of {@code maxSize} can be completed by splitting the + * binary into at most {@code maxURIs} parts, otherwise + * {@link IllegalArgumentException} will be thrown. + * + * @return A {@link BinaryUpload} providing the upload instructions, * or {@code null} if the implementation does not support the direct * upload feature. + * * @throws IllegalArgumentException if the provided arguments are - * invalid or if a valid upload cannot be completed given the - * provided arguments. - * @throws AccessDeniedException if it is determined that insufficient - * permission exists to perform the upload. + * invalid or if an upload cannot be completed given the + * provided arguments. For example, if the value of {@code maxSize} + * exceeds the size limits for a single binary upload for the + * implementation or the service provider, or if the value of + * {@code maxSize} divided by {@code maxURIs} exceeds the size + * limit for an upload or upload part. + * + * @throws AccessDeniedException if the session has insufficient + * permission to perform the upload. */ @Nullable BinaryUpload initiateBinaryUpload(long maxSize, int maxURIs) @@ -157,30 +159,25 @@ throws IllegalArgumentException, AccessDeniedException; /** - * Complete a transaction to upload binary data directly to a storage - * location. The client must provide a valid {@code uploadToken} that can - * only be obtained via a previous call to {@link - * #initiateBinaryUpload(long, int)}. If the {@code uploadToken} is - * unreadable or invalid, {@link IllegalArgumentException} will be thrown. + * Complete the transaction of uploading a binary directly to a storage + * location and return a {@link Binary} to set as value for a binary + * JCR property. The binary is not automatically associated with + * any location in the JCR. + * *
- * Calling this method does not associate the returned {@link Binary} with - * any location in the repository. It is the responsibility of the client - * to do this if desired. - *
- * The {@code uploadToken} can be obtained from the {@link - * BinaryUpload} returned from a prior call to {@link - * #initiateBinaryUpload(long, int)}. Clients should treat the {@code - * uploadToken} as an immutable string, and should expect that - * implementations will sign the string and verify the signature when this - * method is called. + * The client must provide a valid {@link BinaryUpload#getUploadToken() upload token} + * obtained when this transaction was {@link #initiateBinaryUpload(long, int) initialized}. + * If the {@code uploadToken} is unreadable or invalid, {@link IllegalArgumentException} + * will be thrown. * - * @param uploadToken A String that is used to identify the direct upload - * transaction. + * @param uploadToken A String identifying the upload transaction. + * * @return The uploaded {@link Binary}, or {@code null} if the * implementation does not support the direct upload feature. - * @throws IllegalArgumentException if the {@code uploadToken} is - * unreadable or invalid. - * @throws RepositoryException if a repository access error occurs. + * + * @throws IllegalArgumentException if the {@code uploadToken} is invalid or + * does not identify a known binary upload. + * @throws RepositoryException if another error occurs. */ @Nullable Binary completeBinaryUpload(@NotNull String uploadToken) Index: src/main/java/org/apache/jackrabbit/api/binary/BinaryDownload.java =================================================================== --- src/main/java/org/apache/jackrabbit/api/binary/BinaryDownload.java (revision 1837264) +++ src/main/java/org/apache/jackrabbit/api/binary/BinaryDownload.java (working copy) @@ -33,25 +33,67 @@ @ProviderType public interface BinaryDownload extends Binary { /** - * Get a URI for downloading a {@link Binary} directly from a storage - * location with the provided {@link BinaryDownloadOptions}. This is - * probably a signed URI with a short TTL (time to live), although the API - * does not require it to be so. 
+ * + * Returns a URI for downloading this binary directly from the storage location. + * *
- * The implementation will attempt to apply the specified {@code - * downloadOptions} to the subsequent download. For example, if the caller - * knows that the URI refers to a specific type of content, the caller can - * specify that content type by setting the internet media type and - * character encoding in the {@code downloadOptions}. The caller may also - * use a default instance obtained via {@link BinaryDownloadOptions#DEFAULT} - * in which case the caller is indicating that the default behavior of the - * service provider is acceptable. + * Using the {@code downloadOptions} parameter, some response headers of the + * download request can be overwritten, if supported by the storage provider. + * This is necessary to pass information that is only stored in the JCR in + * application specific structures, and not reliably available in the binary + * storage. * + * {@link BinaryDownloadOptions} supports, but is not limited to: + *
* Calling this method has the effect of instructing the service - * provider to set {@code charecterEncoding} as the "charset" parameter + * provider to set {@code characterEncoding} as the "charset" parameter * of the content type in the {@code Content-Type} header field of the * response to a request issued with a URI obtained by calling {@link * BinaryDownload#getURI(BinaryDownloadOptions)}. This value can be @@ -216,7 +216,7 @@ /** * Sets the filename of the {@link BinaryDownloadOptions} object to be - * built. + * built. This would typically be based on a JCR node name. *
* Calling this method has the effect of instructing the service * provider to set {@code fileName} as the filename in the {@code Index: src/main/java/org/apache/jackrabbit/api/binary/BinaryUpload.java =================================================================== --- src/main/java/org/apache/jackrabbit/api/binary/BinaryUpload.java (revision 1837264) +++ src/main/java/org/apache/jackrabbit/api/binary/BinaryUpload.java (working copy) @@ -25,43 +25,49 @@ import org.osgi.annotation.versioning.ProviderType; /** - * This extension interface provides a mechanism whereby a client can upload a - * binary directly to a storage location. An object of this type can be - * created by a call to {@link - * JackrabbitValueFactory#initiateBinaryUpload(long, int)} which will return an - * object of this type if the underlying implementation supports direct upload - * functionality. When calling this method, the client indicates the expected - * size of the binary and the number of URIs that it is willing to accept. The - * implementation will attempt to create an instance of this class that is - * suited to enabling the client to complete the upload successfully. + * Describes uploading a binary through HTTP requests in a single or multiple parts. + * This will be returned by + * {@link JackrabbitValueFactory#initiateBinaryUpload(long, int) initiateBinaryUpload()}. + * A high-level overview of the process can be found in {@link JackrabbitValueFactory}. + * *
- * Using an instance of this class, a client can then use one or more of the - * included URIs for uploading the binary directly by calling {@link - * #getUploadURIs()} and iterating through the URIs returned. Multi-part - * uploads are supported by the interface, although they may not be supported - * by the underlying implementation. + * A caller usually needs to pass the information provided by this interface to a remote + * client that is in possession of the actual binary, who then has to upload the binary + * using HTTP according to the logic described below. A remote client is expected to + * support multi-part uploads as per the logic described below, in case multiple + * URIs are returned. + * *
- * Once a client finishes uploading the binary data, the client must then call - * {@link JackrabbitValueFactory#completeBinaryUpload(String)} to complete the - * upload. This call requires an upload token which can be obtained from an - * instance of this class by calling {@link #getUploadToken()}. + * Once a remote client finishes uploading the binary data, the application must be notified + * and must then call {@link JackrabbitValueFactory#completeBinaryUpload(String) completeBinaryUpload()} to complete the + * upload. This completion requires the exact upload token obtained from {@link #getUploadToken()}. + * + *
- * Below is the detailed direct binary upload algorithm for the remote client. - *
- * In this example the following variables are used: + * Please be aware that if the size passed to + * {@link JackrabbitValueFactory#initiateBinaryUpload(long, int) initiateBinaryUpload()} was an estimation, + * and it turns out the actual binary is larger, there is no guarantee the upload + * will be possible using all {@link #getUploadURIs()} and the {@link #getMaxPartSize()}. + * In such cases, the application should restart the transaction using the correct size. + * + *
partSize = (fileSize + numUploadURIs - 1) / numUploadURIs
 *
- * Clients are not necessarily required to use all of the URIs provided. A - * client may choose to use fewer, or even only one of the URIs. However, - * regardless of the number of URIs used, they must be consumed in sequence. + * Remote clients must support multi-part uploading as per the + * {@link BinaryUpload upload algorithm}. Clients are not necessarily required + * to use all of the URIs provided. A client may choose to use fewer, or even + * only one of the URIs. However, it must always ensure the part size is between + * {@link #getMinPartSize()} and {@link #getMaxPartSize()}. These can reflect + * strict limitations of the storage provider. + * + *
+ * Regardless of the number of URIs used, they must be consumed in sequence, + * without skipping any, and the order of parts the original binary is split + * into must correspond exactly with the order of URIs. + * + *
* For example, if a client wishes to upload a binary in three parts and * there are five URIs returned, the client must use the first URI to * upload the first part, the second URI to upload the second part, and - * the third URI to upload the third part. The client is not required to - * use the fourth and fifth URIs. However, using the second URI to upload + * the third URI to upload the third part. The client is not required to + * use the fourth and fifth URIs. However, using the second URI to upload * the third part may result in either an upload failure or a corrupted * upload; likewise, skipping the second URI to use subsequent URIs may * result in either an upload failure or a corrupted upload. + * *
- * Clients should be aware that some storage providers have limitations on - * the minimum and maximum size of a binary payload for a single upload, so - * clients should take these limitations into account when deciding how many - * of the URIs to use. Underlying implementations may also choose to - * enforce their own limitations. - *
* While the API supports multi-part uploading via multiple upload URIs, - * implementations are not required to support multi-part uploading. If the + * implementations are not required to support multi-part uploading. If the * underlying implementation does not support multi-part uploading, a single * URI will be returned regardless of the size of the data being uploaded. + * *
* Some storage providers also support multi-part uploads by reusing a * single URI multiple times, in which case the implementation may also * return a single URI regardless of the size of the data being uploaded. - *
- * You should consult both the DataStore implementation documentation and - * the storage service provider documentation for details on such matters as - * multi-part upload support, upload minimum and maximum sizes, etc. * + *
- * Note that the API offers no guarantees that uploading parts of this size - * can successfully complete the requested upload using the URIs provided - * via {@link #getUploadURIs()}. In other words, clients wishing to perform - * a multi-part upload must split the upload into parts of at least this - * size, but the sizes may need to be larger in order to successfully - * complete the upload. + * Note that the API offers no guarantees that using this minimal part size + * is possible with the number of available {@link #getUploadURIs()}. This might + * not be the case if the binary is too large. Please refer to the + * {@link BinaryUpload upload algorithm} for correctly using this value. * - * @return The smallest size acceptable for multi-part uploads. + * @return The smallest part size acceptable for multi-part uploads. */ long getMinPartSize(); /** - * The largest part size a client may upload for a multi-part upload. This - * is usually either a service provider or implementation limitation. + * Return the largest possible part size in bytes. If a consumer wants to + * choose a custom part size, it cannot be larger than this value. + * If this returns -1, the maximum is unlimited. + * *
- * The API guarantees that a client can successfully complete a direct - * upload of the binary data of the requested size using the provided URIs - * by splitting the binary data into parts of the size returned by this - * method. - *
- * The client is not required to use part sizes of this size; smaller sizes - * may be used so long as they are at least as large as the size returned by - * {@link #getMinPartSize()}. - *
- * If the binary size specified by a client when calling {@link - * JackrabbitValueFactory#initiateBinaryUpload(long, int)} ends up being - * smaller than the actual size of the binary being uploaded, these API - * guarantees no longer apply, and it may not be possible to complete the - * upload using the URIs provided. In such cases, the client should restart - * the transaction using the correct size. + * The API guarantees that a client can split the binary of the requested + * size using this maximum part size and there will be sufficient URIs + * available in {@link #getUploadURIs()}. Please refer to the + * {@link BinaryUpload upload algorithm} for correctly using this value. * - * @return The maximum size of an upload part for multi-part uploads. + * @return The maximum part size acceptable for multi-part uploads or -1 + * if there is no limit. */ long getMaxPartSize(); /** - * Returns the upload token to be used in a subsequent call to {@link - * JackrabbitValueFactory#completeBinaryUpload(String)}. This upload token - * is used by the implementation to identify this upload. Clients should - * treat the upload token as an immutable string, as the underlying - * implementation may choose to implement techniques to detect tampering and - * reject the upload if the token is modified. + * Returns a token identifying this upload. This is required to finalize the upload + * at the end by calling {@link JackrabbitValueFactory#completeBinaryUpload(String)}. * - * @return This upload's unique upload token. + *
+ * The format of this string is implementation-dependent. Implementations must ensure + * that clients cannot guess tokens for existing binaries. + * + * @return A unique token identifying this upload. */ @NotNull String getUploadToken();
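
Review note: the part-size formula in the BinaryUpload Javadoc above, `partSize = (fileSize + numUploadURIs - 1) / numUploadURIs`, is a ceiling division. A minimal sketch of how a remote client might apply it together with the `getMinPartSize()` and `getMaxPartSize()` bounds follows. The class `PartSizeSketch` and its methods are hypothetical and not part of the Jackrabbit API. Raising a too-small part size to the minimum and rejecting a too-large one are assumptions drawn from the Javadoc wording, not from the patch itself.

```java
/** Hypothetical helper, not part of the Jackrabbit API. */
final class PartSizeSketch {

    private PartSizeSketch() {
    }

    /**
     * Picks a part size for a multi-part upload.
     *
     * @param fileSize      actual size of the binary in bytes
     * @param numUploadURIs number of URIs returned by getUploadURIs()
     * @param minPartSize   value of getMinPartSize()
     * @param maxPartSize   value of getMaxPartSize(), or -1 for "no limit"
     */
    static long choosePartSize(long fileSize, int numUploadURIs,
                               long minPartSize, long maxPartSize) {
        if (fileSize <= 0 || numUploadURIs <= 0) {
            throw new IllegalArgumentException(
                    "fileSize and numUploadURIs must be positive");
        }
        // Ceiling division: the formula quoted from the Javadoc.
        long partSize = (fileSize + numUploadURIs - 1) / numUploadURIs;

        // Assumption: parts below the minimum are raised to the minimum,
        // which simply means fewer of the provided URIs get used.
        partSize = Math.max(partSize, minPartSize);

        // Assumption: if even an even split exceeds the maximum, the upload
        // cannot be completed with these URIs and must be restarted with
        // the correct size.
        if (maxPartSize != -1 && partSize > maxPartSize) {
            throw new IllegalStateException(
                    "binary too large for the provided URIs; restart the upload");
        }
        return partSize;
    }

    /** Number of URIs consumed, in sequence, for the chosen part size. */
    static long partsNeeded(long fileSize, long partSize) {
        return (fileSize + partSize - 1) / partSize;
    }
}
```

With the chosen size, part `i` (counting from zero) would go to the `i`-th URI, and any remaining URIs stay unused, which matches the "consumed in sequence, without skipping any" rule in the `getUploadURIs()` Javadoc.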