Details

    • Type: New Feature
    • Status: Closed
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: datacache, jdbc, jpa, kernel
    • Labels:
      None

      Description

      BLOB and CLOB fields can only be mapped in their entirety in OpenJPA. It would be nice to support fields of type java.io.InputStream (for BLOBs) and java.io.Reader (for CLOBs).

      The usage pattern could look like so:

      @Entity
      public class Employee {
          ...
          private InputStream photoStream;

          public void setPhotoStream(InputStream in) {
              photoStream = in;
          }

          public InputStream getPhotoStream() {
              return photoStream;
          }
      }

      So, when the user wants to provide a stream, she will set the InputStream field, and when the user wants to obtain a stream, she will use the field.

      The behavior of such an implementation would be a bit different than how other fields work, in that if the user set the stream and then consumed it within a single transaction, presumably no data would be written out to the database at commit time. But that is the nature of streams.
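To illustrate the point about consumed streams, here is a minimal sketch in plain Java. It does not use OpenJPA; the class name ConsumedStreamDemo and the helper drain() are made up for this illustration. It only shows that once a stream has been read to its end, a later reader (such as a flush at commit time) finds no data left.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of OpenJPA.
final class ConsumedStreamDemo {

    /** Reads the stream to its end and returns the number of bytes read. */
    static int drain(InputStream in) {
        try {
            int total = 0;
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream photoStream = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});

        // Application code sets the stream and then consumes it itself.
        int first = drain(photoStream);

        // Anything that reads the same stream afterwards gets no bytes.
        int second = drain(photoStream);

        System.out.println(first + " bytes, then " + second + " bytes");
    }
}
```

Running main should print "4 bytes, then 0 bytes".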

      (FTR, I think that I stole this idea from an email Craig Russell sent out years ago.)

      1. OPENJPA-130.patch
        46 kB
        Ignacio Andreu
      2. OPENJPA-130.patch
        36 kB
        Ignacio Andreu
      3. OPENJPA-130-2.patch
        9 kB
        Ignacio Andreu
      4. OPENJPA-130-3.patch
        25 kB
        Ignacio Andreu
      5. OPENJPA-130-DB2.patch
        7 kB
        Fay Wang
      6. OPENJPA-130-DB2-2.patch
        2 kB
        Fay Wang

          Activity

          Patrick Linskey added a comment -

          Ignacio Andreu will be working on this issue for the Google Summer of Code project. It would seem that JIRA only allows me to assign issues to committers, so I'm assigning it to myself to keep it out of the unassigned queue.

          Patrick Linskey added a comment -

          Assigning to Ignacio, who is working on this as part of the Google Summer of Code internship program.

          Ignacio Andreu added a comment -

          Now is the time to decide on the best way to mark these fields. In my opinion we have two options: use the @Lob annotation or create a @Stream annotation. What do you think?

          Craig L Russell added a comment -

          I don't yet see the need to create another annotation.

          If you annotate an InputStream with @Lob then OpenJPA should be smart enough to figure it out.

          @Entity
          public class Employee {
              ...
              @Lob
              @Column(name="PHOTO")
              private InputStream photoStream;
              ...
          }
          Patrick Linskey added a comment -

          My only concern with Craig's suggestion is compatibility if the JPA spec team decides to go a different route in a future version. If we use our own annotation, then we can maintain compatibility for it even in the face of spec changes.

          Ignacio Andreu added a comment -

          This patch was part of my Summer of Code work. All the tests pass on MySQL, Oracle and SQL Server.

          Streams are mapped using the @Persistent annotation.

          @Entity
          public class Employee {
              ...
              @Persistent
              private InputStream photoStream;
              ...
          }
          Patrick Linskey added a comment -

          Great work, Ignacio! I checked in the patch (with a couple of minor whitespace tweaks).

          I'm going to leave this issue open for right now; we need to document the new feature still, and this will help us track that work.

          Ignacio Andreu added a comment -

          This patch corrects some bugs and adds new test cases.

          New test cases & fixed bugs:

          • testUpdateWithNull
          • testUpdateANullObjectWithoutNull

          Modified:

          • blobBufferSize and clobBufferSize. I've increased the buffer value.

          BTW, I'm trying to add streaming support to Derby, but my approach is not finished yet. It loads the results from the database when I use the load() method, but it fails when I try to load with a Query. Next week I will post my approach here.

          Ignacio Andreu added a comment -

          Sorry, I meant that my Derby approach succeeds when using the find() method.

          Ignacio Andreu added a comment -

          This patch adds support for Streams in PostgreSQL and documentation about this issue.

          Patrick Linskey added a comment -

          Resolved with Ignacio's recent work. There is still an open issue regarding Postgres and database cleanup; this will be managed through a separate JIRA issue.

          Fay Wang added a comment -

          This is the streaming LOB support for DB2.

          Fay Wang added a comment -

          The OPENJPA-130-DB2.patch provides streaming LOB support for DB2. DB2 supports calling setBinaryStream/setCharacterStream directly. This is unlike Oracle/MySQL, where an empty stream is inserted first, followed by an update with the actual stream. The catch with DB2 is that setBinaryStream/setCharacterStream require the length in JDBC 3. When setBinaryStream is called, the length can be obtained from InputStream.available(). However, there is no API to get the length of a Reader when setCharacterStream is called. In JDBC 4, this problem is resolved because setBinaryStream/setCharacterStream can be called without the length parameter. The attached patch works properly under the following conditions:
          (1) JDBC 3: setBinaryStream works fine, but setCharacterStream will not work because the number of characters available in the Reader cannot be obtained.

          (2) JDBC 4: setBinaryStream/setCharacterStream work fine.

          The test cases org.apache.openjpa.jdbc.meta.strats.TestInputStreamLob and org.apache.openjpa.jdbc.meta.strats.TestReaderLob pass when the above patch is applied to openjpa-trunk with JDBC 4.
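For readers unfamiliar with the two JDBC call shapes discussed above, the following is a rough sketch, not code from the patch. The class LobBinding and its helpers are hypothetical, and recordingStatement() is only a stand-in that records which java.sql.PreparedStatement overload was invoked so that the example runs without a database. The PreparedStatement methods themselves are the standard ones: the three-argument forms take an explicit length, and the two-argument forms were added in JDBC 4.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
final class LobBinding {

    /** JDBC 3 style: the caller must supply the length up front. */
    static void bindBlob(PreparedStatement ps, int index, InputStream in, int length)
            throws SQLException {
        ps.setBinaryStream(index, in, length);
    }

    /** JDBC 4 style: no length; the driver reads until end of stream. */
    static void bindBlob(PreparedStatement ps, int index, InputStream in)
            throws SQLException {
        ps.setBinaryStream(index, in);
    }

    /** JDBC 4 style for character data; a Reader cannot report its length. */
    static void bindClob(PreparedStatement ps, int index, Reader reader)
            throws SQLException {
        ps.setCharacterStream(index, reader);
    }

    /** Stand-in statement that records "methodName/argumentCount" per call. */
    static PreparedStatement recordingStatement(List<String> calls) {
        return (PreparedStatement) Proxy.newProxyInstance(
                LobBinding.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                (proxy, method, args) -> {
                    calls.add(method.getName() + "/" + (args == null ? 0 : args.length));
                    return null; // only void setters are called in this sketch
                });
    }

    /** Exercises the three call shapes and returns what was recorded. */
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        PreparedStatement ps = recordingStatement(calls);
        byte[] photo = {1, 2, 3};
        try {
            bindBlob(ps, 1, new ByteArrayInputStream(photo), photo.length);
            bindBlob(ps, 1, new ByteArrayInputStream(photo));
            bindClob(ps, 2, new StringReader("resume text"));
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the first call the length is known only because the data is already an in-memory byte array; for an arbitrary stream or Reader that information is generally not available, which is the limitation described for JDBC 3.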

          Milosz Tylenda added a comment -

          Hi Fay. This is an interesting addition. I have a few remarks.

          1. Please consider creating a new issue for this work. Then, when we create release notes for OpenJPA 2.1.0, it will be clear that the new feature is there.

          2. Small oversight:
          + if (log.isTraceEnabled())
          + log.error(ioe.toString(), ioe);

          Also, there are unwanted empty lines between comments and method bodies.

          3. Why are you doing the checks "if (ob instanceof InputStream)" and "if (ob instanceof Reader)"? If they are necessary, what other types can be passed as method parameters?

          4. It is my understanding that InputStream.available() is used to determine the length as a best effort under JDBC 3. However, because the method is used regardless of the JDBC version, I am afraid that in practice it will break the feature even for those using JDBC 4. The semantics of available() are that it returns the number of bytes that can be read before the stream blocks, not the actual length of the stream. TestInputStreamLob works fine because it uses in-memory streams, for which available() returns the actual stream length, but in the real world users will more likely use file or network streams, and in such cases available() may return a value smaller than the actual length.

          I would consider dropping the use of available() and limiting the DB2 support to JDBC 4.

          Let me know if you need more clarification.
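As a small illustration of the available() concern in point 4, the sketch below uses only java.io classes. The class name AvailableVsLength and its helpers are invented for this example. It relies on the documented behavior that SequenceInputStream.available() reports only what the current underlying stream has available, so the value is smaller than the total number of bytes that can eventually be read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class for illustration only.
final class AvailableVsLength {

    /** Returns in.available(), wrapping the checked exception. */
    static int available(InputStream in) {
        try {
            return in.available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the stream to its end and returns the real number of bytes. */
    static int countBytes(InputStream in) {
        try {
            int total = 0;
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Eight bytes in total, delivered as two chunks of four. */
    static InputStream twoChunks() {
        return new SequenceInputStream(
                new ByteArrayInputStream(new byte[4]),
                new ByteArrayInputStream(new byte[4]));
    }

    public static void main(String[] args) {
        // In-memory stream: available() happens to equal the length.
        System.out.println(available(new ByteArrayInputStream(new byte[8])));

        // Chunked stream: available() sees only the first chunk.
        System.out.println(available(twoChunks()));

        // Reading to the end yields the real length.
        System.out.println(countBytes(twoChunks()));
    }
}
```

Running main should print 8, 4 and 8. A length taken from available() on the chunked stream would therefore cover only half of the data.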

          Fay Wang added a comment -

          It's not necessarily the length of the underlying data; it's the amount of data that can be read without causing a resource to be blocked. When using fully materialized LOBs, this number is the length, but when using locators or progressive references, this number is the amount of data in the buffer.

          If the length is not known, it can be set to -1 in JDBC 3 (this is a JCC-specific API). Calling setCharacterStream(int, Reader) in JDBC 4 is the same as calling setCharacterStream(int, Reader, -1) in the JDBC 3 API.


            People

            • Assignee:
              Ignacio Andreu
            • Reporter:
              Patrick Linskey
            • Votes:
              1
            • Watchers:
              1
