Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.7.1.1
    • Fix Version/s: 10.7.1.1
    • Component/s: Network Server
    • Labels:
      None

      Description

      I'm branching out this issue into server and client changes. Since the changes are incremental and small, DERBY-728 would soon become confusing if it had to bear the patches for both server and client.

      For future reference, some patches have actually been applied in DERBY-728. These *ARE* also required for the server implementation.

      1. TestClient.java
        0.9 kB
        Tiago R. Espinha
      2. DERBY-4746.diff
        4 kB
        Tiago R. Espinha
      3. DERBY-4746.diff
        4 kB
        Tiago R. Espinha
      4. DERBY-4746.diff
        4 kB
        Tiago R. Espinha
      5. DERBY-4746_p3.diff
        5 kB
        Tiago R. Espinha
      6. DERBY-4746_p3.diff
        5 kB
        Tiago R. Espinha
      7. DERBY-4746_p3.diff
        4 kB
        Tiago R. Espinha
      8. DERBY-4746_p3.diff
        4 kB
        Tiago R. Espinha
      9. DERBY-4746_p2-tests.diff
        8 kB
        Tiago R. Espinha
      10. DERBY-4746_p2-tests.diff
        11 kB
        Tiago R. Espinha
      11. DERBY-4746_p2-impl.diff
        4 kB
        Tiago R. Espinha
      12. DERBY-4746_p2-impl.diff
        5 kB
        Tiago R. Espinha

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          [bulk update] Close all resolved issues that haven't been updated for more than one year.

          Show
          Knut Anders Hatlen added a comment - [bulk update] Close all resolved issues that haven't been updated for more than one year.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks, Tiago. The latest patch looks good to me. +1 to commit.

          Show
          Knut Anders Hatlen added a comment - Thanks, Tiago. The latest patch looks good to me. +1 to commit.
          Hide
          Tiago R. Espinha added a comment -

          Thank you Knut for your comments. I've included the changes you suggested in this patch as indeed, we were creating unnecessary byte arrays. As for removing them from the getByteLength(), that might be a little more tricky as we actually need to convert the String into UTF-8 bytes to get the length. I'm not sure if there'd be an easier way to go about this, considering that UTF-8 is a variable-length encoding scheme.

          Maybe we could individually get the length for each of the characters and keep a sum variable where we update the length. This way we'd only have a byte array with 4 bytes used at any given time. I'll leave this for future improvements though as the byte array should be collected by the GC anyway, so I don't think it will have any immediate noticeable impact for now.

          I've already ran suites.All and derbyall on this patch with no failures, so I'll leave it until tomorrow evening (GMT) and if no one objects, I'll go ahead and commit it.

          Show
          Tiago R. Espinha added a comment - Thank you Knut for your comments. I've included the changes you suggested in this patch as indeed, we were creating unnecessary byte arrays. As for removing them from the getByteLength(), that might be a little more tricky as we actually need to convert the String into UTF-8 bytes to get the length. I'm not sure if there'd be an easier way to go about this, considering that UTF-8 is a variable-length encoding scheme. Maybe we could individually get the length for each of the characters and keep a sum variable where we update the length. This way we'd only have a byte array with 4 bytes used at any given time. I'll leave this for future improvements though as the byte array should be collected by the GC anyway, so I don't think it will have any immediate noticeable impact for now. I've already ran suites.All and derbyall on this patch with no failures, so I'll leave it until tomorrow evening (GMT) and if no one objects, I'll go ahead and commit it.
          Hide
          Knut Anders Hatlen added a comment -

          The new logic in the patch looks correct to me, but the writeRDBNAM() method is a bit too aggressive on allocating byte arrays:

          1)
          + int len = currentManager.getByteLength(rdbnam);

          The UTF-8 CCSID manager will allocate a new byte array in order to find the length, and then throw it away.

          2)
          + /* Initialize the buffer at MAX_NAME - max length in bytes for RDBNAM */
          + ByteBuffer buff = ByteBuffer.allocate(CodePoint.MAX_NAME);

          This will allocate a byte buffer with length 255, but after DERBY-4805 it'll increase to ~64K, and then it's probably going to hurt.

          3)
          + /* Convert the RDBNAM into bytes using the current CCSID */
          + currentManager.convertFromJavaString(rdbnam, buff);

          Here, the UTF-8 CCSID manager will allocate a new byte array again, holding the contents of the string, before it puts it into the ByteBuffer.

          4)
          + /* Get the byte array out of the byte buffer */
          + int bytesLen = buff.position();
          + byte[] rdbBytes = new byte[bytesLen];

          And here yet another byte array is allocated before it's sent to DDMWriter.

          I think if writeRDBNAM() had used writeScalarPaddedString() instead of writeScalarPaddedBytes(), we could avoid some of these byte array allocations. We wouldn't avoid all of them, because writeScalarPaddedString() would still call getByteLength() and convertFromJavaString() internally, but at least 2 and 4 would go away.

          As a possible future improvement, we may also consider changing Utf8CcsidManager's implementation of convertFromJavaString(String,ByteBuffer) and getByteLength() so that they don't create any byte arrays internally.

          Show
          Knut Anders Hatlen added a comment - The new logic in the patch looks correct to me, but the writeRDBNAM() method is a bit too aggressive on allocating byte arrays: 1) + int len = currentManager.getByteLength(rdbnam); The UTF-8 CCSID manager will allocate a new byte array in order to find the length, and then throw it away. 2) + /* Initialize the buffer at MAX_NAME - max length in bytes for RDBNAM */ + ByteBuffer buff = ByteBuffer.allocate(CodePoint.MAX_NAME); This will allocate a byte buffer with length 255, but after DERBY-4805 it'll increase to ~64K, and then it's probably going to hurt. 3) + /* Convert the RDBNAM into bytes using the current CCSID */ + currentManager.convertFromJavaString(rdbnam, buff); Here, the UTF-8 CCSID manager will allocate a new byte array again, holding the contents of the string, before it puts it into the ByteBuffer. 4) + /* Get the byte array out of the byte buffer */ + int bytesLen = buff.position(); + byte[] rdbBytes = new byte [bytesLen] ; And here yet another byte array is allocated before it's sent to DDMWriter. I think if writeRDBNAM() had used writeScalarPaddedString() instead of writeScalarPaddedBytes(), we could avoid some of these byte array allocations. We wouldn't avoid all of them, because writeScalarPaddedString() would still call getByteLength() and convertFromJavaString() internally, but at least 2 and 4 would go away. As a possible future improvement, we may also consider changing Utf8CcsidManager's implementation of convertFromJavaString(String,ByteBuffer) and getByteLength() so that they don't create any byte arrays internally.
          Hide
          Tiago R. Espinha added a comment -

          derbyall passed as well so I'm marking this as patch available. Please review and if no one objects, I will go ahead and commit it.

          Show
          Tiago R. Espinha added a comment - derbyall passed as well so I'm marking this as patch available. Please review and if no one objects, I will go ahead and commit it.
          Hide
          Tiago R. Espinha added a comment -

          I'm attaching the final patch to this issue. I had some issues with the ByteBuffer handling but that's fixed now.

          This patch forces the write*String methods within DDMWriter to query the length of a String off the CCSID manager. It also forces all the sessions in the DRDAConnThread to start as EBCDIC and fixes the issue Knut found on DERBY-4799.

          DERBY-4799 is fixed by changing the way the server writes the RDBNAM back to the client within DRDAConnThread.writeRDBNAM(String). Before the patch, the RDBNAM was being sent back [serv. -> cli] in UTF-8 [surprising, since everything should be EBCDIC so far] and the client correctly expected EBCDIC. This would be fine for normal ASCII characters but whenever special Latin characters were used, this would cause the server to throw an exception and consequently, the client to crash.

          I've attached a test client that demonstrates this by crashing 10.5.3 and 10.3.3 servers, using their respective clients. By using a database name with more than three extra-ASCII characters, the server will throw the exception seen in DERBY-4799 whenever we try to send actual SQL commands to the server.

          It's also worth noting that with this patch, old clients will now be able to actually create databases with Latin characters on a new server. This could be seen as a positive regression as something that used to fail is now working, but I don't think it is any matter for concern.

          Finally, Knut has provided a test fixture on DERBY-4799 that uncovers part of this issue. With this fix the test passes, although I think we can probably add another fixture that does the same as my test client (that only would fail upon creating the table).

          So far suites.All has passed without any failures and I am now running derbyall.

          Show
          Tiago R. Espinha added a comment - I'm attaching the final patch to this issue. I had some issues with the ByteBuffer handling but that's fixed now. This patch forces the write*String methods within DDMWriter to query the length of a String off the CCSID manager. It also forces all the sessions in the DRDAConnThread to start as EBCDIC and fixes the issue Knut found on DERBY-4799 . DERBY-4799 is fixed by changing the way the server writes the RDBNAM back to the client within DRDAConnThread.writeRDBNAM(String). Before the patch, the RDBNAM was being sent back [serv. -> cli] in UTF-8 [surprising, since everything should be EBCDIC so far] and the client correctly expected EBCDIC. This would be fine for normal ASCII characters but whenever special Latin characters were used, this would cause the server to throw an exception and consequently, the client to crash. I've attached a test client that demonstrates this by crashing 10.5.3 and 10.3.3 servers, using their respective clients. By using a database name with more than three extra-ASCII characters, the server will throw the exception seen in DERBY-4799 whenever we try to send actual SQL commands to the server. It's also worth noting that with this patch, old clients will now be able to actually create databases with Latin characters on a new server. This could be seen as a positive regression as something that used to fail is now working, but I don't think it is any matter for concern. Finally, Knut has provided a test fixture on DERBY-4799 that uncovers part of this issue. With this fix the test passes, although I think we can probably add another fixture that does the same as my test client (that only would fail upon creating the table). So far suites.All has passed without any failures and I am now running derbyall.
          Hide
          Tiago R. Espinha added a comment -

          Attaching a new patch as my previous one has an issue with the ByteBuffer. Running regressions again.

          Show
          Tiago R. Espinha added a comment - Attaching a new patch as my previous one has an issue with the ByteBuffer. Running regressions again.
          Hide
          Tiago R. Espinha added a comment -

          This patch makes some changes on the DDMWriter to ensure that the length of the strings is calculated based on the CCSID manager, and not just a straight call to .length(). It also enforces that whenever a new session is started, the CCSID manager is pushed back to EBCDIC, as all exchanges between the client and the server must start in EBCDIC.

          This is just a consolidation patch and it should not make any changes to the outside behavior of the code. At this point, the CCSID negotiation isn't yet in place and as such the server always stays in EBCDIC.

          I have a few more changes to submit but I wanted to first check these in and make sure everything still works as it should.

          I'll be running regressions and I'll post the results shortly. Please feel free to review the patch.

          Show
          Tiago R. Espinha added a comment - This patch makes some changes on the DDMWriter to ensure that the length of the strings is calculated based on the CCSID manager, and not just a straight call to .length(). It also enforces that whenever a new session is started, the CCSID manager is pushed back to EBCDIC, as all exchanges between the client and the server must start in EBCDIC. This is just a consolidation patch and it should not make any changes to the outside behavior of the code. At this point, the CCSID negotiation isn't yet in place and as such the server always stays in EBCDIC. I have a few more changes to submit but I wanted to first check these in and make sure everything still works as it should. I'll be running regressions and I'll post the results shortly. Please feel free to review the patch.
          Hide
          Tiago R. Espinha added a comment -

          Reopening the issue to submit an additional patch required for the server implementation.

          Show
          Tiago R. Espinha added a comment - Reopening the issue to submit an additional patch required for the server implementation.
          Hide
          Tiago R. Espinha added a comment -

          Patches applied, all works as should, closing issue.

          Show
          Tiago R. Espinha added a comment - Patches applied, all works as should, closing issue.
          Hide
          Tiago R. Espinha added a comment -

          Thank you Bryan for reviewing and committing the patches.

          This does indeed resolve the issue as it is a branch of DERBY-728 that relates to the server part only.

          Now I'll start the work on the client part. Thanks again for committing, I'll be closing the issue.

          Show
          Tiago R. Espinha added a comment - Thank you Bryan for reviewing and committing the patches. This does indeed resolve the issue as it is a branch of DERBY-728 that relates to the server part only. Now I'll start the work on the client part. Thanks again for committing, I'll be closing the issue.
          Hide
          Bryan Pendleton added a comment -

          Committed imp.diff and the revised tests.diff to the trunk as a single commit, revision 980800.

          Tiago, does this resolve this issue? Or are additional patches intended here? I didn't
          mark the issue resolved because I wasn't sure of the overall plan here. Please mark
          the issue resolved (and closed) if it is in fact complete.

          Thanks for all the hard work on this; international character support in Derby is a nice
          feature and it will be great to have it!

          Show
          Bryan Pendleton added a comment - Committed imp.diff and the revised tests.diff to the trunk as a single commit, revision 980800. Tiago, does this resolve this issue? Or are additional patches intended here? I didn't mark the issue resolved because I wasn't sure of the overall plan here. Please mark the issue resolved (and closed) if it is in fact complete. Thanks for all the hard work on this; international character support in Derby is a nice feature and it will be great to have it!
          Hide
          Tiago R. Espinha added a comment -

          Bryan, my run of derbyall ran without failures.

          Show
          Tiago R. Espinha added a comment - Bryan, my run of derbyall ran without failures.
          Hide
          Bryan Pendleton added a comment -

          Thanks for the quick turnaround on the revised tests. I've been successful
          running the new tests in various configurations on my system. Please let
          me know how your re-run of the tests goes.

          Show
          Bryan Pendleton added a comment - Thanks for the quick turnaround on the revised tests. I've been successful running the new tests in various configurations on my system. Please let me know how your re-run of the tests goes.
          Hide
          Tiago R. Espinha added a comment -

          I've finally managed to iron out all the issues. Here's a quick diff of this patch versus the previous one:

          • Adds the commands switchToUtf8CcsidManager and deleteDatabase to the grammar of TestProto.java
            (The deleteDatabase is a no-op as the cleanup is done automatically using this harness)
          • Adds the UTF8 test strings that are used in TestProto.getString() to replace the placeholders in the protocol.tests file
            (The protocol.tests file does not have UTF8 strings itself)

          After applying this patch, the test run posted above by Bryan passes with no failures. I've also managed to see the database with international characters being created and then deleted.

          I'll be running derbyall overnight to make sure everything still works as should. It should be safe to say that this does not affect suites.All.

          Show
          Tiago R. Espinha added a comment - I've finally managed to iron out all the issues. Here's a quick diff of this patch versus the previous one: Adds the commands switchToUtf8CcsidManager and deleteDatabase to the grammar of TestProto.java (The deleteDatabase is a no-op as the cleanup is done automatically using this harness) Adds the UTF8 test strings that are used in TestProto.getString() to replace the placeholders in the protocol.tests file (The protocol.tests file does not have UTF8 strings itself) After applying this patch, the test run posted above by Bryan passes with no failures. I've also managed to see the database with international characters being created and then deleted. I'll be running derbyall overnight to make sure everything still works as should. It should be safe to say that this does not affect suites.All.
          Hide
          Tiago R. Espinha added a comment -

          I'm getting heaps of failures trying to fit this into the old test so it might take a while before I submit a new patch.

          Show
          Tiago R. Espinha added a comment - I'm getting heaps of failures trying to fit this into the old test so it might take a while before I submit a new patch.
          Hide
          Tiago R. Espinha added a comment -

          The additional change is small. I just need to add the new command to the testProtocol grammar and tell the Java code what to do with it, which is a rather small change.

          My only concern has to do with maintainability. We have two sets of everything for this test: two grammars, two separate logics for each of the same commands in the protocols.test file, etc - this shouldn't pose as a problem unless we're adding new grammar to the tests though, I just have a certain aversion to code repetition #

          For my tests, this shouldn't be a problem again I think.

          I do agree with your solution for DERBY-2031. As it is the tests aren't being ran as part of suites.All anyway, so whoever touches the DRDA code should be aware of these protocol tests. So as long as we document how to run them, I think this solution should be fine. I'll suggest this on DERBY-2031.

          With ant junit-pptesting you were indeed running the new JUnit test (ProtocolTest.java) and it is ran against the classes in your build tree.

          I will submit a new patch in a few minutes that adds the changes to the testProtocol.

          Show
          Tiago R. Espinha added a comment - The additional change is small. I just need to add the new command to the testProtocol grammar and tell the Java code what to do with it, which is a rather small change. My only concern has to do with maintainability. We have two sets of everything for this test: two grammars, two separate logics for each of the same commands in the protocols.test file, etc - this shouldn't pose as a problem unless we're adding new grammar to the tests though, I just have a certain aversion to code repetition # For my tests, this shouldn't be a problem again I think. I do agree with your solution for DERBY-2031 . As it is the tests aren't being ran as part of suites.All anyway, so whoever touches the DRDA code should be aware of these protocol tests. So as long as we document how to run them, I think this solution should be fine. I'll suggest this on DERBY-2031 . With ant junit-pptesting you were indeed running the new JUnit test (ProtocolTest.java) and it is ran against the classes in your build tree. I will submit a new patch in a few minutes that adds the changes to the testProtocol.
          Hide
          Bryan Pendleton added a comment -

          Thanks for drawing my attention to DERBY-2031; I had forgotten about that entirely.

          Kristian's final comment in that issue was:
          > Changes done to the protocol test file will affect both versions of the test.

          So I guess that means that, for now, the two-versions-of-the-same-test is our
          reality, so if it's not too hard to add the duplicate logic to the other test, I'd say
          that would be fine.

          It's far better to have too many tests than too few, and this does seem like a really
          special situation.

          Would the additional change be a large one?

          Also, are you anticipating writing additional tests in this area as part of your
          upcoming work? Or is this a one-time-only situation? If this will become an
          annoying thorn in your side than maybe we need to look for an alternate idea?

          I suppose that the alternate idea might be:
          1) Pick one version of the DERBY-2031 tests to preserve for the future (presumably,
          the JUnit version)
          2) Entirely discard the old version of the tests
          3) Document in the code and in the wiki, etc. how to run the new test, and explain
          to the community that this test is special and needs to be run, periodically, in
          this special fashion.

          BTW, when I did "ant junit-pptesting", was I running the new DERBY-2031 JUnit test? And was I doing that against the 'classes' directory in my build tree?

          Show
          Bryan Pendleton added a comment - Thanks for drawing my attention to DERBY-2031 ; I had forgotten about that entirely. Kristian's final comment in that issue was: > Changes done to the protocol test file will affect both versions of the test. So I guess that means that, for now, the two-versions-of-the-same-test is our reality, so if it's not too hard to add the duplicate logic to the other test, I'd say that would be fine. It's far better to have too many tests than too few, and this does seem like a really special situation. Would the additional change be a large one? Also, are you anticipating writing additional tests in this area as part of your upcoming work? Or is this a one-time-only situation? If this will become an annoying thorn in your side than maybe we need to look for an alternate idea? I suppose that the alternate idea might be: 1) Pick one version of the DERBY-2031 tests to preserve for the future (presumably, the JUnit version) 2) Entirely discard the old version of the tests 3) Document in the code and in the wiki, etc. how to run the new test, and explain to the community that this test is special and needs to be run, periodically, in this special fashion. BTW, when I did "ant junit-pptesting", was I running the new DERBY-2031 JUnit test? And was I doing that against the 'classes' directory in my build tree?
          Hide
          Tiago R. Espinha added a comment -

          Oh I see why this happens, it has to do with DERBY-2031.

          I think this is a mess in maintainability though... the original goal of DERBY-2031 was to convert the whole functionality of derbynet/testProtocol.java to a proper JUnit test and this is how impl/drda/ProtocolTest.java came to life. However, as it is, each time we add a feature to the JUnit protocol test we have to do the same for the other or it'll break.

          I'm not sure what to do about this. Do we want to maintain these two tests that do essentially the same? I can easily implement the switchToUtf8CcsidManager logic in the testProtocol as well but should I do it?

          Show
          Tiago R. Espinha added a comment - Oh I see why this happens, it has to do with DERBY-2031 . I think this is a mess in maintainability though... the original goal of DERBY-2031 was to convert the whole functionality of derbynet/testProtocol.java to a proper JUnit test and this is how impl/drda/ProtocolTest.java came to life. However, as it is, each time we add a feature to the JUnit protocol test we have to do the same for the other or it'll break. I'm not sure what to do about this. Do we want to maintain these two tests that do essentially the same? I can easily implement the switchToUtf8CcsidManager logic in the testProtocol as well but should I do it?
          Hide
          Bryan Pendleton added a comment -

          I can reproduce the testProtocol diff as a standalone test failure by:

          java -Dframework=DerbyNetClient org.apache.derbyTesting.functionTests.harness.RunTest derbynet/testProtocol.java

          Show
          Bryan Pendleton added a comment - I can reproduce the testProtocol diff as a standalone test failure by: java -Dframework=DerbyNetClient org.apache.derbyTesting.functionTests.harness.RunTest derbynet/testProtocol.java
          Hide
          Bryan Pendleton added a comment -

          'ant junit-pptesting' appears to complete successfully for me:

          junit-pptesting:
          [junit] Running org.apache.derby.PackagePrivateTestSuite
          [junit] Tests run: 222, Failures: 0, Errors: 0, Time elapsed: 31.608 sec

          So I just need to understand the testProtocol diff.

          Show
          Bryan Pendleton added a comment - 'ant junit-pptesting' appears to complete successfully for me: junit-pptesting: [junit] Running org.apache.derby.PackagePrivateTestSuite [junit] Tests run: 222, Failures: 0, Errors: 0, Time elapsed: 31.608 sec So I just need to understand the testProtocol diff.
          Hide
          Bryan Pendleton added a comment -

          Hi Tiago, thanks for the suggestion about junit-pptesting, I will give that a try.

          When I run the full regression suite in my setup, I see a failure in one of the modified tests:

          Failure Details:

                          • Diff file derbyall/derbynetclientmats/DerbyNetClient/derbynetmats/derbynetmats/testProtocol.diff
              • Start: testProtocol jdk1.6.0_20 DerbyNetClient derbynetmats:derbynetmats 2010-07-28 08:28:24 ***
                308 add
                > Unknown command, switchToUtf8CcsidManager in line 3548
                > Test UNICODEMGR at level 1208 while sending UTF8 characters in RDBNAM
                Test Failed.
              • End: testProtocol jdk1.6.0_20 DerbyNetClient derbynetmats:derbynetmats 2010-07-28 08:28:30 ***

          Do you think this indicates a build failure? A test setup problem?

          Show
          Bryan Pendleton added a comment - Hi Tiago, thanks for the suggestion about junit-pptesting, I will give that a try. When I run the full regression suite in my setup, I see a failure in one of the modified tests: Failure Details: Diff file derbyall/derbynetclientmats/DerbyNetClient/derbynetmats/derbynetmats/testProtocol.diff Start: testProtocol jdk1.6.0_20 DerbyNetClient derbynetmats:derbynetmats 2010-07-28 08:28:24 *** 308 add > Unknown command, switchToUtf8CcsidManager in line 3548 > Test UNICODEMGR at level 1208 while sending UTF8 characters in RDBNAM Test Failed. End: testProtocol jdk1.6.0_20 DerbyNetClient derbynetmats:derbynetmats 2010-07-28 08:28:30 *** Do you think this indicates a build failure? A test setup problem?
          Hide
          Tiago R. Espinha added a comment -

          Indeed Bryan you might also want to run: "ant junit-pptesting" as this runs the protocol.tests file where I've added a couple more tests.

          That runs fine with no failures on my end.

          Show
          Tiago R. Espinha added a comment - Indeed Bryan you might also want to run: "ant junit-pptesting" as this runs the protocol.tests file where I've added a couple more tests. That runs fine with no failures on my end.
          Hide
          Bryan Pendleton added a comment -

          Oh, I see now that the p2-impl and p2-tests patches were designed to be applied together;
          I missed that in the first examination, and was only running with the p2-impl patch.

          I will also apply the p2-tests patch and do a bit more testing.

          Show
          Bryan Pendleton added a comment - Oh, I see now that the p2-impl and p2-tests patches were designed to be applied together; I missed that in the first examination, and was only running with the p2-impl patch. I will also apply the p2-tests patch and do a bit more testing.
          Hide
          Bryan Pendleton added a comment -

          The patch applies and builds cleanly for me.

          I also had a quick read of the patch and didn't notice anything new.

          I'll give it a bit more testing and if I don't encounter any problems, I'll commit this patch.

          Show
          Bryan Pendleton added a comment - The patch applies and builds cleanly for me. I also had a quick read of the patch and didn't notice anything new. I'll give it a bit more testing and if I don't encounter any problems, I'll commit this patch.
          Hide
          Tiago R. Espinha added a comment -

          Attaching a new implementation patch that removes the line as per my previous comment. Considering this line wasn't being used anywhere, it's safe to say that the regressions pass still applies.

          Can anyone please review and commit?

          Show
          Tiago R. Espinha added a comment - Attaching a new implementation patch that removes the line as per my previous comment. Considering this line wasn't being used anywhere, it's safe to say that the regressions pass still applies. Can anyone please review and commit?
          Hide
          Tiago R. Espinha added a comment -

          Regarding Kathey's comments, I have now tried to use the JDBCDataSource instead but this poses as an issue as it tries to get a DataSource using TestConfiguration.getCurrent(). The current configuration during a ProtocolTest is client/server and not embedded, which makes it impossible to use this method to shutdown the database.

          I will remove EBCDIC_CCSID. That shouldn't have been there in the first place.

          Show
          Tiago R. Espinha added a comment - Regarding Kathey's comments, I have now tried to use the JDBCDataSource instead but this poses as an issue as it tries to get a DataSource using TestConfiguration.getCurrent(). The current configuration during a ProtocolTest is client/server and not embedded, which makes it impossible to use this method to shutdown the database. I will remove EBCDIC_CCSID. That shouldn't have been there in the first place.
          Hide
          Kathey Marsden added a comment -

          Hi Tiago,

          I looked at the patches (although in a bit of a hurry) and they look good except that I am not sure about adding the DriverManager Import to ProtocolTest because of possible J2ME issues. It may be fine as the network server tests don't actually run for J2ME but I am not sure. It might be better instead to use JDBCDataSource.getDataSource() and then JDBCDataSource.shutdownDatabase().

          Also I don't think public static final int EBCDIC_CCSID = 500; is used and might be a bit confusing since I UNICODEMGR uses 0 for EBCDIC as I recall. Sorry for the hurried comments. I hope another commiter picks up the patches.

          bye for now!

          Kathey

          Show
          Kathey Marsden added a comment - Hi Tiago, I looked at the patches (although in a bit of a hurry) and they look good except that I am not sure about adding the DriverManager Import to ProtocolTest because of possible J2ME issues. It may be fine as the network server tests don't actually run for J2ME but I am not sure. It might be better instead to use JDBCDataSource.getDataSource() and then JDBCDataSource.shutdownDatabase(). Also I don't think public static final int EBCDIC_CCSID = 500; is used and might be a bit confusing since I UNICODEMGR uses 0 for EBCDIC as I recall. Sorry for the hurried comments. I hope another commiter picks up the patches. bye for now! Kathey
          Hide
          Tiago R. Espinha added a comment -

          Regressions ran fine without failures.

          Show
          Tiago R. Espinha added a comment - Regressions ran fine without failures.
          Hide
          Tiago R. Espinha added a comment -

          Attaching two patches:

          • DERBY-4746_p2-impl.diff - puts in place all the switches necessary on the server-side to enable and disable UTF8 support.
          • DERBY-4746_p2-tests.diff - contains the protocol tests and changes to the framework required for the UTF8 protocol tests. I've added a new command to the protocol tests (deleteDatabase) so that these tests can cleanup after themselves. The UTF8 protocol tests try to create a database with UTF8 characters and as such they need to be able to delete it afterwards.

          Kathey suggested the cleanup be done on the tearDown in ProtocolTest.java but from what I could see, this is not a very good solution as the ProtocolTest actually has several fixtures (one per test in the protocol.tests file) and this way we'd be trying to delete the UTF8 database after every single fixture (there are about 150 currently).

          There's also a caveat about these tests. In the UTF8 fixtures, we have the following string '%UTF8TestString%'. This is then replaced with a few Chinese characters in ProtocolTest.java and it's just a workaround to be able to pass UTF8 characters on to the server (as the ProtocolTest tokenizer does not deal well with characters in unicode representation \uXXXX).

          Lastly, I've implemented the command 'switchToUtf8CcsidManager' in the protocol tests that essentially tells the framework to switch its DDMWriter and Reader to UTF8. Normally, a client would negotiate this through the MGRLVLS but the ProtocolTest doesn't take this into consideration and has to be forced into UTF8 mode.

          I'm now running regressions for these patches.

          Show
          Tiago R. Espinha added a comment - Attaching two patches: DERBY-4746 _p2-impl.diff - puts in place all the switches necessary on the server-side to enable and disable UTF8 support. DERBY-4746 _p2-tests.diff - contains the protocol tests and changes to the framework required for the UTF8 protocol tests. I've added a new command to the protocol tests (deleteDatabase) so that these tests can cleanup after themselves. The UTF8 protocol tests try to create a database with UTF8 characters and as such they need to be able to delete it afterwards. Kathey suggested the cleanup be done on the tearDown in ProtocolTest.java but from what I could see, this is not a very good solution as the ProtocolTest actually has several fixtures (one per test in the protocol.tests file) and this way we'd be trying to delete the UTF8 database after every single fixture (there are about 150 currently). There's also a caveat about these tests. In the UTF8 fixtures, we have the following string '%UTF8TestString%'. This is then replaced with a few Chinese characters in ProtocolTest.java and it's just a workaround to be able to pass UTF8 characters on to the server (as the ProtocolTest tokenizer does not deal well with characters in unicode representation \uXXXX). Lastly, I've implemented the command 'switchToUtf8CcsidManager' in the protocol tests that essentially tells the framework to switch its DDMWriter and Reader to UTF8. Normally, a client would negotiate this through the MGRLVLS but the ProtocolTest doesn't take this into consideration and has to be forced into UTF8 mode. I'm now running regressions for these patches.
          Hide
          Tiago R. Espinha added a comment -

          This new patch includes Kathey's recommendation. I've ran suites.All with no failures.

          Show
          Tiago R. Espinha added a comment - This new patch includes Kathey's recommendation. I've ran suites.All with no failures.
          Hide
          Kathey Marsden added a comment - - edited

          Hi Tiago,

          In Utf8CcsidManager.convertToJavaString I think it would be more efficient to just use.
          String(byte[] bytes, int offset, int length, String charsetName)
          Constructs a new String by decoding the specified subarray of bytes using the specified charset.

          Instead of creating a new byte array and copying into it.

          The rest looks good.

          Show
          Kathey Marsden added a comment - - edited Hi Tiago, In Utf8CcsidManager.convertToJavaString I think it would be more efficient to just use. String(byte[] bytes, int offset, int length, String charsetName) Constructs a new String by decoding the specified subarray of bytes using the specified charset. Instead of creating a new byte array and copying into it. The rest looks good.
          Hide
          Tiago R. Espinha added a comment -

          Regressions ran without failures. Marking as patch available. Please review and commit.

          Soon I'll be posting two more patches that roll in the remaining changes for server-side support and additional protocol tests to ensure the feature is working.

          Show
          Tiago R. Espinha added a comment - Regressions ran without failures. Marking as patch available. Please review and commit. Soon I'll be posting two more patches that roll in the remaining changes for server-side support and additional protocol tests to ensure the feature is working.
          Hide
          Tiago R. Espinha added a comment -

          Made a small change to the Utf8CcsidManagerTest to accommodate for the fact that the convertToJavaString takes the offset and numCount in bytes and not characters.

          Running regressions now.

          Show
          Tiago R. Espinha added a comment - Made a small change to the Utf8CcsidManagerTest to accommodate for the fact that the convertToJavaString takes the offset and numCount in bytes and not characters. Running regressions now.
          Hide
          Tiago R. Espinha added a comment -

          This patch is a small incremental patch on what has been done for DERBY-728 to accommodate the server implementation. It does not yet insert any switch to the UTF8 encoding. Following is a walkthrough the changes:

          • DDMWriter.java - When writing a scalar string, getting the length() off a string is no longer correct. When using EBCDIC this worked as ASCII characters in Java Strings take just 1 byte. However, what we require is the byte length and this differs based on the current ccsidManager (EbcdicCcsidManager maintains the old behavior).
          • CcsidManager.java - Added two constants that represent the manager level for each of the CCSIDs. Note that our current implementation has EBCDIC at level 0, where as the specification mentions that it should be at level 500. I'm ignoring this for now as it doesn't interfere with the implementation (I check whether the level is 1208, if it isn't then it is EBCDIC) but we might want to fix this in the future.

          The CcsidManager also gets a new abstract method called getByteLength(String str) that takes a String and returns the byte length of that String, using the method specified by each CCSID manager.

          • Utf8CcsidManager.java - Converting to a Java String from bytes now has a different behavior. After experimenting with the protocol tests I've noticed that the offset and numToConvert are given in bytes and not characters.
          Show
          Tiago R. Espinha added a comment - This patch is a small incremental patch on what has been done for DERBY-728 to accommodate the server implementation. It does not yet insert any switch to the UTF8 encoding. Following is a walkthrough the changes: DDMWriter.java - When writing a scalar string, getting the length() off a string is no longer correct. When using EBCDIC this worked as ASCII characters in Java Strings take just 1 byte. However, what we require is the byte length and this differs based on the current ccsidManager (EbcdicCcsidManager maintains the old behavior). CcsidManager.java - Added two constants that represent the manager level for each of the CCSIDs. Note that our current implementation has EBCDIC at level 0, where as the specification mentions that it should be at level 500. I'm ignoring this for now as it doesn't interfere with the implementation (I check whether the level is 1208, if it isn't then it is EBCDIC) but we might want to fix this in the future. The CcsidManager also gets a new abstract method called getByteLength(String str) that takes a String and returns the byte length of that String, using the method specified by each CCSID manager. Utf8CcsidManager.java - Converting to a Java String from bytes now has a different behavior. After experimenting with the protocol tests I've noticed that the offset and numToConvert are given in bytes and not characters.

            People

            • Assignee:
              Tiago R. Espinha
              Reporter:
              Tiago R. Espinha
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development