Details

      Description

      Currently the drda code in server and client assumes that the byte length of ddm parameters is equal to the character length. In the fix for DERBY-728, ddm parameters such as RDBNAM will be in UTF-8 and the character and byte length may not match. The code needs to allow for this.

      The primary purpose of this Jira is to enforce the DRDA length checking which is in bytes. For example for RDBNAM (database name), the limit is 255 bytes in length, not 255 characters. The limits are somewhat arbitrary and sad in my opinion. Certainly for Derby there should be no problem removing the limits, except for the DRDA spec constraint. With DERBY-728 and the introduction of UNICODEMGR, characters can take up to 4 bytes, so the calculation is more difficult.

      (I actually tried to sneak removing or expanding the limits in the ACR 7007 for UNICODEMGR but was told, rightfully so, that such a proposal needed to be a separate ACR. I am a bit concerned that especially since database names can be full paths we may rapidly exceed the limits)

      1. derby-4009_a_diff.txt
        10 kB
        Kathey Marsden
      2. derby-4009_sample2_diff.txt
        3 kB
        Kathey Marsden
      3. DERBY-4009.diff
        5 kB
        Tiago R. Espinha
      4. DERBY-4009.diff
        3 kB
        Tiago R. Espinha
      5. derby-4009-sample_diff.txt
        2 kB
        Kathey Marsden

        Activity

        Hide
        Tiago R. Espinha added a comment -

        Regressions ran fine, closing the issue.

        Show
        Tiago R. Espinha added a comment - Regressions ran fine, closing the issue.
        Hide
        Tiago R. Espinha added a comment -

        Marking as patch available.

        Show
        Tiago R. Espinha added a comment - Marking as patch available.
        Hide
        Tiago R. Espinha added a comment -

        There's a small problem with the patch, which lead to the following failure:

        There was 1 failure:
        1) test_errorcode(org.apache.derbyTesting.functionTests.tests.lang.ErrorCodeTest
        )junit.framework.AssertionFailedError: Unexpected row count expected:<161> but w
        as:<163>

        That test basically asserts all the possible error messages that the system can through. Since these messages are in fact hardcoded into the test, it wasn't expecting the two additional messages that I added.

        I've added the lines to the ErrorCodeTest and the test passes by itself now (it wouldn't before), so I think it's safe to say that it is ready for commit.

        Show
        Tiago R. Espinha added a comment - There's a small problem with the patch, which lead to the following failure: There was 1 failure: 1) test_errorcode(org.apache.derbyTesting.functionTests.tests.lang.ErrorCodeTest )junit.framework.AssertionFailedError: Unexpected row count expected:<161> but w as:<163> That test basically asserts all the possible error messages that the system can through. Since these messages are in fact hardcoded into the test, it wasn't expecting the two additional messages that I added. I've added the lines to the ErrorCodeTest and the test passes by itself now (it wouldn't before), so I think it's safe to say that it is ready for commit.
        Hide
        Kathey Marsden added a comment -

        I think the patch looks fine. If regressions pass I'll commit. I will take a look back at writeScalarString and try to understand it. Is this code I added or preexisting?

        Show
        Kathey Marsden added a comment - I think the patch looks fine. If regressions pass I'll commit. I will take a look back at writeScalarString and try to understand it. Is this code I added or preexisting?
        Hide
        Tiago R. Espinha added a comment -

        I'm attaching a small patch that should convert the remaining commands to use the new writeScalarString method.

        It needs to be noted that I didn't change a few of the commands I initially mentioned:

        • PRDID - it is already using the new method implicitly
        • SRVCLSNM - same as above
        • SRVRLSLV - same as above
        • TYPDEFNAM - there doesn't seem to be a length restriction on this command

        This patch is NOT intended for inclusion as I have not run regressions yet.

        Show
        Tiago R. Espinha added a comment - I'm attaching a small patch that should convert the remaining commands to use the new writeScalarString method. It needs to be noted that I didn't change a few of the commands I initially mentioned: PRDID - it is already using the new method implicitly SRVCLSNM - same as above SRVRLSLV - same as above TYPDEFNAM - there doesn't seem to be a length restriction on this command This patch is NOT intended for inclusion as I have not run regressions yet.
        Hide
        Tiago R. Espinha added a comment -

        I was going through the derby/client/net/Request class and I have some questions.

        @ Kathey, should I rely on the code you have provided for the writeScalarString() method? You mentioned some performance concerns and I have been trying to understand the code without much luck. Maybe you can help me here.

        For example, why do we always shift the codePoint right by 8 bits? Also, we do a «logical and» with binary 255 twice, does this have a reason? If we «AND» a byte against 255 doesn't that always give us the initial byte?

        We also have some hardcoded values that I would like to convert into constants, or at least put some inline comments explaining exactly what they are. For instance, at some point we increment the offset by 2 and before this we also invoke ensureLength() by adding the offset_, maxByteLength and 4. Why 4?

        Finally, I'll be submitting a patch later today that addresses EXTNAM, PRDID, SRVCLSNM, SRVNAM, SRVRLSLV and TYPDEFNAM as these seem to be the only codepoints that actually write strings of variable size. There are fields like RDBACCCL that just write bytes or words and to my understanding, these will be left untouched, as these are not encoding strings into bytes.

        Show
        Tiago R. Espinha added a comment - I was going through the derby/client/net/Request class and I have some questions. @ Kathey, should I rely on the code you have provided for the writeScalarString() method? You mentioned some performance concerns and I have been trying to understand the code without much luck. Maybe you can help me here. For example, why do we always shift the codePoint right by 8 bits? Also, we do a «logical and» with binary 255 twice, does this have a reason? If we «AND» a byte against 255 doesn't that always give us the initial byte? We also have some hardcoded values that I would like to convert into constants, or at least put some inline comments explaining exactly what they are. For instance, at some point we increment the offset by 2 and before this we also invoke ensureLength() by adding the offset_, maxByteLength and 4. Why 4? Finally, I'll be submitting a patch later today that addresses EXTNAM, PRDID, SRVCLSNM, SRVNAM, SRVRLSLV and TYPDEFNAM as these seem to be the only codepoints that actually write strings of variable size. There are fields like RDBACCCL that just write bytes or words and to my understanding, these will be left untouched, as these are not encoding strings into bytes.
        Hide
        Kathey Marsden added a comment -

        Attached is derby-4009_a_diff.txt that makes the changes for client RDBNAM, USERID, and PASSWD. There are more changes to go, but just want to check this in as an interim patch.

        Show
        Kathey Marsden added a comment - Attached is derby-4009_a_diff.txt that makes the changes for client RDBNAM, USERID, and PASSWD. There are more changes to go, but just want to check this in as an interim patch.
        Hide
        Kathey Marsden added a comment -

        attaching derby-4009_sample2_diff.txt that takes a different approach. This patch will call ensureLength() based on the largest possible converted byte length of the string. Then writes the bytes directly into the buffer and then pokes the length in in the end.
        This is similar to writeLDString in DDMWriter. I think this approach will have less of a performance impact. The only risk is that we might potentially expand the buffer unnecessarily, but since the buffer is big 32K and most DDM paramters are small, I don't think this is likely to occur. I think I prefer this approach.

        Show
        Kathey Marsden added a comment - attaching derby-4009_sample2_diff.txt that takes a different approach. This patch will call ensureLength() based on the largest possible converted byte length of the string. Then writes the bytes directly into the buffer and then pokes the length in in the end. This is similar to writeLDString in DDMWriter. I think this approach will have less of a performance impact. The only risk is that we might potentially expand the buffer unnecessarily, but since the buffer is big 32K and most DDM paramters are small, I don't think this is likely to occur. I think I prefer this approach.
        Hide
        Kathey Marsden added a comment -

        Marking patch available even though the patch is not for commit. I'm hoping to get feedback on whether this approach might cause a performance problem and if so what to do about it.

        Show
        Kathey Marsden added a comment - Marking patch available even though the patch is not for commit. I'm hoping to get feedback on whether this approach might cause a performance problem and if so what to do about it.
        Hide
        Kathey Marsden added a comment -

        Attached is a sample method change for this issue (derby-4009-sample_diff.txt). It is not for commit. There are other methods that need to be changed, but I wanted to post this one as a sample because I am a bit concerned about the approach. Instead of converting directly into the buffer, I am converting to a byte[] and then writing the bytes to the buffer. This is because I need to know the length of the byte array up front to call ensureLength(). I am concerned that this will have a performance impact. Does anyone have input on a more efficient way to change these methods that write the length before the bytes.

        Show
        Kathey Marsden added a comment - Attached is a sample method change for this issue (derby-4009-sample_diff.txt). It is not for commit. There are other methods that need to be changed, but I wanted to post this one as a sample because I am a bit concerned about the approach. Instead of converting directly into the buffer, I am converting to a byte[] and then writing the bytes to the buffer. This is because I need to know the length of the byte array up front to call ensureLength(). I am concerned that this will have a performance impact. Does anyone have input on a more efficient way to change these methods that write the length before the bytes.

          People

          • Assignee:
            Tiago R. Espinha
            Reporter:
            Kathey Marsden
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development