DERBY-4591

Documentation needed for global case-insensitive setting (DERBY-1748)

    Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.6.1.0
    • Fix Version/s: 10.6.1.0
    • Component/s: Documentation
    • Labels:
      None

      Description

      The new feature implemented by DERBY-1748 needs to be documented. It appears that the following topics will need to be changed, at least:

      Reference Manual: collation=collation attribute

      Developer's Guide: "Creating a database with territory-based collation", "Character-based collation in Derby".

      It's pretty hard to tell from all the discussion under DERBY-1748 exactly what has changed and what the new feature does, so a functional spec for that issue that describes the new feature clearly would be very helpful.

      1. tdevdvlpcollation.html
        5 kB
        Kim Haase
      2. DERBY-4591-4.diff
        0.8 kB
        Kim Haase
      3. rrefattribcollation.html
        6 kB
        Kim Haase
      4. DERBY-4591-2.diff
        2 kB
        Kim Haase
      5. DERBY-4591.zip
        8 kB
        Kim Haase
      6. DERBY-4591.diff
        9 kB
        Kim Haase
      7. DERBY-4591.zip
        8 kB
        Kim Haase
      8. DERBY-4591.stat
        0.1 kB
        Kim Haase
      9. DERBY-4591.diff
        9 kB
        Kim Haase

          Activity

          Kim Haase added a comment -

          Closing, since changes have appeared in Latest Alpha Manuals.

          Kim Haase added a comment -

          Committed patch DERBY-4591-4.diff to documentation trunk at revision 937540. (There was no DERBY-4591-3.diff; I'm afraid I lost count.)

          Kim Haase added a comment -

          Using à with the correct LANG setting does work for both HTML and PDF, as you said (I think I knew this at some point but forgot!).

          So I'm attaching DERBY-4591-4.diff and the output file, tdevdvlpcollation.html. Will commit soon.

          Kim Haase added a comment -

          Thanks, Knut, for the reminder about the environment.

          I think this may be hopeless – the same thing happens both with the simple text editor I usually use (the one in CDE — very old) and NetBeans. There seems to be no way to control what format the editor saves in. I don't think it looks at the header at all.

          In both editors I used Compose-a-backtick to create what in the editor looked like an a with a grave accent, but in my terminal with the LANG variable set to en_US.UTF-8, it turned into an A with a tilde in a Solaris terminal window and a Greek lowercase alpha in a Windows command prompt window (when I did a grep for another word in the line that contained it). And of course I still get the "Invalid byte 2 of 3-byte UTF-8 sequence" error.

          Knut Anders Hatlen added a comment -

          Hi Kim. I don't see this error if I add 'à' and save the file with UTF-8 encoding. I do see the same error if I save the file with ISO-8859-1 encoding. My text editor (Emacs) automatically picks the correct file encoding based on the header in the XML file, which says:

          <?xml version="1.0" encoding="utf-8"?>

          If you cannot get your editor to save as UTF-8, you may change the header to say ISO-8859-1 instead, and it should work.

          As to using escape sequences, the character showed up just fine in the PDF version when I tried. What's the locale you're using? According to http://db.apache.org/derby/manuals/dita.html#Setting+up+your+environment, you need to set the LANG variable to en_US.UTF-8.
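          The encoding mismatch described in this comment can be reproduced outside the DITA build. The following is a minimal Java sketch, not part of the Derby toolchain; the class name EncodingMismatchDemo and the sample text are made up for illustration. In ISO-8859-1 the character 'à' is the single byte 0xE0, and in UTF-8 that byte value announces a three-byte sequence, which is consistent with the "Invalid byte 2 of 3-byte UTF-8 sequence" message when a Latin-1 file is parsed as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Derby or the DITA toolkit.
public class EncodingMismatchDemo {

    /** Returns true if the bytes decode as well-formed UTF-8 (strict, no replacement). */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "\u00e0" is 'à'; the escape keeps this source file pure ASCII.
        String text = "\u00e0 la carte";

        byte[] savedAsLatin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] savedAsUtf8 = text.getBytes(StandardCharsets.UTF_8);

        // Latin-1 stores 'à' as one byte (0xE0); UTF-8 stores it as two (0xC3 0xA0).
        System.out.printf("Latin-1 first byte: 0x%02X%n", savedAsLatin1[0] & 0xFF);
        System.out.printf("UTF-8 first bytes:  0x%02X 0x%02X%n",
                savedAsUtf8[0] & 0xFF, savedAsUtf8[1] & 0xFF);

        // Only the UTF-8 bytes survive a strict UTF-8 decode.
        System.out.println("Latin-1 bytes valid as UTF-8? " + isValidUtf8(savedAsLatin1));
        System.out.println("UTF-8 bytes valid as UTF-8?   " + isValidUtf8(savedAsUtf8));
    }
}
```

          Either remedy mentioned above removes the mismatch: saving the file as UTF-8 so the bytes match the declared encoding, or changing the declaration so it matches the bytes.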

          Kim Haase added a comment -

          Putting the à character in the source file works, but when I try to build the book I get a DITA error:

          [pipeline] [Fatal Error] tdevdvlpcollation.dita:54:13: Invalid byte 2 of 3-byte UTF-8 sequence.
          [pipeline] org.xml.sax.SAXParseException: Invalid byte 2 of 3-byte UTF-8 sequence.

          Does anyone know how to solve this problem? If not, we may as well leave the Dev Guide topic as is.

          Kim Haase added a comment -

          Very sorry, one more thing. I was going to add that the reason I did not adopt the second suggestion is that the escape sequences look fine in the output HTML docs but show up as just "#" in the PDF. And I didn't think I had a way to type letters with accent marks directly in my environment – but I just experimented with the "Compose" key on my keyboard and I think I figured it out. So what the heck. Patch 3 coming up.

          Kim Haase added a comment -

          Committed patch DERBY-4591-2.diff to documentation trunk at revision 936999.

          Kim Haase added a comment -

          Attaching DERBY-4591-2.diff and rrefattribcollation.html, with changes to the collation attribute topic of the Reference Manual. I changed the wording of the suggested changes so that the order and syntax are parallel.

          I'll commit this soon.

          Kim Haase added a comment -

          Thanks for the additional comments, Knut. It would be good to get this info into the documentation, so I'm reopening the issue to make that change.

          Knut Anders Hatlen added a comment -

          Hi Kim,
          The docs look good. Thanks for writing them!

          I've got a couple of ideas for some minor improvements:

          1) Reference manual: In the list of valid attribute values, I think it would be good to add one sentence explaining the typical meaning of the different strengths. For example:

          • PRIMARY typically means that only differences in the base letter are considered significant; case and accents are not considered significant.
          • SECONDARY typically means that case differences are not considered significant, whereas differences in base letters or accents are significant.
          • TERTIARY typically means that differences in the base letter, accents or case are all considered significant.
          • IDENTICAL means that all differences are considered significant.

          2) tdevdvlpcollation: The "'a' with grave" part could be written as "'à' ('a' with grave)" for clarity. (The escape sequence for "à" is "&#224;", if you would like to avoid non-ASCII characters in the source file, although it should be fine to type the character directly and save as UTF-8; see the declaration of encoding in the header.)

          Otherwise, the changes look great.
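          The four strengths listed in this comment carry the same names as the strength constants of java.text.Collator. The sketch below uses that class directly to show the "typical" behavior described above; it does not touch a Derby database, and the class name CollationStrengthDemo, the helper equalAt, and the choice of Locale.US are assumptions made for illustration. Results can differ for other locales, which is why the descriptions say "typically".

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class; uses only java.text.Collator, not Derby itself.
public class CollationStrengthDemo {

    /** Returns true if the two strings compare as equal at the given Collator strength. */
    static boolean equalAt(int strength, String left, String right) {
        Collator collator = Collator.getInstance(Locale.US); // locale is an assumption
        collator.setStrength(strength);
        return collator.compare(left, right) == 0;
    }

    public static void main(String[] args) {
        int[] strengths = {
            Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY, Collator.IDENTICAL
        };
        String[] names = {"PRIMARY", "SECONDARY", "TERTIARY", "IDENTICAL"};

        // "\u00e0" is 'à' ('a' with grave), written as an escape to keep the source ASCII.
        for (int i = 0; i < strengths.length; i++) {
            System.out.printf("%-9s  a = A: %-5b  a = \u00e0: %-5b  a = b: %b%n",
                    names[i],
                    equalAt(strengths[i], "a", "A"),       // case difference
                    equalAt(strengths[i], "a", "\u00e0"),  // accent difference
                    equalAt(strengths[i], "a", "b"));      // base-letter difference
        }
    }
}
```

          With this locale, a case difference should stop mattering at SECONDARY and below, an accent difference only at PRIMARY, and a base-letter difference should matter at every strength.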

          Kim Haase added a comment -

          Committed patch DERBY-4591.diff to documentation trunk at revision 932066.

          Feel free to reopen this issue if more work is needed.

          Kim Haase added a comment -

          I plan to commit this patch in a couple of working days if I don't get any requests for changes.

          Kim Haase added a comment -

          Attaching revised versions of DERBY-4591.diff and DERBY-4591.zip to include Gunnar's suggestion about mentioning compatibility with other databases.

          Kim Haase added a comment -

          Attaching DERBY-4591.diff, DERBY-4591.stat, and DERBY-4591.zip, with changes to two collation topics in the Developer's Guide and one in the Reference Manual. In addition to the changes noted in the files refman.txt and devguide.txt attached to DERBY-1748, I made some corrections in the "Character-based collation in Derby" topic, and added some cross-references among the three topics.

          Thanks in advance for feedback!


            People

            • Assignee: Kim Haase
            • Reporter: Kim Haase