JSPWiki / JSPWIKI-666

Related to UTF-8 file naming notations

    Details

    • Type: Wish
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      When I create pages with Chinese titles, for example a page titled '維基中文測試', JSPWiki stores the wiki page as '%E7%B6%AD%E5%9F%BA%E4%B8%AD%E6%96%87%E6%B8%AC%E8%A9%A6.txt'.

      Because of this we are facing the following problems:
      1. It is hard to identify the pages in the back end.
      2. File names become too long.
      3. Zipping the folder that contains the longer names gives an error.

      Could someone suggest an alternative way to store page titles, specifically Chinese ones, and say whether the file could be stored under the same name as the title, i.e. '維基中文測試.txt'?

      We need a solution for this fairly urgently. Thanks in advance.

        Activity

        P Saraswathi Sailaja created issue -
        Harry Metske added a comment -

        You did not specify what you are using:

        • the OS
        • Java version
        • which container (Tomcat ?)
        • JSPWiki version
        • the codepage you are running, see also jspwiki.encoding property in jspwiki.properties

        Page names are URL-encoded to prevent problems when people try to create pages with special characters in the name, such as "/", " " or "<".

        Maybe we could encode it a bit smarter and allow Chinese characters as page names, but I'm not sure if this will break other components in JSPWiki.
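
The file name in the description can be reproduced with the JDK alone. The following is a minimal sketch, assuming plain `java.net.URLEncoder` with UTF-8; `PageNameEncoding` and `toFileName` are hypothetical names for illustration, not JSPWiki's own code, which may differ in detail.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustrative helper (hypothetical, not part of JSPWiki). */
final class PageNameEncoding {
    private PageNameEncoding() {}

    /** Percent-encodes a page title as UTF-8 and appends the ".txt" suffix. */
    static String toFileName(String pageTitle) {
        try {
            // Every byte outside the unreserved ASCII set becomes "%XX";
            // a space becomes "+".
            return URLEncoder.encode(pageTitle, "UTF-8") + ".txt";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

Each of the six characters in '維基中文測試' takes three bytes in UTF-8, and each byte is written as three characters ("%XX"), so the six-character title turns into a 58-character file name including the ".txt" suffix. That growth is what makes the names hard to recognise and awkward for some archiving tools.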

        Harry Metske made changes -
        Field: Priority
        Original Value: Major [ 3 ]
        New Value: Minor [ 4 ]
        P Saraswathi Sailaja added a comment -

        Thanks, Metske, for your response.
        Please find the details below:

        1. OS: Windows
        2. Java version: 1.5
        3. Tomcat 5.5
        4. JSPWiki 2.6.1
        5. In the jspwiki.properties file: jspwiki.encoding = UTF-8

        Please suggest whether there is any possible solution.
        Thanks in advance.

        Harry Metske added a comment -

        I have been thinking a bit more about it, and I don't think it's a good idea to change this.
        Even if (and I do say if) we could change it without breaking other components, we would still break existing JSPWiki installations: if you already have a wiki with a lot of pages, upgrading would break horribly.

        JSPWiki 3.0 is another story as well: with the JCR repository (in the default shipped implementation, Priha), you will get even longer file names.

        Any other opinions?
        If not, I would like to propose closing this as Won't Fix.

        Murray Altheim added a comment -

        Hi Harry,

        While I realise that there are some significant issues with this, I believe the only real, long-term
        solution for this type of problem is to stop tying the page name to the file or record storage. In
        essence, what we're currently doing is using what is a metadata name as a record identifier.
        As this JIRA issue suggests, there are too many internationalisation and record management
        issues with the current approach.

        Breaking the connection between name and identifier would free up JSPWiki to provide
        multi-lingual page names, contextualised naming, etc. and the backends would be able to
        guarantee unique page identifiers and would be free to use these as persistent identifiers
        (permalinks) for pages. The JSPWiki system I'd developed (i.e., the Topic Map based
        "Assertion Framework") used wiki pages as terms in a user-created ontology, where a page
        name change would potentially break an assertion. That kind of stability is important.

        I understand this could only be applied in the 3.0 code, but I strongly encourage the project
        to make that break (if that decision hasn't already been made) at that time. I agree that it's
        not possible with the 2.8.x code.

        Cheers,

        Murray
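
The separation of page name and storage identifier proposed above can be pictured with a small in-memory sketch. Everything here is an assumption made for illustration: `PageRegistry` and its methods are hypothetical names, the identifier is a random UUID, and a real backend would have to persist the mapping. It is not how any JSPWiki version is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical registry that keeps page titles as metadata of an opaque id. */
final class PageRegistry {
    // Maps the stable identifier to the current, freely changeable title.
    private final Map<String, String> titleById = new HashMap<String, String>();

    /** Registers a page and returns its stable identifier. */
    String create(String title) {
        String id = UUID.randomUUID().toString();
        titleById.put(id, title);
        return id;
    }

    /** Changes the title only; the identifier and file name stay the same. */
    void rename(String id, String newTitle) {
        if (!titleById.containsKey(id)) {
            throw new IllegalArgumentException("Unknown page id: " + id);
        }
        titleById.put(id, newTitle);
    }

    String titleOf(String id) {
        return titleById.get(id);
    }

    /** The file name depends on the id alone, so it is always plain ASCII. */
    String fileNameOf(String id) {
        return id + ".txt";
    }
}
```

With this layout a title such as '維基中文測試' never reaches the file system, the file name has a fixed length, and a rename does not move the file, so links that use the identifier keep working.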

        Harry Metske added a comment -

        Completely agree, it would also free us from the numerous page renaming issues we have in 2.8.

        P Saraswathi Sailaja added a comment -

        Thanks for your input and suggestions regarding UTF-8 Chinese characters.

        Could you please advise on the scenario below as well?

        When I try to replace special characters with their codes for French Canadian, the apostrophe (') is not accepted in the page title. The wiki does not recognise this special character.

        For example: Afficher+l%E2%80%99offre+centres+de+carri%C3%A8res.txt
        Here the code for the apostrophe (') is %E2%80%99.
        The page title is shown as "Afficher l Offre centres de carrières" instead of "Afficher l' Offre centres de carrières".
        We need a solution or a workaround for this. Please treat this as urgent. Thanks.
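
Decoding the file name from this comment shows which character is actually stored. This is a minimal sketch using the JDK's `java.net.URLDecoder`; `PageNameDecoding` and `toPageTitle` are hypothetical names, and the sketch does not reproduce whatever additional cleaning JSPWiki applies to page names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Illustrative helper (hypothetical, not part of JSPWiki). */
final class PageNameDecoding {
    private PageNameDecoding() {}

    /** Strips a trailing ".txt" and reverses the UTF-8 percent-encoding. */
    static String toPageTitle(String fileName) {
        String encoded = fileName.endsWith(".txt")
                ? fileName.substring(0, fileName.length() - ".txt".length())
                : fileName;
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

The bytes E2 80 99 decode to U+2019, the right single quotation mark, which is a different character from the ASCII apostrophe U+0027. If the wiki treats the two differently, that may be relevant to the title shown above, although the thread does not confirm the cause.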


          People

          • Assignee: Unassigned
          • Reporter: P Saraswathi Sailaja
          • Votes: 0
          • Watchers: 0
