Commons Configuration
CONFIGURATION-13

[configuration] XMLConfiguration ignores the encoding specified in the XML declaration

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: All

      Description

      I found that, when using ConfigurationFactory, XMLConfiguration does not honor
      the encoding specified in the XML declaration, because the method
      "load(Reader in)" is always called. As a result, only one encoding, the
      OS-dependent default, can be used for all XML files.

      After finding this problem, I modified the implementation to create the
      org.w3c.dom.Document by calling
      "DocumentBuilder#parse(new InputSource(InputStream in))" to avoid garbled
      characters, and I confirmed that the encoding is then recognized.

      I suggest changing the inner class FileConfigurationDelegate in
      XMLConfiguration to override "load(InputStream in)" and call
      "XMLConfiguration.this.load(InputStream in);", and changing the method
      "load(InputStream in)" in XMLConfiguration to call
      "DocumentBuilder#parse(new InputSource(InputStream in))".
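
The difference described above can be illustrated with a minimal sketch that uses only the JDK's XML parser. It is not the code from the report or from the eventual fix; the class name XmlEncodingDemo and the sample document are assumptions made for illustration. When the parser receives a byte stream it reads the encoding from the XML declaration, whereas a Reader has already decoded the bytes, so the declaration is ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class XmlEncodingDemo {

    /** Sample document declared and encoded as ISO-8859-1; e-acute is the single byte 0xE9. */
    public static byte[] sampleBytes() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>caf\u00e9</root>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Parses the source and returns the text content of the root element. */
    private static String parse(InputSource source) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(source).getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    /** Byte stream: the parser detects the encoding from the XML declaration. */
    public static String parseFromStream(byte[] bytes) {
        return parse(new InputSource(new ByteArrayInputStream(bytes)));
    }

    /** Character stream: the bytes are decoded with the given charset before the parser sees them. */
    public static String parseFromReader(byte[] bytes, Charset charset) {
        return parse(new InputSource(new InputStreamReader(new ByteArrayInputStream(bytes), charset)));
    }

    public static void main(String[] args) {
        byte[] bytes = sampleBytes();
        // The stream variant yields the original text.
        System.out.println("stream ok: " + parseFromStream(bytes).equals("caf\u00e9"));
        // The reader variant decodes with the wrong charset and garbles the last character.
        System.out.println("reader ok: " + parseFromReader(bytes, StandardCharsets.UTF_8).equals("caf\u00e9"));
    }
}
```

In this sketch UTF-8 stands in for "whatever the platform default happens to be"; the byte 0xE9 is not valid UTF-8 on its own, so the Reader replaces it with U+FFFD before parsing starts.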

        Activity

        Oliver Heger added a comment -

        I added a test case based on the test files provided in the first attachment
        (many thanks again). So this issue can now be closed.

        Oliver Heger added a comment -

        (In reply to comment #11)
        >
        > How can I specify the encoding of a properties file in configuration factory
        > config.xml? Or does it recognise the encoding automatically? (Well it's not like
        > XML which can specify the encoding at the beginning of the XML document.)
        >

        You can simply set an encoding attribute as demonstrated in the following example:

        <configuration>
        <properties fileName="test.properties" encoding="UTF-8"/>
        <properties fileName="test.properties.xml"/>
        <xml fileName="test.xml" encoding="ISO-8859-1"/>
        </configuration>

        (This works because the configuration definition files are read by
        Commons-Digester, which maps attributes to bean properties. So in this case the
        setEncoding() method of AbstractFileConfiguration will be called. This works for
        other properties, too.)
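
The attribute-to-property mapping mentioned above can be sketched with the JDK's java.beans introspection. This is only an illustration of the general idea, not Commons-Digester's actual implementation; the classes AttributeMappingSketch and FileSource are hypothetical stand-ins, and only String-valued properties are handled.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

public class AttributeMappingSketch {

    /** Hypothetical bean standing in for a file-based configuration. */
    public static class FileSource {
        private String fileName;
        private String encoding;

        public String getFileName() { return fileName; }
        public void setFileName(String fileName) { this.fileName = fileName; }
        public String getEncoding() { return encoding; }
        public void setEncoding(String encoding) { this.encoding = encoding; }
    }

    /** Calls the setter of every String bean property that has a matching attribute name. */
    public static void apply(Object bean, Map<String, String> attributes) {
        try {
            BeanInfo info = Introspector.getBeanInfo(bean.getClass());
            for (PropertyDescriptor property : info.getPropertyDescriptors()) {
                String value = attributes.get(property.getName());
                Method setter = property.getWriteMethod();
                if (value != null && setter != null && property.getPropertyType() == String.class) {
                    setter.invoke(bean, value);
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not apply attributes", e);
        }
    }

    public static void main(String[] args) {
        // Attributes as they would appear on <xml fileName="test.xml" encoding="ISO-8859-1"/>
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("fileName", "test.xml");
        attributes.put("encoding", "ISO-8859-1");

        FileSource source = new FileSource();
        apply(source, attributes);
        System.out.println(source.getFileName() + " / " + source.getEncoding());
    }
}
```

Under this scheme any attribute whose name matches a writable property reaches its setter, which is why the same mechanism extends to properties other than the encoding.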

        Alex added a comment -

        I'm quite interested in using commons configuration.

        Our situation is as follows:
        1) We are running WebSphere on IBM zOS.
        2) The default encoding of WebSphere is ISO-8859-1
        3) The properties file is in IBM zOS file system and encoded in IBM-1047.

        How can I specify the encoding of a properties file in configuration factory
        config.xml? Or does it recognise the encoding automatically? (Well it's not like
        XML which can specify the encoding at the beginning of the XML document.)

        Emmanuel Bourg added a comment -

        Thank you for the feedback. I'll leave this bug open until I commit a test case
        covering this issue.

        kunihara tetsuya added a comment -

        Sorry it's taken me so long to reply.

        I got the binary of 2005/04/12 nightly build,
        and made sure it works well.

        Thank you so much for your help.

        Emmanuel Bourg added a comment -

        You are right, I submitted a fix in the current SVN tree, let me know how it
        works for you.

        kunihara tetsuya added a comment -

        Thank you so much for your support.

        However, the patch does not seem to modify FileConfigurationDelegate in
        XMLConfiguration. Because AbstractFileConfiguration#load(URL url) calls
        FileConfigurationDelegate#load(InputStream), XMLConfiguration#load(InputStream)
        will not be called when I use ConfigurationFactory#getConfiguration().
        Here is a sample. I'm sorry that it is not a patch...

        private class FileConfigurationDelegate extends AbstractFileConfiguration {
            public void load(InputStream in) throws ConfigurationException // add this
            { // add this
                UseXmlEncodingHierarchicalXMLConfiguration.this.load(in); // add this
            } // add this

            public void load(Reader in) throws ConfigurationException

        Emmanuel Bourg added a comment -

        Created an attachment (id=14592)
        Fix for XMLConfiguration

        This is a simple patch that should solve this issue for XMLConfiguration, I
        haven't tested it thoroughly yet.

        Also we will have to check that the value returned by getEncoding() is
        consistent with the actual encoding of the file.
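
One way such a consistency check could look is sketched below, using only the JDK's DOM Level 3 API (Document#getXmlEncoding()). It is an assumption about how the check might be done, not the check that was committed; the class DeclaredEncodingCheck and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclaredEncodingCheck {

    /** Sample document declared and encoded as ISO-8859-1. */
    public static byte[] sample() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>caf\u00e9</root>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Returns the encoding named in the XML declaration, or null if none is declared. */
    public static String declaredEncoding(byte[] xmlBytes) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return document.getXmlEncoding();
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    /** Compares a configured encoding name with the declared one, tolerating charset aliases. */
    public static boolean isConsistent(String configuredEncoding, byte[] xmlBytes) {
        String declared = declaredEncoding(xmlBytes);
        if (declared == null || configuredEncoding == null) {
            return false;
        }
        try {
            return Charset.forName(declared).equals(Charset.forName(configuredEncoding));
        } catch (IllegalArgumentException e) {
            // Unknown or malformed charset name on either side.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] xml = sample();
        System.out.println("declared: " + declaredEncoding(xml));
        System.out.println("consistent with UTF-8: " + isConsistent("UTF-8", xml));
    }
}
```

Comparing Charset objects rather than raw strings avoids false mismatches between spellings such as "ISO-8859-1" and "iso-8859-1".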

        Emmanuel Bourg added a comment -

        We will have the same issue with all configuration formats supporting an
        encoding declaration inside the file, that's all XML based formats (XML
        properties, XML property list), YAML and OGDL.

        kunihara tetsuya added a comment -

        Sorry, I made a mistake.
        Line 18 of TestMain.java,
        factory.setConfigurationURL(new File("config.xml").toURL());
        should be replaced with:
        factory.setConfigurationURL(new File("test/config.xml").toURL());

        kunihara tetsuya added a comment -

        Created an attachment (id=14590)
        sample configuration files and test source codes

        kunihara tetsuya added a comment -

        OK, here are the sample configuration files and test source code for ver1.0.

        The configuration files are:
        ./test/config.xml : main config file
        ./test/testConf1.xml : ISO-8859-1
        ./test/testConf2.xml : UTF-8 (but the content includes ISO-8859-1 characters)
        ./test/testConf3.xml : UTF-16

        When I use "TestMain.java", an error occurs on reading "testConf3.xml";
        its stack trace is in "./stacktrace.txt".

        "TestMain_UseXmlEncoding.java", "UseXmlEncodingConfigurationFactory.java" and
        "UseXmlEncodingHierarchicalXMLConfiguration.java" are our implementations to
        avoid this problem. The main config file for them is "config_UseXmlEncoding.xml".
        Of course, the output of this program is correct:

        hoge : test1_hoge
        moge : test2_moge
        yoge : test3_yoge

        In Japan, we have six or more encodings that represent Japanese:
        Windows-31J (on Windows), EUC-JP (on Linux), iso-2022-jp (in e-mail),
        UTF-8 (Java), UTF-16 and Shift-JIS (nearly equal to Windows-31J)....

        Oliver Heger added a comment -

        Is it possible for you to attach a short test configuration file which needs a
        special encoding? This would help us to verify if a fix works.

        Thanks.


          People

          • Assignee: Unassigned
          • Reporter: kunihara tetsuya