Tapestry / TAPESTRY-2525

Properties files in a message catalog should be read using UTF-8 encoding, rather than default encoding

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.13
    • Fix Version/s: 5.0.14
    • Component/s: tapestry-core
    • Labels:
      None

      Description

      Allow different encodings to be used for properties files so that native2ascii is not necessary. Possibly use the Reader-based APIs added to the Java library in 1.6, such as Properties.load(Reader) and the PropertyResourceBundle(Reader) constructor.
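
      For context, native2ascii rewrites every non-ASCII character as a \uXXXX escape so that the ISO-8859-1-only Properties.load(InputStream) can read the file. The following is a minimal sketch of the two routes, not Tapestry code: the class name, method names, and sample key are made up for illustration, and the Reader route assumes Java 1.6 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of Tapestry.
class EscapedVersusUtf8 {

    // Route 1: content already converted to ASCII with Unicode escapes (the
    // native2ascii output format), read by the ISO-8859-1 stream loader.
    static String loadEscaped(String escapedContent, String key) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(escapedContent.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty(key);
    }

    // Route 2: raw UTF-8 bytes decoded by a Reader and passed to Properties.load(Reader).
    static String loadUtf8(byte[] utf8Content, String key) {
        Properties p = new Properties();
        try {
            p.load(new InputStreamReader(new ByteArrayInputStream(utf8Content), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty(key);
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escapes literal, as they would appear in the file.
        String escaped = "greeting=Gr\\u00fc\\u00df Gott";
        byte[] raw = "greeting=Gr\u00fc\u00df Gott".getBytes(StandardCharsets.UTF_8);
        // Both routes should yield the same value.
        System.out.println(loadEscaped(escaped, "greeting").equals(loadUtf8(raw, "greeting"))); // prints true
    }
}
```

      With the Reader route the translator edits the file directly in UTF-8 and the conversion step disappears from the build.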

        Issue Links

          Activity

          Andy Blower created issue -
          Howard M. Lewis Ship added a comment -

          Tapestry 4 had a more elaborate system of meta-data used to determine the encoding when reading an individual properties file. Let's see how well it works when we just assume a UTF-8 encoding (which seems to read normal ASCII files quite well).
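
          The observation about ASCII files follows from UTF-8 being a superset of ASCII: bytes 0x00 to 0x7F decode to the same characters in both. A small sketch of that check, with a hypothetical class name and sample strings chosen only for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Tapestry.
class AsciiUnderUtf8 {

    // True when the bytes decode to the same text as ISO-8859-1 and as UTF-8.
    static boolean decodesIdentically(byte[] bytes) {
        String asLatin1 = new String(bytes, StandardCharsets.ISO_8859_1);
        String asUtf8 = new String(bytes, StandardCharsets.UTF_8);
        return asLatin1.equals(asUtf8);
    }

    public static void main(String[] args) {
        byte[] plainAscii = "submit-label=Save changes".getBytes(StandardCharsets.US_ASCII);
        // A raw ISO-8859-1 byte (0xFC, u with diaeresis) is not valid UTF-8 on its own.
        byte[] latin1Umlaut = "name=M\u00fcller".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(decodesIdentically(plainAscii));   // prints true
        System.out.println(decodesIdentically(latin1Umlaut)); // prints false
    }
}
```

          The second case suggests one caveat: a catalog that contains raw ISO-8859-1 bytes above 0x7F, rather than \uXXXX escapes, would be read differently once UTF-8 is assumed.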

          Howard M. Lewis Ship made changes -
          Field Original Value New Value
          Summary Allow different encoding for properties files Properties files in a message catalog should be read using UTF-8 encoding, rather than default encoding
          Assignee Howard M. Lewis Ship [ hlship ]
          Howard M. Lewis Ship made changes -
          Resolution Fixed [ 1 ]
          Fix Version/s 5.0.14 [ 12313214 ]
          Status Open [ 1 ] Closed [ 6 ]
          Andy Blower added a comment (edited) -

          Unfortunately, the fix doesn't work as implemented, Howard. Basically, the readStreamAsUTF8() method does nothing, because it reads the file as UTF-8 and then encodes it again as UTF-8. Even if the StringBuffer were encoded as ISO-8859-1 instead, that charset cannot express the characters, so this approach is not going to work as far as I can see.
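
          Both points can be reproduced in isolation. The sketch below does not reproduce readStreamAsUTF8() itself (its source is not shown in this issue); the recode() helper is a hypothetical stand-in for a "decode as UTF-8, then encode again" step, and the class name and sample key are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Properties;

// Hypothetical demo class, not part of Tapestry.
class ReencodingDemo {

    // Stand-in for the criticised step: decode as UTF-8, then encode with the target charset.
    static byte[] recode(byte[] utf8, Charset target) {
        return new String(utf8, StandardCharsets.UTF_8).getBytes(target);
    }

    // Properties.load(InputStream) always treats the bytes as ISO-8859-1.
    static String loadViaStream(byte[] content, String key) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty(key);
    }

    public static void main(String[] args) {
        // The value is three CJK characters, written as escapes to keep this source ASCII.
        String expected = "\u65e5\u672c\u8a9e";
        byte[] original = ("title=" + expected).getBytes(StandardCharsets.UTF_8);

        // 1. UTF-8 in, UTF-8 out: the bytes are unchanged, so the stream loader still misreads them.
        System.out.println(Arrays.equals(original, recode(original, StandardCharsets.UTF_8))); // prints true
        System.out.println(expected.equals(loadViaStream(original, "title")));                 // prints false

        // 2. ISO-8859-1 has no mapping for these characters; each becomes '?'.
        System.out.println(loadViaStream(recode(original, StandardCharsets.ISO_8859_1), "title")); // prints ???
    }
}
```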

          I've implemented a fix which reads UTF8 encoded properties files if the new Properties.load(Reader) method is available (JDK1.6 and above) and it works perfectly. Here's the fixed version of MessagesSourceImpl.readProperties() - I think that the CHARSET constant ("UTF-8") might be a candidate for a symbol so it can be changed, but since we're using UTF8 I'm not bothered.

          /**
           * Creates and returns a new map that contains properties read from the properties file.
           */
          private Map<String, String> readProperties(Resource resource)
          {
              if (!resource.exists()) return emptyMap;

              tracker.add(resource.toURL());

              Map<String, String> result = CollectionFactory.newCaseInsensitiveMap();

              Properties p = new Properties();
              InputStream is = null;

              try
              {
                  is = resource.openStream();

                  try
                  {
                      // Use the Reader-based loader (JDK 1.6 and above), looked up via reflection.
                      Method newLoader = Properties.class.getMethod("load", Reader.class);
                      Reader propReader = new BufferedReader(new InputStreamReader(is, CHARSET));
                      newLoader.invoke(p, propReader);
                  }
                  catch (NoSuchMethodException e)
                  {
                      // Use the old stream loader before JDK 1.6 (properties files must be ISO-8859-1 encoded).
                      p.load(is);
                  }

                  is.close();

                  is = null;
              }
              catch (Exception ex)
              {
                  throw new RuntimeException(ServicesMessages.failureReadingMessages(resource, ex), ex);
              }
              finally
              {
                  InternalUtils.close(is);
              }

              for (Map.Entry e : p.entrySet())
              {
                  String key = e.getKey().toString();
                  String value = p.getProperty(key);
                  result.put(key, value);
              }

              return result;
          }
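
          The method above depends on Tapestry internals (Resource, tracker, CollectionFactory, ServicesMessages, InternalUtils), so it cannot be run on its own. The following is a standalone sketch of the same reflective fallback, with a hypothetical class name and a simplified error message; it is an illustration of the technique, not the code that was committed.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical standalone class, not part of Tapestry.
class ReflectiveUtf8Loader {

    private static final String CHARSET = "UTF-8";

    static Properties load(InputStream in) {
        Properties p = new Properties();
        try {
            try {
                // Properties.load(Reader) exists from Java 1.6 onwards; look it up reflectively
                // so the class still links on older runtimes.
                Method readerLoad = Properties.class.getMethod("load", Reader.class);
                readerLoad.invoke(p, new BufferedReader(new InputStreamReader(in, CHARSET)));
            } catch (NoSuchMethodException e) {
                // Older runtime: fall back to the ISO-8859-1 stream loader.
                p.load(in);
            }
        } catch (InvocationTargetException e) {
            // Unwrap so the caller sees the real failure from load(Reader).
            throw new RuntimeException("Failure reading properties", e.getCause());
        } catch (IOException | IllegalAccessException e) {
            throw new RuntimeException("Failure reading properties", e);
        }
        return p;
    }

    public static void main(String[] args) {
        byte[] utf8 = "season=\u00e9t\u00e9".getBytes(StandardCharsets.UTF_8);
        Properties p = load(new ByteArrayInputStream(utf8));
        System.out.println("\u00e9t\u00e9".equals(p.getProperty("season"))); // prints true on Java 1.6+
    }
}
```

          On a codebase that no longer needs to run on Java 1.5, the reflection can be dropped and load(Reader) called directly.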

          Andy Blower made changes -
          Status Closed [ 6 ] Reopened [ 4 ]
          Resolution Fixed [ 1 ]
          Howard M. Lewis Ship made changes -
          Resolution Fixed [ 1 ]
          Status Reopened [ 4 ] Closed [ 6 ]
          Mark Thomas made changes -
          Workflow jira [ 12435484 ] Default workflow, editable Closed status [ 12568769 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12568769 ] jira [ 12591788 ]
          Jochen Kemnade made changes -
          Link This issue is related to TAP5-2028 [ TAP5-2028 ]
          Transition            Time In Source Status   Execution Times   Last Executor          Last Execution Date
          Open -> Closed        2d 9h 29m               1                 Howard M. Lewis Ship   19/Jul/08 18:50
          Closed -> Reopened    2d 22h 1m               1                 Andy Blower            22/Jul/08 16:52
          Reopened -> Closed    15d 1h 21m              1                 Howard M. Lewis Ship   06/Aug/08 18:14

            People

            • Assignee:
              Howard M. Lewis Ship
              Reporter:
              Andy Blower
            • Votes:
              1
              Watchers:
              0
