DAFFODIL-2561

Fix uses of getBytes without an encoding specified

Details

    • Type: Bug
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Clean Ups
    • Labels: None

    Description

      Comment from interran in a pull request:

      I reviewed how we call getBytes in Daffodil to check for inconsistencies and best practices. I noticed two things: 1) we call getBytes("ascii") in every other place where we want bytes from ASCII characters; and 2) we call getBytes without a charset name too many times. Java's platform default charset is specific to the user and OS. On many modern Linux systems, it's UTF-8. On Macs, it has historically been MacRoman. On Windows in the US and Western Europe, it's often CP1252, in Central Europe it's often CP1250, and in China it's often a simplified Chinese encoding (a GB*). I'm agnostic whether we use "ascii", "US-ASCII", or import java.nio.charset.StandardCharsets and use StandardCharsets.US_ASCII (I see Daffodil most often uses all-lowercase strings, to save space and typing), but we probably should create a bug to replace all parameter-less getBytes calls with getBytes("utf-8").
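To illustrate why the parameter-less call is fragile, here is a minimal Java sketch (not taken from the Daffodil source; the class name GetBytesDemo and the helper encode are hypothetical). It encodes the same string under three explicit charsets and shows that the resulting bytes differ, which is the variation a platform-default call would silently pick up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Daffodil.
class GetBytesDemo {
    // Encode with an explicit charset so the result does not depend
    // on the platform default charset.
    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // the last character is outside ASCII

        // UTF-8 uses two bytes for the last character.
        System.out.println(Arrays.toString(encode(text, StandardCharsets.UTF_8)));
        // ISO-8859-1 uses one byte for it.
        System.out.println(Arrays.toString(encode(text, StandardCharsets.ISO_8859_1)));
        // US-ASCII cannot represent it; getBytes substitutes '?' (63).
        System.out.println(Arrays.toString(encode(text, StandardCharsets.US_ASCII)));

        // What a parameter-less getBytes() would have used on this machine.
        System.out.println("platform default: " + Charset.defaultCharset());
    }
}
```

One reason to prefer the StandardCharsets constants over a charset-name string: getBytes(Charset) does not declare a checked exception, whereas getBytes(String) declares UnsupportedEncodingException. For pure-ASCII test data all three charsets above produce identical bytes, so the choice mostly matters for consistency.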

      I think most/all of our uses of getBytes that don't provide an encoding are in tests. But even if it doesn't affect the Daffodil source, it does make our tests fragile to a user's encoding, and we are not consistent at all. We should fix this so all uses provide an encoding, and our encodings are consistent.

      Additionally, the String class has a constructor that accepts a byte array and an optional encoding. The same issue occurs if one does not provide an encoding. We should find all uses of this constructor and ensure they use an encoding.
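The decoding side can be sketched the same way. This is an illustrative Java example only (the class name DecodeDemo and the helper decode are hypothetical, not Daffodil code): it decodes one byte array with two explicit charsets to show that the wrong charset yields a different string, which is what can happen when new String(bytes) falls back to the platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Daffodil.
class DecodeDemo {
    // Decode with an explicit charset instead of relying on new String(bytes).
    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of "caf" followed by U+00E9 (two bytes: 0xC3 0xA9).
        byte[] utf8 = {99, 97, 102, (byte) 0xC3, (byte) 0xA9};

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        // Decoding with the charset used for encoding round-trips the text.
        System.out.println(right.equals("caf\u00e9"));
        // Decoding as ISO-8859-1 turns the two-byte sequence into two characters.
        System.out.println(wrong.length());
        System.out.println(wrong.equals("caf\u00c3\u00a9"));
    }
}
```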

          People

            Assignee: Unassigned
            Reporter: Steve Lawrence (slawrence)
            Votes: 0
            Watchers: 2