  1. UIMA
  2. UIMA-1782

Encoding of text files during import should be configurable


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.3.1SDK
    • Component/s: CAS Editor
    • Labels: None

    Description

      When importing text files into a corpus, there seems to be no way to control the encoding used. The import appears to use the default platform encoding (Latin 1 on Western Windows systems). The Eclipse default encoding settings for text files do not seem to affect the import encoding. As a result, UTF-8 documents containing international characters cannot be imported correctly.
      Ideally, the encoding should be selectable from a drop-down field in the import wizard.
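
      To illustrate the request, here is a minimal sketch of an import step that decodes with a caller-supplied charset instead of the platform default. The class and method names (EncodingAwareImport, readDocument, selectableEncodings) are hypothetical and are not taken from the CAS Editor code base. Only standard java.nio calls are used.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

// Hypothetical helper, not part of the CAS Editor: shows decoding with an
// explicitly chosen charset rather than Charset.defaultCharset().
public class EncodingAwareImport {

    // Reads the whole file and decodes it with the charset the user selected.
    static String readDocument(Path file, Charset charset) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        return new String(raw, charset);
    }

    // Charset names that a drop-down in an import wizard could offer (assumption).
    static Set<String> selectableEncodings() {
        return Charset.availableCharsets().keySet();
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file containing a non-ASCII character (o-umlaut).
        Path tmp = Files.createTempFile("import-sample", ".txt");
        Files.write(tmp, "J\u00f6rn".getBytes(StandardCharsets.UTF_8));

        // Decoding with the matching charset preserves the text.
        String utf8 = readDocument(tmp, StandardCharsets.UTF_8);
        // Decoding the same bytes as Latin 1 garbles the two-byte character.
        String latin1 = readDocument(tmp, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decode matches: " + utf8.equals("J\u00f6rn"));
        System.out.println("Latin 1 decode matches: " + latin1.equals("J\u00f6rn"));
        System.out.println("Platform default: " + Charset.defaultCharset());

        Files.delete(tmp);
    }
}
```

      Passing the charset as a parameter keeps the decoding step independent of the platform, so the same file imports identically on Windows and Linux.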

          People

            Assignee: Jörn Kottmann (joern)
            Reporter: Thomas Hampp (thomas.hampp@de.ibm.com)

            Dates

              Created:
              Updated:
              Resolved: