  1. Apache Gobblin
  2. GOBBLIN-514

AvroUtils#parseSchemaFromFile fails when characters are written with Modified UTF-8 encoding


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      Schema.Parser#parse(InputStream) decodes the stream's bytes as standard UTF-8 and fails when the data was written in modified UTF-8. Reading the schema file into a string first and then parsing that string into a schema should solve the problem.

      As part of https://github.com/apache/incubator-gobblin/pull/2355, the schema is created from Hive columns and written to disk using modified UTF-8. When such a file is read with Schema.Parser#parse(InputStream), parsing fails.
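
      The mismatch can be illustrated with a minimal JDK-only sketch. It assumes the schema file was produced by something equivalent to DataOutput#writeUTF, which the description does not state explicitly. The class and method names (ModifiedUtf8Demo, writeModifiedUtf8, readModifiedUtf8) are hypothetical and are not part of Gobblin or Avro.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ModifiedUtf8Demo {

  // Assumption: the writer behaves like DataOutput#writeUTF, which emits a
  // 2-byte length prefix followed by the characters in modified UTF-8.
  static byte[] writeModifiedUtf8(String text) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      out.writeUTF(text);
    }
    return buffer.toByteArray();
  }

  // Decodes bytes produced by writeModifiedUtf8 back into a String.
  static String readModifiedUtf8(byte[] bytes) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      return in.readUTF();
    }
  }

  public static void main(String[] args) throws IOException {
    // A schema-like JSON string whose doc field holds a supplementary character.
    String schemaJson =
        "{\"type\":\"record\",\"name\":\"Example\",\"doc\":\"\uD83D\uDE00\",\"fields\":[]}";

    byte[] standard = schemaJson.getBytes(StandardCharsets.UTF_8);
    byte[] modified = writeModifiedUtf8(schemaJson);

    // The two byte sequences differ: modified UTF-8 adds a length prefix and
    // encodes each surrogate of the supplementary character separately.
    System.out.println("standard UTF-8 length: " + standard.length);
    System.out.println("modified UTF-8 length: " + modified.length);

    // Reading with the matching decoder recovers the original text, which
    // could then be handed to a string-based parser such as Avro's
    // Schema.Parser#parse(String) instead of the InputStream overload.
    String roundTripped = readModifiedUtf8(modified);
    System.out.println("round trip equal: " + roundTripped.equals(schemaJson));
  }
}
```

      The sketch only shows why a byte-level parser that expects standard UTF-8 sees different input from what the writer intended; it is not the change that was merged for this issue.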

            People

              Assignee: Aditya Sharma
              Reporter: Aditya Sharma
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: