Pig / PIG-63

PigStorage does not properly handle UTF8 data

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None

    Description

      From Ben:

      I just checked the code and the problem seems to be in PigStorage. getNext() uses
      readLine(), which does not handle UTF8 correctly. putNext() also uses the default encoder rather than specifying UTF8 explicitly.

      Internally and in BinStorage UTF8 appears to be handled correctly.
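
      To illustrate the kind of fix described above, here is a minimal Java sketch of reading and writing lines with an explicit UTF-8 charset instead of the platform default. It is not taken from the attached patches: the class Utf8LineIo and its methods readLines and writeLine are hypothetical names used only for illustration, and the real PigStorage code may be structured quite differently.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Pig.
public class Utf8LineIo {

    // Read all lines, decoding the bytes as UTF-8 regardless of the
    // platform default charset.
    public static List<String> readLines(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    // Write one line, encoding it explicitly as UTF-8 rather than relying
    // on the default encoder.
    public static void writeLine(OutputStream out, String line) throws IOException {
        out.write(line.getBytes(StandardCharsets.UTF_8));
        out.write('\n');
    }

    public static void main(String[] args) throws IOException {
        String original = "caf\u00e9\t\u65e5\u672c\u8a9e";
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeLine(buf, original);
        List<String> back = readLines(new ByteArrayInputStream(buf.toByteArray()));
        // The round trip preserves the non-ASCII characters.
        System.out.println(back.get(0).equals(original)); // prints "true"
    }
}
```

      Naming the charset on both the read and the write path means the same bytes are produced and consumed on every machine, whatever its locale settings are.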

      Attachments

        1. utf8.patch (3 kB, Benjamin Reed)
        2. utf8.patch (7 kB, Benjamin Reed)
        3. utf8.patch (9 kB, Benjamin Reed)
        4. utf8_v4.patch (9 kB, Olga Natkovich)
        5. utf8_v5.patch (9 kB, Olga Natkovich)
        6. utf8test.patch (0.8 kB, Benjamin Reed)

          People

            Assignee: Benjamin Reed (breed)
            Reporter: Olga Natkovich (olgan)
            Votes: 0
            Watchers: 0
