PIG-4491: Streaming Python Bytearray Bugs

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.1, 0.13.1, 0.14.1
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      While using a streaming Python UDF that returned a bytearray, we hit two bugs.

      The first was:

      org.apache.pig.impl.streaming.StreamingUDFException: LINE : UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)

      and the second (after fixing the first) was a null pointer exception.

      I traced the problem to two issues:

      1. In the Python controller, the output from the UDF was being logged as a unicode string, which can fail for bytearrays that contain non-ASCII bytes.

      2. Newlines at the start of a response's data weren't being handled properly on the Java side.
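
For illustration, here is a minimal Python 3 sketch of both failure modes. It is not the code from PIG-4491.patch: the function names and the `|_\n` end-of-record marker are assumptions made for this example, and the explicit `decode("ascii")` stands in for the implicit ASCII decode that Python 2 performs when a byte string is mixed into a unicode log message.

```python
# Illustrative sketch only; not the code from PIG-4491.patch.
# All names below are hypothetical.


def log_output_unsafe(raw: bytes) -> str:
    """Build a log message by decoding the UDF output as ASCII.

    This mimics the implicit decode Python 2 applies when a byte string is
    combined with a unicode string. It raises UnicodeDecodeError for any
    byte >= 0x80.
    """
    return "UDF output: " + raw.decode("ascii")


def log_output_safe(raw: bytes) -> str:
    """Build a log message from repr(raw), which never needs decoding."""
    return "UDF output: " + repr(raw)


# Hypothetical end-of-record marker, assumed for this sketch.
END_RECORD = b"|_\n"


def read_records_by_line(stream: bytes) -> list:
    """Naive reader: treats every newline as a record boundary.

    A payload that begins with a newline loses that newline, and the
    marker text is left attached to the payload.
    """
    return [line for line in stream.split(b"\n") if line]


def read_records(stream: bytes) -> list:
    """Split on the explicit end-of-record marker instead of on lines.

    Newlines inside, or at the start of, a payload are preserved.
    """
    records = stream.split(END_RECORD)
    # A well-formed stream ends with the marker, leaving one empty tail.
    if records and records[-1] == b"":
        records.pop()
    return records
```

With these definitions, `log_output_unsafe(b"\xe6")` raises the same `UnicodeDecodeError` quoted above, while `log_output_safe(b"\xe6")` succeeds. `read_records(b"\nabc|_\n")` keeps the leading newline in the payload, whereas `read_records_by_line` drops it.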

      I'm attaching a patch with tests that fixes these two issues.

      Attachments

        1. PIG-4491.patch
          6 kB
          Jeremy Karn

          People

            Assignee: Jeremy Karn (jeremykarn)
            Reporter: Jeremy Karn (jeremykarn)
            Votes: 0
            Watchers: 3