  1. Oozie
  2. OOZIE-2819

Make Oozie REST API accept multibyte characters for script Actions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0.0b1
    • Component/s: None
    • Labels: None

      Description

      A Pig action submitted through the REST API (proxy submission, with client-side XML) failed when the script contained multibyte characters.

      curl -i -X POST -d @/tmp/pig.xml -H 'Content-Type: application/XML; charset=UTF-8' 'http://localhost:11000/oozie/v1/jobs?jobtype=pig&action=start'
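
For reference, the same request can be built programmatically. This is a minimal sketch in Python (standard library only), assuming the host, port, and path from the curl command above. The helper name `build_submit_request` is hypothetical. The body is encoded explicitly as UTF-8 so that it matches the declared charset. One difference from the curl call: `-d @file` strips newlines from the file, while this sketch sends the text unchanged.

```python
import urllib.request

# URL and Content-Type copied from the curl command in this report.
OOZIE_URL = "http://localhost:11000/oozie/v1/jobs?jobtype=pig&action=start"


def build_submit_request(xml_text, url=OOZIE_URL):
    """Build (but do not send) the POST request for a proxy submission.

    Hypothetical helper for illustration; it is not part of Oozie.
    """
    body = xml_text.encode("utf-8")  # bytes must match the declared charset
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/XML; charset=UTF-8"},
    )


if __name__ == "__main__":
    req = build_submit_request("<configuration/>")
    print(req.get_method(), req.full_url)
    # To actually submit against a running server:
    # urllib.request.urlopen(req)
```

Sending the request requires a running Oozie server; the sketch only shows how the body and header are put together.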
      

      where the input file and the submitted configuration are:

      $ hdfs dfs -cat /tmp/encoding/input.txt
      松
      林檎
      松
      
      $ cat /tmp/pig.xml 
      <configuration>
      <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:8020/</value>
      </property>
      <property>
      <name>mapred.job.tracker</name>
      <value>localhost:8032</value>
      </property>
      <property>
      <name>user.name</name>
      <value>hdfs</value>
      </property>
      <property>
      <name>oozie.pig.script</name>
      <value><![CDATA[
      lines = LOAD 'hdfs:///tmp/encoding/input.txt' USING PigStorage('\n') AS line;
      test = FILTER lines BY line == '松';
      STORE test INTO 'hdfs:///tmp/encoding/output' USING PigStorage('\n');
      ]]></value>
      </property>
      <property>
      <name>oozie.pig.script.params.size</name>
      <value>0</value>
      </property>
      <property>
      <name>oozie.pig.script.options.size</name>
      <value>0</value>
      </property>
      <property>
      <name>oozie.libpath</name>
      <value>hdfs:///user/oozie/share/lib</value>
      </property>
      <property>
      <name>oozie.use.system.libpath</name>
      <value>true</value>
      </property>
      <property>
      <name>oozie.proxysubmission</name>
      <value>true</value>
      </property>
      </configuration>
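
A configuration of this shape can also be generated rather than written by hand. The following is a rough sketch in Python using `xml.etree.ElementTree`; the function name `build_configuration` is hypothetical. ElementTree does not emit CDATA sections, so the script text is XML-escaped instead, which a conforming parser reads back as the same string.

```python
import xml.etree.ElementTree as ET


def build_configuration(properties):
    """Serialize name/value pairs as a <configuration> document in UTF-8.

    Hypothetical helper for illustration; it is not part of Oozie.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # encoding="utf-8" returns bytes with multibyte characters written as
    # UTF-8 rather than as numeric character references.
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    xml_bytes = build_configuration({
        "user.name": "hdfs",
        "oozie.pig.script": "test = FILTER lines BY line == '松';",
        "oozie.proxysubmission": "true",
    })
    print(xml_bytes.decode("utf-8"))
```

Only three of the properties from the report are shown; the remaining ones would be added to the dictionary in the same way.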
      

      In the Oozie launcher log, I could see

      lines = LOAD 'hdfs:///tmp/encoding/input.txt' USING PigStorage('\n') AS line;test = FILTER lines BY line == '~';STORE test INTO 'hdfs:///tmp/encoding/output' USING PigStorage('\n');
      

      was used, with '~' in place of the intended 松.
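
The '~' is consistent with each character being cut down to its low byte: 松 is U+677E, and 0x7E is '~'. The Python sketch below reproduces that symptom. It assumes, without confirmation from this report, that the script was written somewhere by a routine that discards the high eight bits of each character (Java's `DataOutputStream.writeBytes` behaves this way). The function name `truncate_to_low_byte` is hypothetical.

```python
def truncate_to_low_byte(text):
    """Keep only the low 8 bits of each UTF-16 code unit.

    Illustrates the suspected failure mode; hypothetical helper, not Oozie code.
    """
    units = text.encode("utf-16-be")  # two bytes per code unit: high, low
    return units[1::2].decode("latin-1")  # every second byte is the low byte


if __name__ == "__main__":
    fragment = "line == '松';"
    print(truncate_to_low_byte(fragment))  # line == '~';
    # Encoding the whole string as UTF-8 preserves the character instead.
    print("松".encode("utf-8"))  # b'\xe6\x9d\xbe'
    assert fragment.encode("utf-8").decode("utf-8") == fragment
```

ASCII characters survive the truncation unchanged, which would explain why the rest of the script looks correct in the log.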

        Attachments

        1. OOZIE-2819-00.patch
          0.9 kB
          Attila Sasvari
        2. OOZIE-2819-01.patch
          5 kB
          Attila Sasvari
        3. OOZIE-2819-02.patch
          5 kB
          Attila Sasvari
        4. OOZIE-2819-03.patch
          6 kB
          Attila Sasvari
        5. OOZIE-2819-03-amendment.patch
          1 kB
          Attila Sasvari

            People

            • Assignee: Attila Sasvari (asasvari)
            • Reporter: Attila Sasvari (asasvari)