My old Pig script, which loads and parses XML files, ran successfully on Pig 0.13 but fails on Pig 0.14 with java.lang.IndexOutOfBoundsException: start 4, end 2, s.length() 2.
Of my 10 XML files, 2 run fine and the remaining 8 fail. All of these files ran successfully on Pig 0.13. Perhaps the new version added stricter validation of XML well-formedness.
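One way to test the well-formedness hypothesis would be to parse local copies of the files (for example after `hdfs dfs -get`) with a standard XML parser. The following is only a minimal sketch and is not part of the failing job; `is_well_formed` and `check_files` are hypothetical helper names, and it assumes each file fits in memory.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as exc:
        return False, str(exc)


def check_files(paths):
    """Map each local file path to its (ok, error) well-formedness result."""
    results = {}
    for path in paths:
        # Read bytes so the parser can honor the encoding in the XML declaration.
        with open(path, "rb") as handle:
            results[path] = is_well_formed(handle.read())
    return results
```

If all 10 files pass such a check, malformed input becomes an unlikely cause and the loader change itself is the more probable one.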
My Code:
REGISTER '/usr/hdp/current/pig-client/lib/piggybank.jar';
C = LOAD '/common/data/dia/stepxml/*' using'Product') as (x:chararray);
STORE C into '/common/data/dia/intermediate_xmls/Imn_Unique_both2';
2015-06-30 13:12:28,409 FATAL [IPC Server handler 3 on 34318] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1434729076270_34899_m_000015_0 - exited : java.lang.IndexOutOfBoundsException: start 4, end 2, s.length() 2
at java.lang.AbstractStringBuilder.append(
at java.lang.StringBuffer.append(
Failed to read data from "/common/data/dia/stepxml/*"
Failed to produce result in "/common/data/dia/intermediate_xmls/Imn_Unique_both2"
Issue Links
- is broken by PIG-3865: Remodel the XMLLoader to work to be faster and more maintainable (Closed)