Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.14.0
- None
- Reviewed
Description
For a data set containing 9 lines, the following aggregated warning message is displayed:
2016-09-01 19:40:33,664 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning UDF_WARNING_1 6 time(s).
but in the container logs I see a separate log message "Cannot extract group for input" for every non-matching value:
2016-09-01 19:40:28,115 INFO [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map: Aliases being processed per job phase (AliasName[line,offset]): M : b[10,4],b[-1,-1],extract_fields[17,17] C: R:
2016-09-01 19:40:28,122 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v1=1&v3=9
2016-09-01 19:40:28,124 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v2=3&v3=7
2016-09-01 19:40:28,124 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v1=4&v3=6
2016-09-01 19:40:28,125 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v2=5&v3=5
2016-09-01 19:40:28,125 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v1=8&v3=2
2016-09-01 19:40:28,125 WARN [main] org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger: org.apache.pig.builtin.REGEX_EXTRACT(UDF_WARNING_1): RegexExtract : Cannot extract group for input /v3=9&v2=1
It does not log the warning messages in the task logs.
The patch for PIG-2207 was committed to Pig 0.13+.
In 0.12 we had a single counter for all UDF warnings, but in 0.13+ we have a separate counter and message for every unique warning log line.
The two lines below are unique:
/v2=3&v3=7
/v1=4&v3=6
That's why Pig prints both of them to the console.
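The difference between the two counting schemes can be illustrated with a small sketch. This is not Pig's code: the function names `single_counter` and `per_message_counters` are hypothetical, and the only data used are the two input values quoted above.

```python
from collections import Counter

# Two warning messages built from the input values quoted above.
messages = [
    "Cannot extract group for input /v2=3&v3=7",
    "Cannot extract group for input /v1=4&v3=6",
]

def single_counter(warnings):
    """0.12-style sketch: every UDF warning increments the same counter."""
    return len(warnings)

def per_message_counters(warnings):
    """0.13+-style sketch: one counter per unique warning message."""
    return Counter(warnings)

total = single_counter(messages)
by_message = per_message_counters(messages)

print(total)            # one aggregated count
print(len(by_message))  # one counter (and one message) per unique line
```

Because the input value is part of the message text, each distinct input produces its own counter under the second scheme, so the number of reported messages grows with the number of distinct non-matching values.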
Printing a separate log message for every data line slows down the overall performance as well.
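One generic way to avoid a log line per record is to log only the first message for each warning code and count the rest. The sketch below is an assumption for illustration, not the fix applied in Pig: the class `AggregatingWarner` and its methods are hypothetical, and only the input values and the summary wording come from the logs above.

```python
import logging
from collections import Counter

class AggregatingWarner:
    """Hypothetical helper: logs the first message per warning code, counts the rest."""

    def __init__(self, logger):
        self.logger = logger
        self.counts = Counter()

    def warn(self, code, message):
        self.counts[code] += 1
        # Only the first occurrence of a code is written to the log.
        if self.counts[code] == 1:
            self.logger.warning("%s: %s", code, message)

    def summary(self):
        # Same wording as the aggregated console message quoted in the description.
        return [
            f"Encountered Warning {code} {count} time(s)."
            for code, count in self.counts.items()
        ]

# The six non-matching input values from the task log above.
inputs = ["/v1=1&v3=9", "/v2=3&v3=7", "/v1=4&v3=6",
          "/v2=5&v3=5", "/v1=8&v3=2", "/v3=9&v2=1"]

warner = AggregatingWarner(logging.getLogger("sketch"))
for value in inputs:
    warner.warn("UDF_WARNING_1", "Cannot extract group for input " + value)

print(warner.summary()[0])  # Encountered Warning UDF_WARNING_1 6 time(s).
```

Keying the counter on the warning code rather than on the full message keeps the number of log lines independent of the data.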
Attachments
Issue Links
- relates to PIG-2207 Support custom counters for aggregating warnings from different udfs (Resolved)