Description
Adding namedOutputs with AvroMultipleOutputs.addNamedOutput just adds them to a static map which is of course not available on the cluster during reduce execution.
The unit tests pass though since the Instance of AvroMultipleOutputs is the same in the Reducer as in the Job's main class, so the added schemas there are present.
Fix would be to add the namedOutput schemas to the job configuration so they can be parsed in the reducers. Example patch for the new mapreduce api is attached, but I suspect the problem is present in the mapred api also. What is the general approach for this? Fix both?
Cheers,
Johannes
Attachments
Attachments
Issue Links
- duplicates
-
AVRO-1215 AvroMultipleOutputs not working when specifying baseOutputPath
- Closed