Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
While adding Parquet support to Crunch, I ran into the issue that Parquet always assumes the record it returns is on the "value" side of the key-value pair of an InputFormat/OutputFormat. Crunch, for semi-sensible historical reasons, makes this position depend on the PTypeFamily: Avro PTypes write to the key, and Writable PTypes write to the value. Since the Parquet InputFormat/OutputFormat treat both kinds of PType the same way, we need a way for Source and Target implementations to override the default configuration of the PTypes and choose the right side for the given format.
The patch updates the Source and Target interfaces so that they can override the Converter used by the PType as appropriate.
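
To make the idea concrete, here is a minimal, self-contained sketch of the override. It does not use Crunch's actual classes: `PairConverter`, `SketchType`, `SketchSource`, `PlainSource`, and `ParquetLikeSource` are hypothetical names invented for illustration, and the real Converter, Source, and Target interfaces may look quite different. It only shows the shape of the change, assuming a source falls back to the type's default converter unless the format needs a specific side.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical stand-in for a converter: maps a record to and from the
// (key, value) pair exchanged with an InputFormat/OutputFormat.
interface PairConverter<T> {
    Map.Entry<Object, Object> toPair(T record);
    T fromPair(Object key, Object value);
}

// Key-side convention (the description says Avro PTypes write to the key).
final class KeySideConverter<T> implements PairConverter<T> {
    public Map.Entry<Object, Object> toPair(T record) {
        return new SimpleEntry<>(record, null);
    }
    @SuppressWarnings("unchecked")
    public T fromPair(Object key, Object value) {
        return (T) key;
    }
}

// Value-side convention (Writable PTypes, and what Parquet always expects).
final class ValueSideConverter<T> implements PairConverter<T> {
    public Map.Entry<Object, Object> toPair(T record) {
        return new SimpleEntry<>(null, record);
    }
    @SuppressWarnings("unchecked")
    public T fromPair(Object key, Object value) {
        return (T) value;
    }
}

// Hypothetical type descriptor that carries a default converter.
final class SketchType<T> {
    private final PairConverter<T> defaultConverter;
    SketchType(PairConverter<T> defaultConverter) {
        this.defaultConverter = defaultConverter;
    }
    PairConverter<T> defaultConverter() {
        return defaultConverter;
    }
}

// Hypothetical source: uses the type's default converter unless overridden.
interface SketchSource<T> {
    SketchType<T> type();
    default PairConverter<T> converter() {
        return type().defaultConverter();
    }
}

// A source with no format-specific needs keeps the type's default side.
final class PlainSource<T> implements SketchSource<T> {
    private final SketchType<T> type;
    PlainSource(SketchType<T> type) { this.type = type; }
    public SketchType<T> type() { return type; }
}

// A Parquet-like source ignores the type's default and always uses the value side.
final class ParquetLikeSource<T> implements SketchSource<T> {
    private final SketchType<T> type;
    ParquetLikeSource(SketchType<T> type) { this.type = type; }
    public SketchType<T> type() { return type; }
    @Override
    public PairConverter<T> converter() {
        return new ValueSideConverter<>();
    }
}
```

With a key-side `SketchType`, `PlainSource` places the record in the key, while `ParquetLikeSource` places it in the value regardless of the type's default. A target-side override would presumably mirror this on the write path.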