Description
Currently the projection schema is set based on the ptype:
private static <S> FormatBundle<AvroParquetInputFormat> getBundle(AvroType<S> ptype) {
  return FormatBundle.forInput(AvroParquetInputFormat.class)
      .set(AvroReadSupport.AVRO_REQUESTED_PROJECTION, ptype.getSchema().toString())
      // ParquetRecordReader expects ParquetInputSplits, not FileSplits, so it
      // doesn't work with CombineFileInputFormat
      .set(RuntimeParameters.DISABLE_COMBINE_FILE, "true");
}
Sometimes a user wants to read only a subset of columns as a projection. We need a mechanism for supplying the desired projection schema, rather than always using the full schema from the ptype.
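One possible shape for this, as a rough sketch only: let the caller pass an optional projection schema and fall back to the ptype's schema when none is given. The example below models the bundle as a plain map so it runs without Crunch or Parquet on the classpath. The class name, method name, and the use of JSON strings in place of Avro Schema objects are hypothetical, and the two key strings are assumed stand-ins for AvroReadSupport.AVRO_REQUESTED_PROJECTION and RuntimeParameters.DISABLE_COMBINE_FILE, not verified values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models the FormatBundle settings as a plain map.
class ProjectionSketch {
    // Assumed stand-ins for AvroReadSupport.AVRO_REQUESTED_PROJECTION and
    // RuntimeParameters.DISABLE_COMBINE_FILE; check the real constants.
    static final String PROJECTION_KEY = "parquet.avro.projection";
    static final String DISABLE_COMBINE_KEY = "crunch.disable.combine.file";

    /**
     * Builds the settings the bundle would carry.
     *
     * @param fullSchemaJson       schema of the ptype, as a JSON string
     * @param projectionSchemaJson optional projection schema; null means
     *                             "read every column", as today
     */
    static Map<String, String> bundleConf(String fullSchemaJson,
                                          String projectionSchemaJson) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Prefer the caller-supplied projection, else keep current behavior.
        String requested = (projectionSchemaJson != null)
                ? projectionSchemaJson
                : fullSchemaJson;
        conf.put(PROJECTION_KEY, requested);
        // Same restriction as the existing code: combined splits stay disabled.
        conf.put(DISABLE_COMBINE_KEY, "true");
        return conf;
    }
}
```

Calling bundleConf(full, null) reproduces the current behavior, while bundleConf(full, projection) requests only the columns named in the projection. In the real source, the equivalent change would presumably be an extra schema argument on getBundle and on whatever constructor or factory calls it.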