Details
-
Improvement
-
Status: Patch Available
-
Major
-
Resolution: Unresolved
-
0.11, 0.12.0, 0.11.1, 0.12.1, 0.13.0
-
None
-
None
Description
Based on PIG-2361, I took the liberty of extending @Outputschema so that more flexible output schema can be defined through annotations. As a result, the repeating patterns of EvalFunc#outputSchema() can be eliminated from most of the UDFs.
Examples:
@OutputSchema("bytearray")
=> equivalent to:
@Override public Schema outputSchema(Schema input) { return new Schema(new Schema.FieldSchema(null, DataType.BYTEARRAY)); }
@OutputSchema("chararray")
@Unique
=> equivalent to:
@Override public Schema outputSchema(Schema input) { return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass().getName().toLowerCase(), input), DataType.CHARARRAY)); }
@OutputSchema(value = "dimensions:bag", useInputSchema = true)
=> equivalent to:
@Override public Schema outputSchema(Schema input) { return new Schema(new FieldSchema("dimensions", input, DataType.BAG)); }
@OutputSchema(value = "${0}:bag", useInputSchema = true) @Unique("${0}")
=> equivalent to:
@Override public Schema outputSchema(Schema input) { return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass().getName().toLowerCase(), input), input, DataType.BAG)); }
If useInputSchema attribute is set then input schema will be applied to the output schema, provided that:
- outputschema is "simple", i.e: [name][:type] or '()', '{}', '[]' and
- it has complex field type (tuple, bag, map)
@Unique : this annotation defines which fields should be unique in the schema
- if no parameters are provided, all fields will be unique
- otherwise it takes a string array of fields name
Unique field generation:
A unique field is generated in the same manner that EvalFunc#getSchemaName does.
- if field has an alias:
- it's a placeholder (${i}, i=0..n) : fieldName -> com_myfunc_[input_alias]_[nextSchemaId]
- otherwise: fieldName -> fieldName_[nextSchemaId]
- otherwise: com_myfunc_[input_alias]_[nextSchemaId]
Supported scripting UDFs: Python, Jython, Groovy, JRuby
Attachments
Attachments
Issue Links
- is related to
-
PIG-2361 Update builtin UDFs to use @OutputSchema
- Open
- links to