Details
- Bug
- Status: Patch Available
- Critical
- Resolution: Unresolved
- 1.6.0, 1.7.0, 1.8.0
Description
In Pig's EvalFunc (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/EvalFunc.java), a private field, inputSchemaInternal, holds the input schema. A setter and a getter are provided for it:
private Schema inputSchemaInternal=null;
...
/**
 * This method is for internal use. It is called by Pig core in both front-end
 * and back-end to setup the right input schema for EvalFunc
 */
public void setInputSchema(Schema input){
    this.inputSchemaInternal=input;
}

/**
 * This method is intended to be called by the user in {@link EvalFunc} to get the input
 * schema of the EvalFunc
 */
public Schema getInputSchema(){
    return this.inputSchemaInternal;
}
In parquet-mr/parquet-pig/src/main/java/parquet/pig/summary/Summary.java, class Summary extends EvalFunc. It declares its own field, inputSchema (as opposed to inputSchemaInternal in Pig's EvalFunc), to hold the schema and overrides setInputSchema() to set it, but it does not override getInputSchema() to return inputSchema:
public class Summary extends EvalFunc<String> implements Algebraic {
    ...
    private Schema inputSchema;
    ...
    @Override
    public void setInputSchema(Schema input) {
        try {
            // relation.bag.tuple
            this.inputSchema=input.getField(0).schema.getField(0).schema;
            saveSchemaToUDFContext();
        } catch (FrontendException e) {
            throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from " + input, e);
        } catch (RuntimeException e) {
            throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from "+input, e);
        }
    }
If Summary.setInputSchema() is called, only Summary.inputSchema is set. A subsequent call to getInputSchema() resolves to the inherited EvalFunc implementation and returns inputSchemaInternal, which may still be null.
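The mismatch can be reproduced without Pig or Parquet on the classpath. The sketch below is only an illustration: Schema and EvalFuncLike are simplified stand-ins for the real Pig classes, and ShadowingSummary, ConsistentSummary and SchemaShadowingDemo are hypothetical names. ConsistentSummary shows one possible remedy (also overriding the getter); it is not necessarily what the attached patch does.

```java
// Simplified stand-in for Pig's Schema (assumption, not the real class).
final class Schema {
    final String description;
    Schema(String description) { this.description = description; }
}

// Simplified stand-in for Pig's EvalFunc: private field plus setter and getter.
class EvalFuncLike {
    private Schema inputSchemaInternal = null;

    public void setInputSchema(Schema input) { this.inputSchemaInternal = input; }

    public Schema getInputSchema() { return this.inputSchemaInternal; }
}

// Mirrors the reported pattern: the setter is overridden, the getter is not.
class ShadowingSummary extends EvalFuncLike {
    private Schema inputSchema;

    @Override
    public void setInputSchema(Schema input) {
        // Only the subclass field is written; the base class field stays null.
        this.inputSchema = input;
    }
}

// One possible remedy (hypothetical): override the getter as well,
// so that setter and getter refer to the same field.
class ConsistentSummary extends EvalFuncLike {
    private Schema inputSchema;

    @Override
    public void setInputSchema(Schema input) { this.inputSchema = input; }

    @Override
    public Schema getInputSchema() { return this.inputSchema; }
}

class SchemaShadowingDemo {
    public static void main(String[] args) {
        Schema schema = new Schema("a:int");

        ShadowingSummary broken = new ShadowingSummary();
        broken.setInputSchema(schema);
        // The inherited getter reads the untouched base class field.
        System.out.println(broken.getInputSchema() == null); // prints "true"

        ConsistentSummary fixed = new ConsistentSummary();
        fixed.setInputSchema(schema);
        System.out.println(fixed.getInputSchema() == schema); // prints "true"
    }
}
```

An alternative with the same effect on the getter would be to call super.setInputSchema(input) from the overriding setter; which of the two fits Summary better depends on whether callers expect the original input schema or the nested tuple schema.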
Issue Links
- blocks PARQUET-334: UT TestSummary failed with "java.lang.RuntimeException: Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from null" when Pig >=0.15 (Resolved)