Parquet / PARQUET-365

Class Summary does not provide a getter to return inputSchema


    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 1.6.0, 1.7.0, 1.8.0
    • Fix Version/s: 1.8.0
    • Component/s: parquet-mr
    • Labels:

      Description

      In Pig's EvalFunc (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/EvalFunc.java), a private member "inputSchemaInternal" holds the input schema. A setter and a getter are also provided:

      private Schema inputSchemaInternal=null;

      /**
       * This method is for internal use. It is called by Pig core in both front-end
       * and back-end to setup the right input schema for EvalFunc
       */
      public void setInputSchema(Schema input){
          this.inputSchemaInternal=input;
      }

      /**
       * This method is intended to be called by the user in {@link EvalFunc} to get the input
       * schema of the EvalFunc
       */
      public Schema getInputSchema(){
          return this.inputSchemaInternal;
      }
      

      In parquet-mr/parquet-pig/src/main/java/parquet/pig/summary/Summary.java, class Summary extends EvalFunc. It declares its own member, inputSchema (as opposed to inputSchemaInternal in Pig's EvalFunc), to hold the schema and overrides setInputSchema() to populate it, but it does not override getInputSchema() to return inputSchema.

      public class Summary extends EvalFunc<String> implements Algebraic {

        private Schema inputSchema;

        @Override
        public void setInputSchema(Schema input) {
          try {
            // relation.bag.tuple
            this.inputSchema=input.getField(0).schema.getField(0).schema;
            saveSchemaToUDFContext();
          } catch (FrontendException e) {
            throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from " + input, e);
          } catch (RuntimeException e) {
            throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from "+input, e);
          }
        }
      

      If setInputSchema() of class Summary is called, inputSchema is set, but the inherited inputSchemaInternal is not. A later call to getInputSchema() therefore returns inputSchemaInternal, which may still be null.
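
      The behavior described above can be reproduced without Pig on the classpath. The following is a minimal sketch, not the actual parquet-mr code or the attached patch: BaseFunc, SummaryAsReported and SummaryWithGetter are hypothetical stand-ins, and a String stands in for Pig's Schema class. It shows the getter returning null when only the setter is overridden, and one possible fix that also overrides the getter.

```java
// Stand-in types for illustration only: these are NOT the real Pig or Parquet classes,
// and String stands in for Pig's Schema class.
public class InputSchemaShadowingDemo {

    /** Hypothetical stand-in for EvalFunc, reduced to the two schema accessors. */
    static class BaseFunc {
        private String inputSchemaInternal = null;

        public void setInputSchema(String input) {
            this.inputSchemaInternal = input;
        }

        public String getInputSchema() {
            return this.inputSchemaInternal;
        }
    }

    /** Mirrors the reported pattern: the setter is overridden, the getter is not. */
    static class SummaryAsReported extends BaseFunc {
        private String inputSchema;

        @Override
        public void setInputSchema(String input) {
            // Only the subclass field is written; the base-class field stays null.
            this.inputSchema = input;
        }
    }

    /** One possible fix (an assumption): override the getter to read the subclass field. */
    static class SummaryWithGetter extends BaseFunc {
        private String inputSchema;

        @Override
        public void setInputSchema(String input) {
            this.inputSchema = input;
        }

        @Override
        public String getInputSchema() {
            return this.inputSchema;
        }
    }

    public static void main(String[] args) {
        BaseFunc reported = new SummaryAsReported();
        reported.setInputSchema("a:int");
        System.out.println(reported.getInputSchema()); // prints "null"

        BaseFunc fixed = new SummaryWithGetter();
        fixed.setInputSchema("a:int");
        System.out.println(fixed.getInputSchema()); // prints "a:int"
    }
}
```

      An alternative would be for the overriding setter to also call super.setInputSchema(input), so that the inherited getter returns the original input schema. Note that in Summary the two would differ: the overridden setter stores the nested relation.bag.tuple schema rather than the schema it was passed, so which value getInputSchema() should return is a design decision for the maintainers.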


              People

              • Assignee:
                Unassigned
                Reporter:
                xiangli Xiang Li
              • Votes:
                0
                Watchers:
                4
