Apache Drill
  DRILL-7435

Project operator incorrectly adds a LATE type to union vector



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      Run Drill with a fix for DRILL-7434 applied. Another test then fails: TestJsonReader.testTypeCase() fails when it tries to set the value count. Evidently the Project operator has added the LATE type to the Union vector. However, no vector type is associated with the LATE type, so an attempt to get the member of this type throws an exception.

      The simple workaround is to special-case this type when setting the value count. The longer-term fix is to not add the LATE type to a union vector.
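
      As a rough illustration of the workaround, the sketch below skips LATE while propagating the value count to union members. All of the names here (SketchMinorType, SketchMemberVector, SketchUnionVector) are hypothetical stand-ins written for this example; they are not Drill's UnionVector or TypeProtos classes, and the real code may be structured differently.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the minor types involved; not Drill's TypeProtos.MinorType.
enum SketchMinorType { BIGINT, LIST, MAP, LATE }

// Hypothetical stand-in for a member vector that only tracks its value count.
class SketchMemberVector {
    int valueCount;
}

// Hypothetical stand-in for a union vector: a list of declared subtypes plus
// one member vector per subtype that actually has a vector representation.
class SketchUnionVector {
    private final List<SketchMinorType> subTypes = new ArrayList<>();
    private final Map<SketchMinorType, SketchMemberVector> members =
        new EnumMap<>(SketchMinorType.class);

    // Records the subtype. LATE has no backing vector, so no member is created for it.
    void addSubType(SketchMinorType type) {
        subTypes.add(type);
        if (type != SketchMinorType.LATE) {
            members.put(type, new SketchMemberVector());
        }
    }

    // Mirrors the reported failure mode: asking for a LATE member throws.
    SketchMemberVector getMember(SketchMinorType type) {
        SketchMemberVector member = members.get(type);
        if (member == null) {
            throw new UnsupportedOperationException("No vector for type " + type);
        }
        return member;
    }

    // The workaround: special-case LATE instead of asking for its member.
    void setValueCount(int count) {
        for (SketchMinorType type : subTypes) {
            if (type == SketchMinorType.LATE) {
                continue;
            }
            getMember(type).valueCount = count;
        }
    }
}
```

      This only hides the symptom: the LATE subtype is still present in the union's type, which is why the longer-term fix is preferable.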

      The problem appears to occur here:

      Daemon Thread [2240a19e-344e-9a8b-f3d9-2a1550662b1b:frag:0:0] (Suspended (breakpoint at line 2091 in TypeProtos$MajorType$Builder))	
      	TypeProtos$MajorType$Builder.addSubType(TypeProtos$MinorType) line: 2091	
      	DefaultReturnTypeInference.getType(List<LogicalExpression>, FunctionAttributes) line: 58	
      	FunctionTemplate$ReturnType.getType(List<LogicalExpression>, FunctionAttributes) line: 195	
      	DrillSimpleFuncHolder(DrillFuncHolder).getReturnType(List<LogicalExpression>) line: 401	
      	DrillFuncHolderExpr.<init>(String, DrillFuncHolder, List<LogicalExpression>, ExpressionPosition) line: 39	
      	DrillSimpleFuncHolder(DrillFuncHolder).getExpr(String, List<LogicalExpression>, ExpressionPosition) line: 113	
      	ExpressionTreeMaterializer.addCastExpression(LogicalExpression, TypeProtos$MajorType, FunctionLookupContext, ErrorCollector, boolean) line: 235	
      	ExpressionTreeMaterializer$MaterializeVisitor(ExpressionTreeMaterializer$AbstractMaterializeVisitor).visitIfExpression(IfExpression, FunctionLookupContext) line: 638	
      	ExpressionTreeMaterializer$MaterializeVisitor(ExpressionTreeMaterializer$AbstractMaterializeVisitor).visitIfExpression(IfExpression, Object) line: 332	
      	IfExpression.accept(ExprVisitor<T,V,E>, V) line: 65	
      	ExpressionTreeMaterializer.materialize(LogicalExpression, Map<VectorAccessible,BatchReference>, ErrorCollector, FunctionLookupContext, boolean, boolean) line: 165	
      	ExpressionTreeMaterializer.materialize(LogicalExpression, VectorAccessible, ErrorCollector, FunctionLookupContext, boolean, boolean) line: 143	
      	ProjectRecordBatch.setupNewSchemaFromInput(RecordBatch) line: 482	
      	ProjectRecordBatch.setupNewSchema() line: 571	
      	ProjectRecordBatch(AbstractUnaryRecordBatch<T>).innerNext() line: 99	
      	ProjectRecordBatch.innerNext() line: 144	

      This appears to be processing the CASE expression (materialized as an IfExpression) in the following test query:

                    .sqlQuery("select case when is_bigint(field1) " +
                      "then field1 when is_list(field1) then field1[0] " +
                      "when is_map(field1) then t.field1.inner1 end f1 from cp.`jsoninput/union/a.json` t")

      The problem appears to be that a function says it takes data of type LATE, and that type is then added to the Union as a subtype. The exact solution is not yet clear, but simply omitting the LATE type from the Union seems to work.
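
      A minimal sketch of the "omit LATE" idea follows, assuming the union's subtypes are collected from the argument types of the function call. ArgType and UnionSubTypeSketch are hypothetical names made up for this example; this is not the actual code in DefaultReturnTypeInference.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for argument minor types; not Drill's TypeProtos.MinorType.
enum ArgType { BIGINT, LIST, MAP, LATE }

final class UnionSubTypeSketch {
    private UnionSubTypeSketch() { }

    // Collects the subtypes for a union return type, omitting LATE because
    // no vector exists for it. Insertion order is kept and duplicates are dropped.
    static Set<ArgType> unionSubTypes(List<ArgType> argTypes) {
        Set<ArgType> subTypes = new LinkedHashSet<>();
        for (ArgType type : argTypes) {
            if (type == ArgType.LATE) {
                continue;
            }
            subTypes.add(type);
        }
        return subTypes;
    }
}
```

      With argument types BIGINT, LATE and MAP, this yields a subtype set containing only BIGINT and MAP, so the later value-count step never sees LATE.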


