Here is an example
I turned off noconditionaltask (hive.auto.convert.join.noconditionaltask), so I expected 4 Map-only jobs for this query. However, I got 1 Map-only job (joining store_sales and date_dim) and 3 MR jobs (for reduce-side joins).
So, I checked the conditional task that determines the plan of the join involving item. In ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask, aliasToFileSizeMap contains all input tables used in this query, plus the intermediate table generated by joining store_sales and date_dim, rather than only the inputs of this particular join. As a result, when we sum the sizes of all small tables, the size of store_sales (around 45GB in my test) is also counted, so the sum is too large for the join to be converted to a map join.
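
To illustrate the effect, here is a minimal, self-contained sketch of the size check. It is not Hive's actual code: the class name, the method names, the "$INTNAME" alias for the intermediate table, the sizes other than the 45GB for store_sales, and the 25MB threshold are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, NOT the real ConditionalResolverCommonJoin logic.
public class SmallTableSizeSketch {

    // Sum the sizes of every alias in the map except the candidate big table.
    static long sumOfSmallTables(Map<String, Long> aliasToFileSize, String bigTableAlias) {
        long sum = 0L;
        for (Map.Entry<String, Long> e : aliasToFileSize.entrySet()) {
            if (!e.getKey().equals(bigTableAlias)) {
                sum += e.getValue();
            }
        }
        return sum;
    }

    // A map join is chosen only if all small tables together fit under the threshold.
    static boolean fitsMapJoin(Map<String, Long> aliasToFileSize, String bigTableAlias, long threshold) {
        return sumOfSmallTables(aliasToFileSize, bigTableAlias) <= threshold;
    }

    public static void main(String[] args) {
        long threshold = 25_000_000L; // assumed small-table threshold (25MB)

        // What I observed: the map holds every table in the query.
        Map<String, Long> allAliases = new LinkedHashMap<>();
        allAliases.put("store_sales", 45_000_000_000L); // ~45GB, from my test
        allAliases.put("date_dim", 10_000_000L);        // assumed size
        allAliases.put("item", 5_000_000L);             // assumed size
        allAliases.put("$INTNAME", 1_000_000_000L);     // assumed intermediate result size

        // What the join with item actually reads: the intermediate table and item.
        Map<String, Long> joinInputsOnly = new LinkedHashMap<>();
        joinInputsOnly.put("item", 5_000_000L);
        joinInputsOnly.put("$INTNAME", 1_000_000_000L);

        System.out.println(fitsMapJoin(allAliases, "$INTNAME", threshold));
        System.out.println(fitsMapJoin(joinInputsOnly, "$INTNAME", threshold));
    }
}
```

With the full map, store_sales is summed as if it were a small table and the check fails. With only the inputs of this join, item alone is counted and the check passes.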