It used to be that a script like the one below was rejected by the parser:
filtered = filter my_relation by my_relation.x > 2;
We now try to actually evaluate that, treating my_relation as a scalar to be loaded up while iterating over my_relation.
We should instead suggest that the user probably wanted to write
filtered = filter my_relation by x > 2;
A similar problem occurs in this code:
joined = join a by id, b by id;
projected = foreach joined generate a.id;
Naturally, the user actually meant
projected = foreach joined generate a::id;
Instead of erroring out, we currently generate massive plans that involve lots of splits (I saw a 5-line script filled with this syntax mistake generate 12 jobs!) and eventually fail with "Scalar has more than one row in the output", which doesn't help a user who is not advanced enough to know about scalars.
This is extra confusing to people coming from a SQL background, who are of course extremely used to referring to their tables' fields this way.
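For contrast, here is a minimal sketch of the case the relation.field scalar syntax is presumably meant for: projecting a field out of a relation that is known to contain exactly one row. The aliases grouped, totals, and scaled and the field names n and fraction are made up for illustration; my_relation and x are taken from the example above.

```pig
-- Hypothetical aliases: grouped, totals, scaled. Only my_relation and x come from the report.
grouped = group my_relation all;
-- totals has exactly one row, so it is safe to use as a scalar.
totals = foreach grouped generate COUNT(my_relation) as n;
-- totals.n is a scalar projection: one value, reused for every row of my_relation.
scaled = foreach my_relation generate x, (double) x / totals.n as fraction;
```

Here the scalar comes from a different, single-row relation. In the failing scripts above, the relation being projected is the multi-row input itself (or one side of the join), which is why the run ends with "Scalar has more than one row in the output".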