The patch does improve performance here, but it introduces a bug. I ran the test on a 25-node cluster with the following script:
A = load '/tpch/orders' USING PigStorage('\u0001') AS (o_orderkey:int, o_custkey:int, o_orderstatus:chararray, o_totalprice:double, o_orderdate:chararray, o_orderpriority:chararray, o_clerk:chararray, o_shippriority:int, o_comment: chararray);
F = FOREACH A GENERATE o_orderkey;
L = LIMIT F 10;
||job cost time||HDFS bytes read||Average time taken by Map tasks||Worst performing map task||
|26 sec|1 sec|24 sec|5 sec|
With your patch, the LimitOptimizer removes LOLimit from the logical plan after pushing the limit into LOLoad, so the script compiles to a map-only job. Each map task then applies the limit independently, and the result contains up to map_num * 10 records instead of 10, which is incorrect.
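To make the record count concrete, here is a minimal Python sketch of the behaviour described above. It is not Pig code: the function names, the three splits, and their contents are hypothetical and only model "limit applied per map task" versus "limit applied once more over the merged output".

```python
def map_only_limit(splits, limit):
    """Model a map-only job: each map task limits its own split, nothing merges them."""
    output = []
    for split in splits:
        # Every map task independently keeps at most `limit` records.
        output.extend(split[:limit])
    return output


def limit_with_final_step(splits, limit):
    """Model a per-task limit followed by one global limit over the combined output."""
    return map_only_limit(splits, limit)[:limit]


# Hypothetical input: 3 map tasks, 100 order keys each.
splits = [list(range(i * 100, (i + 1) * 100)) for i in range(3)]

print(len(map_only_limit(splits, 10)))         # map_num * 10 records
print(len(limit_with_final_step(splits, 10)))  # 10 records
```

With 3 map tasks and a limit of 10, the map-only version returns 30 records, while the version that keeps a final limit step returns 10.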
I will submit a patch soon.