Description
I'm using TPC-DS to run benchmarks, and after running ANALYZE TABLE (without the FOR ALL COLUMNS clause) some queries became really slow. For example, query24 (https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql) takes 10 to 15 minutes before running ANALYZE TABLE.
After running ANALYZE TABLE, the same query was still running after 24 hours, at which point I cancelled the execution.
If I disable either spark.sql.cbo.joinReorder.enabled or spark.sql.cbo.enabled, the query becomes fast again.
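For anyone trying the workaround, here is a minimal sketch of how the two flags could be turned off with Spark SQL SET statements. The flag names are the ones mentioned above; the helper name disable_statements is hypothetical and only builds the statement strings, it does not talk to a cluster.

```python
# Flag names are taken from the report; the helper itself is hypothetical.
CBO_FLAGS = ("spark.sql.cbo.enabled", "spark.sql.cbo.joinReorder.enabled")


def disable_statements(flags=CBO_FLAGS):
    """Build one Spark SQL SET statement per flag, turning each flag off."""
    return [f"SET {flag}=false" for flag in flags]


# Disabling only join reordering keeps the rest of the CBO active.
join_reorder_only = disable_statements(["spark.sql.cbo.joinReorder.enabled"])
```

Each returned string could then be passed to spark.sql(...) on an existing SparkSession before re-running the query. Disabling only spark.sql.cbo.joinReorder.enabled is the narrower of the two changes.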
It seems join reordering does not work well when table-level stats are available but column-level stats are not.
Row counts:
store_sales - 2879966589
store_returns - 288009578
store - 1002
item - 300000
customer - 12000000
customer_address - 6000000
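To make the stats state easier to reproduce, the following sketch generates the ANALYZE TABLE statements for the tables listed above, with and without column stats. It assumes the tables are registered under exactly these names in the current database; the helper name analyze_statements is hypothetical, and whether FOR ALL COLUMNS is accepted depends on the Spark version in use.

```python
# Table names as listed in the row counts above.
TABLES = [
    "store_sales",
    "store_returns",
    "store",
    "item",
    "customer",
    "customer_address",
]


def analyze_statements(tables, all_columns=False):
    """Build ANALYZE TABLE statements.

    all_columns=False collects table-level stats only (the slow case in
    this report); all_columns=True also requests column-level stats.
    """
    suffix = " FOR ALL COLUMNS" if all_columns else ""
    return [f"ANALYZE TABLE {t} COMPUTE STATISTICS{suffix}" for t in tables]


# Table-level stats only, matching what was run before the slowdown.
table_only = analyze_statements(TABLES)
```

Running the table_only statements through spark.sql(...) should recreate the state described here, while analyze_statements(TABLES, all_columns=True) would give the variant with column stats for comparison.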