Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Be more conservative about selecting in-mem cubing.
There were reports that in-mem cubing was selected but ran slowly (or even timed out) on a few small PoC clusters. Apart from the limited resources of a PoC cluster, another important cause is that the algorithm selection did not consider the scale of the job. When there are many mappers but few mapper slots, it takes many rounds to finish all the mappers, and the total time of in-mem cubing grows accordingly.
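To make the "many rounds" effect concrete, here is a rough sketch of the arithmetic. It assumes every slot runs one mapper at a time and that all mappers take about the same time; the class and method names are hypothetical and not part of Kylin.

```java
public final class MapperRounds {
    private MapperRounds() {}

    /**
     * Estimates how many scheduling rounds are needed to run all mappers
     * when only `slots` of them can run concurrently (ceiling division).
     * Simplifying assumption: one mapper per slot, uniform mapper duration.
     */
    public static int rounds(int mappers, int slots) {
        if (mappers < 0 || slots <= 0) {
            throw new IllegalArgumentException("mappers must be >= 0 and slots must be > 0");
        }
        return (mappers + slots - 1) / slots;
    }

    public static void main(String[] args) {
        // Small PoC cluster: 1000 mappers competing for 20 slots.
        System.out.println(rounds(1000, 20));   // 50 rounds
        // Larger cluster: the same job with 500 slots.
        System.out.println(rounds(1000, 500));  // 2 rounds
    }
}
```

Under these assumptions the same job needs 25 times as many rounds on the small cluster, which is the kind of slowdown the reports describe.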
The scale of the job should be taken into account when auto-selecting the cubing algorithm. To that end, a mapper limit configuration is introduced that prevents in-mem cubing from being selected when the number of mappers exceeds the threshold.
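The guard described above might look roughly like the following. This is only an illustrative sketch, not the actual patch: the class name, the `select` method, the `mapperLimit` parameter, and the `LAYER` fallback are assumptions made for the example, and the real configuration key is not shown in this issue.

```java
public final class CubingAlgorithmSelector {
    private CubingAlgorithmSelector() {}

    /** Hypothetical algorithm choices; LAYER stands in for the non-in-mem fallback. */
    public enum Algorithm { LAYER, INMEM }

    /**
     * Applies the mapper limit on top of whatever the auto-selection preferred.
     * In-mem cubing is rejected when the job needs more mappers than the limit.
     *
     * @param preferred   algorithm chosen by the existing auto-selection logic
     * @param mapperCount estimated number of mappers for the job
     * @param mapperLimit configured threshold (hypothetical parameter)
     */
    public static Algorithm select(Algorithm preferred, int mapperCount, int mapperLimit) {
        if (preferred == Algorithm.INMEM && mapperCount > mapperLimit) {
            return Algorithm.LAYER; // job is too large for in-mem cubing
        }
        return preferred;
    }

    public static void main(String[] args) {
        System.out.println(select(Algorithm.INMEM, 800, 500)); // LAYER
        System.out.println(select(Algorithm.INMEM, 100, 500)); // INMEM
    }
}
```

The check only ever downgrades an in-mem choice; a job that was already assigned the fallback algorithm is left untouched.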
Issue Links
- relates to KYLIN-1656 Improve performance of MRv2 engine by making each mapper handles a configured number of records (Closed)