Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Although the fs-based stats collection mechanism has been the default for the last few releases, tests still use JDBC for stats collection. The main advantage of fs-based collection over JDBC-based collection is scalability. In the JDBC case, a single database (normally co-located with the metastore relational database) handles all the stats collected by all the tasks. This single database is responsible for maintaining the consistency of the stats, so it becomes a bottleneck and faces scalability issues when the number of tasks is huge. In the fs case, each task writes its own stats to HDFS, which does not have this scalability issue.
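The fs-based pattern described above can be illustrated with a minimal sketch. This is not Hive's implementation: the function names, the JSON file layout, and the use of a local temporary directory in place of HDFS are all assumptions made for illustration. It only shows the idea that each task writes to its own file, so no shared store has to serialize concurrent writers, and one reader aggregates afterwards.

```python
import json
import tempfile
from pathlib import Path


def publish_task_stats(stats_dir: Path, task_id: str, stats: dict) -> Path:
    """Hypothetical publisher: each task writes its own file, so tasks
    never contend on a shared database row or table."""
    path = stats_dir / f"{task_id}.json"
    path.write_text(json.dumps(stats))
    return path


def aggregate_stats(stats_dir: Path) -> dict:
    """Hypothetical aggregator: a single reader sums the per-task files
    after all tasks have finished."""
    totals: dict = {}
    for path in sorted(stats_dir.glob("*.json")):
        for key, value in json.loads(path.read_text()).items():
            totals[key] = totals.get(key, 0) + value
    return totals


def run_demo() -> dict:
    # A local temp directory stands in for an HDFS staging directory.
    with tempfile.TemporaryDirectory() as tmp:
        stats_dir = Path(tmp)
        for i in range(4):
            # Illustrative stat names and values only.
            publish_task_stats(
                stats_dir,
                f"task_{i}",
                {"numRows": 10 * (i + 1), "rawDataSize": 100},
            )
        return aggregate_stats(stats_dir)
```

In the JDBC-style alternative, every `publish_task_stats` call would instead be an insert or update against one shared database, which is where contention grows with the number of tasks.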
Attachments