HIVE-12010: Tests should use FileSystem based stats collection mechanism

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: Statistics
    • Labels: None

    Description

      Although the fs-based stats collection mechanism has been the default for the last few releases, tests still use jdbc for stats collection. The main advantage of fs-based collection over jdbc-based collection is scalability. In the jdbc case, a single database (normally co-located with the metastore relational database) handles all the stats collected by all the tasks. This single database is responsible for maintaining the consistency of the stats, so it becomes a bottleneck when the number of tasks is huge. In the fs case, each task writes its own stats to HDFS, which does not have this scalability issue.
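
      The difference can be illustrated with a minimal sketch. It is not Hive's implementation: the class `FsStatsSketch`, its methods, and the `task-N.stats` file naming are all hypothetical, and a local temporary directory stands in for HDFS. Each simulated task writes its row count to its own file, so the tasks never contend on a shared writer, and a single pass sums the files afterwards. (In Hive itself the mechanism is selected through the `hive.stats.dbclass` property, which the sketch does not touch.)

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.IntStream;

// Hypothetical illustration only; names and file layout are assumptions.
class FsStatsSketch {

    // Each task publishes its row count to a private file: no shared writer.
    static void publish(Path statsDir, int taskId, long rowCount) {
        try {
            Path file = statsDir.resolve("task-" + taskId + ".stats");
            Files.write(file, Long.toString(rowCount).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // After all tasks finish, one reader sums the per-task files.
    static long aggregate(Path statsDir) {
        long total = 0;
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(statsDir, "task-*.stats")) {
            for (Path f : files) {
                String text = new String(Files.readAllBytes(f), StandardCharsets.UTF_8);
                total += Long.parseLong(text.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Runs the given number of tasks in parallel, then aggregates.
    // The temporary directory is left in place for inspection.
    static long run(int tasks, long rowsPerTask) {
        try {
            Path dir = Files.createTempDirectory("fs-stats-sketch");
            IntStream.range(0, tasks).parallel()
                     .forEach(t -> publish(dir, t, rowsPerTask));
            return aggregate(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(8, 100)); // prints 800
    }
}
```

      In the jdbc-style alternative, every `publish` call would instead go through one shared database connection, which is the serialization point the description identifies as the bottleneck.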

      Attachments

        1. HIVE-12010.patch
          1 kB
          Ashutosh Chauhan
        2. HIVE-12010.1.patch
          158 kB
          Ashutosh Chauhan
        3. HIVE-12010.2.patch
          159 kB
          Ashutosh Chauhan
        4. HIVE-12010.3.patch
          159 kB
          Ashutosh Chauhan
        5. HIVE-12010.4.patch
          162 kB
          Ashutosh Chauhan

            People

              Assignee: Ashutosh Chauhan
              Reporter: Ashutosh Chauhan