HIVE-9010

Introduce Hive perf configuration utility

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Configuration
    • Labels: None

    Description

      Recently, many major perf features have been added (or are being added) to Hive, such as vectorization, CBO, Tez, and Spark.
      These are off by default, so customers using the Apache distribution may not be aware of them and may not take advantage of all the speed Hive can offer.

      We can create a Hive perf configuration utility that sets 6-10 important, easy-to-set settings. Admins or users can run it when deploying Hive or on an existing cluster. Ideally it would cover all the no-brainer set-to-true settings, with any caveats described. A few other settings may be included too, but we don't want to add any tuning options, because the whole point is to be less confusing than editing the entire config file. Unless we have automatic tuning at some point, users doing perf tuning can edit the config file manually (or use set) after reading the docs.

      Then we can mention it prominently in the docs and release notes. This should go a long way towards making sure users can utilize Hive to its full potential, without us enabling large/perf features by default, at least until they are stable (e.g. CBO can be enabled by default, so this tool may note that).

      Settings for experimental features (true/false or similarly simple values) can also be added in a separate section.
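
      To make the proposal concrete, here is a minimal Python sketch of what such a utility could look like. It is not an existing Hive tool. The names PerfSetting, RECOMMENDED, as_set_statements and as_site_xml are hypothetical, the caveat strings are placeholders, and the choice of which entry counts as "experimental" is made only to show the separate section. The property keys (hive.vectorized.execution.enabled, hive.cbo.enable, hive.execution.engine) are real Hive settings, but the selection is illustrative, not a vetted recommendation list.

```python
from dataclasses import dataclass
from typing import List
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class PerfSetting:
    """One easy-to-set perf option (hypothetical structure)."""
    key: str
    value: str
    caveat: str = ""            # short note shown to the admin, if any
    experimental: bool = False  # goes into the separate experimental section


# Illustrative selection only; caveats and the experimental flag are assumptions.
RECOMMENDED: List[PerfSetting] = [
    PerfSetting("hive.vectorized.execution.enabled", "true",
                "assumed caveat: benefit depends on the table file format"),
    PerfSetting("hive.cbo.enable", "true",
                "assumed caveat: works best when table/column statistics exist"),
    PerfSetting("hive.execution.engine", "tez",
                "assumed caveat: Tez must be installed on the cluster",
                experimental=True),
]


def _selected(settings: List[PerfSetting], include_experimental: bool):
    """Yield the settings to emit, skipping experimental ones unless requested."""
    for s in settings:
        if s.experimental and not include_experimental:
            continue
        yield s


def as_set_statements(settings: List[PerfSetting],
                      include_experimental: bool = False) -> str:
    """Render the settings as `set key=value;` lines, caveats as -- comments."""
    lines = []
    for s in _selected(settings, include_experimental):
        if s.caveat:
            lines.append(f"-- {s.caveat}")
        lines.append(f"set {s.key}={s.value};")
    return "\n".join(lines)


def as_site_xml(settings: List[PerfSetting],
                include_experimental: bool = False) -> str:
    """Render the settings as a hive-site.xml style <configuration> fragment."""
    props = []
    for s in _selected(settings, include_experimental):
        props.append(
            "  <property>\n"
            f"    <name>{escape(s.key)}</name>\n"
            f"    <value>{escape(s.value)}</value>\n"
            "  </property>"
        )
    return "<configuration>\n" + "\n".join(props) + "\n</configuration>"


if __name__ == "__main__":
    print(as_set_statements(RECOMMENDED))
    print(as_site_xml(RECOMMENDED, include_experimental=True))
```

      Keeping the list small and fixed, with no numeric tuning knobs, matches the intent above: the tool only toggles settings, and anything beyond that is left to manual edits of the config file or to set.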

          People

            Assignee: Unassigned
            Reporter: Sergey Shelukhin (sershe)
            Votes: 0
            Watchers: 3
