Hive / HIVE-28196

Preserve column stats when applying UDF upper/lower.


Details

    Description

      Currently, Hive re-estimates column stats (including avgColLen) whenever it encounters a UDF.
      For upper and lower, Hive sets avgColLen to hive.stats.max.variable.length.
      However, these UDFs do not change column stats, and the default value (100) is too high for string-typed key columns, which is where upper/lower are usually applied.

      This patch keeps the input column's avgColLen after upper/lower is applied, so that the optimizer can produce a better query plan.
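
      The rule described above can be illustrated with a minimal, self-contained sketch. It is not Hive's actual code: the class AvgColLenSketch, the method estimateAvgColLen, and the sample avgColLen of 12 are hypothetical names and values chosen for illustration. It assumes only what the description states, namely that upper/lower should pass the input avgColLen through, while other UDFs fall back to hive.stats.max.variable.length (default 100).

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the estimation rule; these are not Hive's real classes.
final class AvgColLenSketch {
    // UDFs that, per the issue description, leave column stats unchanged.
    private static final Set<String> STATS_PRESERVING = Set.of("upper", "lower");

    /**
     * Estimates avgColLen for the output of a UDF applied to a string column.
     *
     * @param udfName           name of the UDF (illustrative; matched case-insensitively)
     * @param inputAvgColLen    avgColLen of the input column
     * @param maxVariableLength value of hive.stats.max.variable.length
     */
    static double estimateAvgColLen(String udfName, double inputAvgColLen,
                                    double maxVariableLength) {
        if (STATS_PRESERVING.contains(udfName.toLowerCase(Locale.ROOT))) {
            // Proposed behavior: keep the input column's avgColLen.
            return inputAvgColLen;
        }
        // Fallback described in the issue: use the configured maximum.
        return maxVariableLength;
    }

    public static void main(String[] args) {
        double fallback = 100.0; // default cited in the description
        double keyLen = 12.0;    // made-up avgColLen for a short string key
        System.out.println(estimateAvgColLen("upper", keyLen, fallback));
        System.out.println(estimateAvgColLen("some_other_udf", keyLen, fallback));
    }
}
```

      With these made-up numbers, the fallback would overestimate the per-value width of the key column by roughly 8x (100 versus 12), which is the kind of inflation the description says leads to a worse plan.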


          People

            seonggon Seonggon Namgung
