CALCITE-4351

RelMdUtil#numDistinctVals always returns 0 for large inputs

Details

    Description

      The previous implementation of RelMdUtil#numDistinctVals used the approximation ln(1 + x) ~= x, which holds when x is small.

      However, CALCITE-4132 removed this approximation to make the result more accurate. As a side effect, the function now loses floating-point precision and returns an incorrect result for large inputs: for example, when domainSize = 1e18 and numSelected = 1e10, the result is 0.
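
      The effect can be reproduced outside Calcite with a small standalone sketch. It assumes the exact estimate has the form n * (1 - (1 - 1/n)^k) and the approximated one the form n * (1 - e^(-k/n)), where n is domainSize and k is numSelected. The class and method names below are hypothetical and are not Calcite's actual code. Under that assumption, 1 - 1/n rounds to exactly 1.0 in double precision once n is around 1e18, so the power term is 1 and the estimate collapses to 0.

```java
// Standalone illustration; NOT the Calcite source. Names are hypothetical.
final class NumDistinctValsPrecisionDemo {

  // Assumed exact form: n * (1 - (1 - 1/n)^k).
  static double exact(double domainSize, double numSelected) {
    return (1.0 - Math.pow(1.0 - 1.0 / domainSize, numSelected)) * domainSize;
  }

  // Assumed approximated form: n * (1 - e^(-k/n)),
  // obtained by replacing ln(1 - 1/n) with -1/n.
  static double approximated(double domainSize, double numSelected) {
    return (1.0 - Math.exp(-numSelected / domainSize)) * domainSize;
  }

  public static void main(String[] args) {
    double domainSize = 1e18;
    double numSelected = 1e10;
    // 1 - 1/1e18 is not representable as a double and rounds to 1.0.
    System.out.println("1 - 1/n      = " + (1.0 - 1.0 / domainSize));
    System.out.println("exact        = " + exact(domainSize, numSelected));
    System.out.println("approximated = " + approximated(domainSize, numSelected));
  }
}
```

      With these inputs the exact form yields 0, while the approximated form stays close to numSelected, which is the expected answer when the domain is far larger than the number of selected values.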

      My suggestion is to treat small and large inputs differently: for small inputs, use the new, more precise function; for large inputs, use the old, approximated one.
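
      A minimal sketch of this split is shown here, assuming the same two formulas as in the description (exact: n * (1 - (1 - 1/n)^k), approximated: n * (1 - e^(-k/n))). The class name, the method signature on primitive doubles, and the cut-over value LARGE_DOMAIN_THRESHOLD are hypothetical choices made for illustration; the issue does not specify where the boundary between small and large inputs should lie.

```java
// Illustrative sketch only; NOT the actual Calcite fix. Names are hypothetical.
final class HybridNumDistinctVals {

  // Hypothetical cut-over point between the two formulas (an assumption).
  static final double LARGE_DOMAIN_THRESHOLD = 1e7;

  static double numDistinctVals(double domainSize, double numSelected) {
    if (domainSize <= 0) {
      return 0;
    }
    double fractionSkipped;
    if (domainSize < LARGE_DOMAIN_THRESHOLD) {
      // Small domain: 1 - 1/n is representable well enough, use the exact form.
      fractionSkipped = Math.pow(1.0 - 1.0 / domainSize, numSelected);
    } else {
      // Large domain: 1 - 1/n would lose most or all of its significant
      // digits, so fall back to the ln(1 + x) ~= x approximation.
      fractionSkipped = Math.exp(-numSelected / domainSize);
    }
    return (1.0 - fractionSkipped) * domainSize;
  }

  public static void main(String[] args) {
    System.out.println(numDistinctVals(100, 80));     // exact branch
    System.out.println(numDistinctVals(1e18, 1e10));  // approximated branch
  }
}
```

      The approximation is most accurate exactly where the exact form breaks down, because the error of ln(1 + x) ~= x shrinks as x = -1/domainSize approaches 0.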

            People

              fan_li_ya Liya Fan
              TsReaper Caizhi Weng
