Kylin / KYLIN-1379

More stable and functional precise count distinct implementation after KYLIN-1186

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.3.0, v1.5.0
    • Fix Version/s: v1.5.3
    • Component/s: Job Engine
    • Labels:
      None

      Description

      After KYLIN-1186, Kylin can count distinct values of Int-type columns precisely.
      However, the KYLIN-1186 implementation is not stable, especially on the 2.x-staging branch.
      The reason is that the 2.x version uses a measure's maxlength to allocate memory, and KYLIN-1186 hardcodes the BitmapMeasure to 8MB, which causes OOM during cube building.
      To resolve this problem, we have introduced a precision on the bitmap measure, such as bitmap(100), bitmap(10000), or bitmap(1000000), meaning the measure accepts a cardinality of at most 100, 10000, or 1M. This should be sufficient in practice: if the count exceeds 1000000, the approximate result produced by the hyperloglog measure should be acceptable.
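
      The capped-cardinality idea can be illustrated with a minimal Java sketch. This is not Kylin's BitmapMeasure code: the class name BoundedBitmapCounter, its methods, and the use of a plain java.util.BitSet in place of a compressed bitmap are all assumptions made for illustration. Only the bitmap(N) declaration format is taken from the description.

```java
import java.util.BitSet;

// Hypothetical sketch, not Kylin's implementation: a precise distinct counter
// for non-negative int values with a declared cardinality cap, the N in bitmap(N).
final class BoundedBitmapCounter {
    private final int maxCardinality;          // upper bound on distinct values
    private final BitSet bits = new BitSet();  // stand-in for a compressed bitmap
    private int cardinality = 0;

    BoundedBitmapCounter(int maxCardinality) {
        if (maxCardinality <= 0) {
            throw new IllegalArgumentException("maxCardinality must be positive");
        }
        this.maxCardinality = maxCardinality;
    }

    // Parses a declaration such as "bitmap(10000)" (format assumed from the issue text).
    static BoundedBitmapCounter fromDeclaration(String declaration) {
        String s = declaration.trim();
        String prefix = "bitmap(";
        if (!s.startsWith(prefix) || !s.endsWith(")")) {
            throw new IllegalArgumentException("expected bitmap(N), got: " + declaration);
        }
        int n = Integer.parseInt(s.substring(prefix.length(), s.length() - 1).trim());
        return new BoundedBitmapCounter(n);
    }

    // Records a value; rejects a new distinct value once the cap is reached.
    void add(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        if (bits.get(value)) {
            return; // already counted, cardinality unchanged
        }
        if (cardinality == maxCardinality) {
            throw new IllegalStateException(
                "cardinality exceeds bitmap(" + maxCardinality + ")");
        }
        bits.set(value);
        cardinality++;
    }

    int count() {
        return cardinality;
    }

    int maxCardinality() {
        return maxCardinality;
    }
}
```

      With a declared cap, the memory a measure needs can be bounded from N rather than from a fixed 8MB constant. A column whose cardinality exceeds the cap would instead use the approximate hyperloglog measure.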

              People

              • Assignee:
                sunyerui Yerui Sun
                Reporter:
                sunyerui Yerui Sun
              • Votes:
                0
                Watchers:
                4