  1. Kylin
  2. KYLIN-1379

More stable and functional precise count distinct implementation after KYLIN-1186


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.3.0, v1.5.0
    • Fix Version/s: v1.5.3
    • Component/s: Job Engine
    • Labels: None

    Description

      Since KYLIN-1186, we can count distinct values of Int-type columns precisely.
      However, the implementation in KYLIN-1186 is not stable, especially on the 2.x-staging branch.
      The reason is that the 2.x version uses the measure's maxlength to allocate memory, and KYLIN-1186 hardcodes the BitmapMeasure to 8MB, which causes OOM during cube building.
      To resolve this problem, we have introduced a precision on the bitmap measure, such as bitmap(100), bitmap(10000), or bitmap(1000000), meaning the measure accepts a cardinality of at most 100, 10,000, or 1M. This should be sufficient in practice: if the count exceeds 1,000,000, the approximate result produced by the hyperloglog measure should be acceptable.
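
      The idea of a precision-bounded measure can be sketched as follows. This is only an illustration, not Kylin's actual BitmapMeasure code: the class name BoundedDistinctCounter, the fromType helper, and the worstCaseBytes estimate (4 bytes per value) are all assumptions made for this example, and a HashSet stands in for the real bitmap.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a precise distinct counter whose capacity is
// declared in the type string, e.g. "bitmap(10000)".
class BoundedDistinctCounter {
    private static final Pattern TYPE = Pattern.compile("bitmap\\((\\d+)\\)");

    private final int maxCardinality;
    // Stand-in for a bitmap; the real measure would use a compressed bitmap.
    private final Set<Integer> values = new HashSet<>();

    BoundedDistinctCounter(int maxCardinality) {
        if (maxCardinality <= 0) {
            throw new IllegalArgumentException("precision must be positive");
        }
        this.maxCardinality = maxCardinality;
    }

    // Parse a type string such as "bitmap(100)" into a bounded counter.
    static BoundedDistinctCounter fromType(String dataType) {
        Matcher m = TYPE.matcher(dataType.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a bitmap type: " + dataType);
        }
        return new BoundedDistinctCounter(Integer.parseInt(m.group(1)));
    }

    // Add a value; duplicates are free, a new value beyond the limit is rejected.
    void add(int value) {
        if (values.contains(value)) {
            return;
        }
        if (values.size() >= maxCardinality) {
            throw new IllegalStateException(
                "cardinality exceeds declared precision " + maxCardinality);
        }
        values.add(value);
    }

    int count() {
        return values.size();
    }

    // Assumed rough upper bound on storage: one raw 4-byte int per distinct
    // value. The point is that the bound follows from the declared precision
    // instead of a fixed 8MB constant.
    long worstCaseBytes() {
        return 4L * maxCardinality;
    }
}
```

      With a declared precision, the buffer size used during cube building can be derived from the type (for example, BoundedDistinctCounter.fromType("bitmap(100)").worstCaseBytes()) rather than from a hardcoded maximum, which is the instability this issue describes.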

            People

              Assignee: Yerui Sun (sunyerui)
              Reporter: Yerui Sun (sunyerui)
              Votes: 0
              Watchers: 4
