  1. Kylin
  2. KYLIN-1624

HyperLogLogPlusCounter becomes inaccurate when there are billions of entries


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v1.5.2
    • Component/s: None
    • Labels: None

      Description

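      // Reproduction: 20 threads each add 500,000,000 random values to their own
      // counter (precision 14); the per-thread counters are then merged.
      // HyperLogLogPlusCounter is Kylin's class (its import is omitted here).
      import java.util.ArrayList;
      import java.util.Collections;
      import java.util.List;
      import java.util.Random;
      import java.util.concurrent.CountDownLatch;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      // Worker threads append concurrently, so the list must be thread-safe.
      final List<HyperLogLogPlusCounter> counters =
              Collections.synchronizedList(new ArrayList<HyperLogLogPlusCounter>());
      ExecutorService service = Executors.newFixedThreadPool(20);
      final CountDownLatch latch = new CountDownLatch(20);
      for (int i = 0; i < 20; i++) {
          service.submit(new Runnable() {
              @Override
              public void run() {
                  Random rand = new Random();
                  HyperLogLogPlusCounter counter = new HyperLogLogPlusCounter(14);
                  for (long j = 0; j < 500000000; j++) {
                      if (j % 1000000 == 1) {
                          System.out.println(j); // progress output
                      }
                      counter.add("" + rand.nextLong());
                  }
                  System.out.println("final" + counter.getCountEstimate());
                  counters.add(counter);
                  latch.countDown();
              }
          });
      }
      latch.await();
      service.shutdown(); // let the JVM exit once all workers have finished
      System.out.println("latch done");

      // Merge the per-thread counters into a single estimate.
      HyperLogLogPlusCounter ret = new HyperLogLogPlusCounter(14);
      for (HyperLogLogPlusCounter c : counters) {
          ret.merge(c);
      }
      System.out.println(ret.getCountEstimate());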

      The expected output is about 10 billion (20 threads × 500,000,000 entries each); however, the merged estimate can be less than 1 billion.
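
To put the gap in perspective, the following is a minimal sketch of the arithmetic behind the expectation. It assumes the textbook HyperLogLog relative standard error of roughly 1.04/sqrt(m) for m = 2^p registers; that figure is a general property of the algorithm, not something measured from Kylin's HyperLogLogPlusCounter. The class HllExpectation and its method names are hypothetical and exist only for this illustration.

```java
public class HllExpectation {

    // Total number of values the reproduction inserts across all threads.
    // The int is widened to long before multiplying, so there is no overflow.
    static long expectedCount(int threads, long perThread) {
        return threads * perThread;
    }

    // Textbook HyperLogLog relative standard error for 2^p registers
    // (assumption: Kylin's counter behaves like a standard HLL here).
    static double relativeStdError(int p) {
        return 1.04 / Math.sqrt((double) (1L << p));
    }

    public static void main(String[] args) {
        long expected = expectedCount(20, 500_000_000L);
        double err = relativeStdError(14);

        System.out.println("expected count: " + expected);
        System.out.printf("relative std error: %.4f%%%n", err * 100);
        System.out.printf("one-sigma band: %.0f .. %.0f%n",
                expected * (1 - err), expected * (1 + err));
    }
}
```

Under that assumption, a healthy counter with precision 14 should land within about one percent of 10 billion, so an estimate below 1 billion points to a systematic problem rather than ordinary estimation noise.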


            People

            • Assignee: liyang.gmt8@gmail.com liyang
            • Reporter: mahongbin Hongbin Ma
