Kylin / KYLIN-943

Approximate TopN supported by Cube


    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v1.4.0
    • Component/s: None
    • Labels:
      None

      Description

      The SpaceSaving (TopN algorithm) code can be copied from https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/stream/StreamSummary.java
      We don’t need the whole of stream-lib; one or two classes are enough. Make sure you give credit to stream-lib in the class comment.

      To run SpaceSaving in parallel, the TopN summaries computed on each partition have to be merged using the approach in http://arxiv.org/pdf/1401.0702.pdf. I searched and found no existing implementation, so we have to implement it ourselves.
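
      To make the parallel merge concrete, here is a minimal sketch of a SpaceSaving counter with a merge step. It is not the stream-lib code and not necessarily the exact procedure of the linked paper. The class name `SpaceSavingSketch` and all method names are hypothetical. The merge rule is an assumption: an item missing from one summary is credited with that summary's minimum count (0 if that summary is not full), the counts are summed, and only the largest `capacity` counters are kept.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, linear-scan minimum for brevity. */
class SpaceSavingSketch {
    private final int capacity;
    private final Map<String, Long> counts = new HashMap<>();

    SpaceSavingSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** SpaceSaving update: when full, the new item takes over the minimum counter. */
    void offer(String item, long increment) {
        Long current = counts.get(item);
        if (current != null) {
            counts.put(item, current + increment);
        } else if (counts.size() < capacity) {
            counts.put(item, increment);
        } else {
            String minKey = minEntry().getKey();
            long minCount = counts.remove(minKey);
            counts.put(item, minCount + increment); // inherits the evicted count
        }
    }

    /** Assumed merge rule: absent items are credited with the other side's minimum. */
    SpaceSavingSketch merge(SpaceSavingSketch other) {
        long m1 = this.minCountIfFull();
        long m2 = other.minCountIfFull();
        Map<String, Long> combined = new HashMap<>();
        for (Map.Entry<String, Long> e : this.counts.entrySet()) {
            Long c2 = other.counts.get(e.getKey());
            combined.put(e.getKey(), e.getValue() + (c2 != null ? c2 : m2));
        }
        for (Map.Entry<String, Long> e : other.counts.entrySet()) {
            if (!this.counts.containsKey(e.getKey())) {
                combined.put(e.getKey(), e.getValue() + m1);
            }
        }
        SpaceSavingSketch result = new SpaceSavingSketch(this.capacity);
        combined.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(result.capacity)
                .forEach(e -> result.counts.put(e.getKey(), e.getValue()));
        return result;
    }

    /** Estimated count; 0 if the item is not currently monitored. */
    long estimate(String item) {
        Long c = counts.get(item);
        return c == null ? 0L : c;
    }

    /** The n largest counters, in descending order of estimated count. */
    List<Map.Entry<String, Long>> topN(int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return entries.subList(0, Math.min(n, entries.size()));
    }

    private long minCountIfFull() {
        return counts.size() < capacity ? 0L : minEntry().getValue();
    }

    private Map.Entry<String, Long> minEntry() {
        Map.Entry<String, Long> min = null;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (min == null || e.getValue() < min.getValue()) min = e;
        }
        return min;
    }
}
```

      In this sketch each partition would build its own summary with `offer`, and the summaries would then be combined pairwise with `merge`. The estimates are upper bounds on the true counts. A real implementation would likely keep the counters ordered, as StreamSummary does, rather than scanning for the minimum.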

      Cheers
      Yang

      From: Li, Yang
      Sent: August 7, 2015 12:43
      To: DL-eBay-Kylin
      Subject: Distributed TopN papers

      The basic algorithm
      [1] https://icmi.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf

      Its application in distributed systems
      [2] http://www.cs.utah.edu/~jeffp/papers/merge-summ-TODS.pdf
      [3] http://www.crm.umontreal.ca/pub/Rapports/3300-3399/3322.pdf

      Cheers
      Yang


              People

              • Assignee:
                shaofengshi Shao Feng Shi
                Reporter:
                mahongbin Hongbin Ma