  1. Lucene - Core
  2. LUCENE-10425

count aggregation optimization inside one segment in log scenario

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New

    Description

      In the log scenario, we usually want to know the number of documents in each time interval. One possible optimization is to sort the documents within a segment in ascending order by the @timestamp field. Then we can use this PR, https://github.com/apache/lucene/pull/687, to find the min/max docId in one time interval.

      If there is no other filter query, the doc count of one time interval is (max docId - min docId + 1).
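
      The arithmetic above can be sketched in plain Java. This is only an illustration: a sorted long[] stands in for the @timestamp values of one index-sorted segment, with the array index playing the role of the docId. The class and method names are hypothetical, and the binary search is a stand-in for whatever mechanism the linked PR uses to locate the boundary docIds.

```java
// Sketch only: timestamps[i] is the @timestamp of docId i in one segment
// that is sorted in ascending @timestamp order (an assumption of this example).
final class SortedSegmentCount {

    // Returns the first index whose timestamp is >= target,
    // or timestamps.length if no such index exists.
    static int lowerBound(long[] timestamps, long target) {
        int lo = 0;
        int hi = timestamps.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (timestamps[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Doc count for the half-open time interval [from, to).
    static int countInInterval(long[] timestamps, long from, long to) {
        int minDocId = lowerBound(timestamps, from);   // first doc inside the interval
        int maxDocId = lowerBound(timestamps, to) - 1; // last doc inside the interval
        if (maxDocId < minDocId) {
            return 0; // no document falls into the interval
        }
        return maxDocId - minDocId + 1;
    }

    public static void main(String[] args) {
        long[] ts = {100, 105, 110, 110, 120, 130, 145};
        System.out.println(countInInterval(ts, 105, 130));
    }
}
```

      Because the segment is sorted, the matching documents form one contiguous docId range, so the count needs two boundary lookups rather than a visit to every matching document.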

      If there is exactly one additional term filter query, we can use this PR, https://github.com/apache/lucene/pull/688, to get the index difference: when we call advance(minId) and advance(maxId), the difference between the two index values is also the doc count of that time interval.
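
      The term-filter case can be sketched the same way. Here a sorted int[] stands in for the term's postings list, and advanceIndex is a hypothetical helper that returns the position the iterator would land on after an advance call. It is not the API added by the linked PR. The sketch advances to maxDocId + 1 rather than maxDocId so that a matching document at maxDocId itself is counted; that boundary handling is an assumption of this example.

```java
// Sketch only: postings holds the ascending docIds that match one term
// within a single segment.
final class FilteredSegmentCount {

    // Returns the position of the first posting >= target,
    // or postings.length if every posting is smaller.
    static int advanceIndex(int[] postings, long target) {
        int lo = 0;
        int hi = postings.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (postings[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of postings whose docId lies in the inclusive range [minDocId, maxDocId].
    static int countInDocRange(int[] postings, int minDocId, int maxDocId) {
        if (maxDocId < minDocId) {
            return 0;
        }
        int start = advanceIndex(postings, minDocId);
        int end = advanceIndex(postings, (long) maxDocId + 1); // long avoids int overflow
        return end - start;
    }

    public static void main(String[] args) {
        int[] postings = {0, 2, 3, 5, 8, 9};
        System.out.println(countInDocRange(postings, 2, 8));
    }
}
```

      The docId range [minDocId, maxDocId] would come from the timestamp lookup on the sorted segment; the postings list is then touched only at the two boundaries.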

       


          People

            Assignee: Unassigned
            Reporter: jianping weng (wjp719)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 4h 50m