  1. Lucene - Core
  2. LUCENE-9791

Monitor (aka Luwak) has concurrency issues related to BytesRefHash#find

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 9.0, 8.7, 8.8
    • Fix Version/s: 9.0, 8.9
    • Component/s: core/other
    • Labels: None
    • Lucene Fields: New

    Description

      org.apache.lucene.monitor.Monitor can sometimes fail to match a document that should be matched by one of the registered queries when match operations are run concurrently from multiple threads.

      This happens because, under concurrent use, TermFilteredPresearcher can fail to select a query that would have matched one of the documents being matched.

      Internally, TermFilteredPresearcher uses a term acceptor: an instance of org.apache.lucene.monitor.QueryIndex.QueryTermFilter. QueryTermFilter is correctly initialized under a lock, and its internal state (a map of org.apache.lucene.util.BytesRefHash instances) is correctly published. Later on, when those instances are used concurrently, a problem with org.apache.lucene.util.BytesRefHash#find is triggered, because that method is not thread-safe.

      org.apache.lucene.util.BytesRefHash#find internally uses the private org.apache.lucene.util.BytesRefHash#equals method, which uses the instance field scratch1 as a temporary buffer to compare its BytesRef parameter with the contents of the ByteBlockPool. This is not thread-safe and can cause incorrect answers as well as an ArrayIndexOutOfBoundsException.
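
      The shared-scratch-buffer hazard described above can be illustrated with a small stand-alone sketch. This is not Lucene code: ScratchBufferSketch, findUnsafe, findSafe and countWrongAnswers are hypothetical names, the table uses a linear scan instead of hashing, and it assumes a non-empty set of distinct terms. It only shows why a lookup that writes to an instance field is unsafe for concurrent readers, and one possible way to avoid the shared state. Because this toy swaps a whole array reference rather than updating offset and length separately, it can only produce wrong answers, not an out-of-bounds exception.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy term table (hypothetical, NOT Lucene's BytesRefHash). */
final class ScratchBufferSketch {
    private final byte[][] pool;          // stored terms, read-only after construction
    private byte[] scratch = new byte[0]; // shared temporary, playing the role of scratch1

    ScratchBufferSketch(String... terms) {
        pool = new byte[terms.length][];
        for (int i = 0; i < terms.length; i++) {
            pool[i] = terms[i].getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Racy: another thread may overwrite scratch between the copy and the compare. */
    private boolean equalsShared(int id, byte[] term) {
        scratch = Arrays.copyOf(pool[id], pool[id].length);
        return Arrays.equals(scratch, term);
    }

    /** Safe for concurrent readers: touches no shared mutable state. */
    private boolean equalsLocal(int id, byte[] term) {
        return Arrays.equals(pool[id], term);
    }

    /** Returns the id of the term, or -1 if absent; unsafe under concurrency. */
    int findUnsafe(String term) {
        byte[] bytes = term.getBytes(StandardCharsets.UTF_8);
        for (int id = 0; id < pool.length; id++) {
            if (equalsShared(id, bytes)) return id;
        }
        return -1;
    }

    /** Same contract as findUnsafe, but without the shared scratch field. */
    int findSafe(String term) {
        byte[] bytes = term.getBytes(StandardCharsets.UTF_8);
        for (int id = 0; id < pool.length; id++) {
            if (equalsLocal(id, bytes)) return id;
        }
        return -1;
    }

    /** Looks up known terms from several threads and counts lookups that returned the wrong id. */
    static int countWrongAnswers(ScratchBufferSketch table, boolean safe, int threads, int iterations) {
        AtomicInteger wrong = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int expectedId = t % table.pool.length;
            final String term = new String(table.pool[expectedId], StandardCharsets.UTF_8);
            executor.execute(() -> {
                for (int i = 0; i < iterations; i++) {
                    int found = safe ? table.findSafe(term) : table.findUnsafe(term);
                    if (found != expectedId) wrong.incrementAndGet();
                }
            });
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return wrong.get();
    }
}
```

      Calling countWrongAnswers(table, false, 4, 100_000) may or may not report wrong answers on a given run, since the race depends on thread timing. With safe set to true the count stays at zero, because the lookup only reads immutable data.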


      Attachments

        1. LUCENE-97910-8.x-backport.patch
          5 kB
          Paweł Bugalski
        2. LUCENE-9791.patch
          4 kB
          Paweł Bugalski
        3. LUCENE-9791_example.patch
          2 kB
          Robert Muir

          People

            Assignee: Unassigned
            Reporter: Paweł Bugalski (pbugalski_dynatrace)
            Votes: 1
            Watchers: 7

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 4h 20m