Apache Jena / JENA-2328

Query timeouts failing when plan phase is long

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Jena 4.6.0
    • Component/s: ARQ
    • Labels: None

    Description

      In a production service with a large TDB store (around 500MT) we find that some complex queries evade the query timeouts (set to 90s first result, 120s total) and then run for hours, soaking up all available CPU cores. While the queries show no clear pattern, and it has been hard to replicate in a controlled setting, we do now have one example which is expressible as a test case. See attached.

      The behaviour is that the abort() call from the alarm timeout is received by QueryExecDataset before there is an iterator to cancel: the QueryExecDataset instance is deep in getPlan(), which itself executes part of the query. In the specific example it's OpSlice which is iterating through the offset while still in the planning phase, though not all queries which cause this sort of behaviour use offsets.
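
      The race described above can be pictured with a toy model. This is only a sketch in plain Java: LostAbortDemo, plan() and the duringPlanning hook are hypothetical stand-ins invented for illustration and are not Jena classes or methods. It assumes an executor whose abort() can only act on an iterator that already exists, and a plan phase that does real work before that iterator is created.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an executor; this is NOT Jena's QueryExecDataset.
class LostAbortDemo {
    // The only thing abort() knows how to cancel; null until planning finishes.
    private volatile Iterator<Integer> results;
    private volatile boolean cancelled;

    // Mimics the timeout alarm: with no iterator yet, the signal is dropped.
    void abort() {
        if (results != null) {
            cancelled = true;
        }
    }

    // Mimics a plan phase that already does work, such as skipping an OFFSET.
    // Returns how many rows were skipped before the iterator existed.
    int plan(int offset, Runnable duringPlanning) {
        int skipped = 0;
        for (int i = 0; i < offset; i++) {
            if (i == offset / 2) {
                duringPlanning.run(); // the alarm fires in the middle of planning
            }
            skipped++;
        }
        results = List.of(1, 2, 3).iterator();
        return skipped;
    }

    boolean isCancelled() {
        return cancelled;
    }

    public static void main(String[] args) {
        LostAbortDemo exec = new LostAbortDemo();
        int skipped = exec.plan(1000, exec::abort);
        // The abort arrived mid-plan, yet all rows were skipped and nothing was cancelled.
        System.out.println("skipped=" + skipped + " cancelled=" + exec.isCancelled());
    }
}
```

      In this model the timeout fires exactly once, finds nothing to cancel, and is never retried, so the work continues unbounded.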

      Sorry but have no PR to offer at this stage. Have looked at whether it's possible to have getPlan() return some future or deferrable plan so that the top level exec has a handle on something that it can abort. However, the changes look far-reaching and I don't yet have a satisfactory approach to offer.
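
      One possible shape for such a handle, offered only as a sketch and not as the approach taken in Jena: create the cancellation signal before any planning starts and have the planning work poll it. DeferredPlan, QueryCancelledException, build() and the duringPlanning hook are hypothetical names for illustration; the only real API used is java.util.concurrent.atomic.AtomicBoolean.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical names throughout; this is not Jena's API or its eventual fix.
class QueryCancelledException extends RuntimeException {
}

class DeferredPlan {
    // Exists before any planning work, so there is always something to abort.
    private final AtomicBoolean cancelSignal = new AtomicBoolean(false);
    private int rowsSkipped;

    // Safe to call at any time, including while build() is still running.
    void abort() {
        cancelSignal.set(true);
    }

    // Planning work polls the signal rather than relying on an iterator existing.
    void build(int offset, Runnable duringPlanning) {
        for (int i = 0; i < offset; i++) {
            if (cancelSignal.get()) {
                throw new QueryCancelledException();
            }
            if (i == offset / 2) {
                duringPlanning.run(); // simulated timeout alarm
            }
            rowsSkipped++;
        }
    }

    int rowsSkipped() {
        return rowsSkipped;
    }

    public static void main(String[] args) {
        DeferredPlan plan = new DeferredPlan();
        try {
            plan.build(1000, plan::abort);
            System.out.println("completed without cancellation");
        } catch (QueryCancelledException e) {
            System.out.println("cancelled after " + plan.rowsSkipped() + " rows");
        }
    }
}
```

      The trade-off is that every long-running step in planning has to check the signal, which is part of why a change like this can reach far into the code.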

      Attachments

        1. TestQueryExecutionTimeout3.java
          3 kB
          Dave Reynolds

            People

              Assignee: Andy Seaborne
              Reporter: Dave Reynolds