Jackrabbit Content Repository
  1. Jackrabbit Content Repository
  2. JCR-3147

AbstractIndex should use the repository executor for indexing tasks

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: jackrabbit-core
    • Labels:
      None

      Description

      The AbstractIndex still uses its own executor for indexing tasks, it should switch to the global repository executor.

      1. JCR-3147.patch
        10 kB
        Alex Parvulescu

        Issue Links

          Activity

          Jukka Zitting made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hide
          Marcel Reutegger added a comment -

          It would also be good to understand what the behaviour is when you re-index a workspace that contains a lot of files that trigger text extractions.

          Show
          Marcel Reutegger added a comment - It would also be good to understand what the behaviour is when you re-index a workspace that contains a lot of files that trigger text extractions.
          Hide
          Marcel Reutegger added a comment -

          > Marcel, do you think the current tests cover this indexing issue

          Probably not. To me it is not clear how the system will behave when the pool is busy with index merges and then additional tasks like these in the AbstractIndex are added. I guess the save would be blocked for a while until a thread is available again. This is a slightly different behaviour than before.

          Show
          Marcel Reutegger added a comment - > Marcel, do you think the current tests cover this indexing issue Probably not. To me it is not clear how the system will behave when the pool is busy with index merges and then additional tasks like these in the AbstractIndex are added. I guess the save would be blocked for a while until a thread is available again. This is a slightly different behaviour than before.
          Hide
          Alex Parvulescu added a comment -

          I have updated the issue's description and removed the references to low priority tasks.

          Tasks submitted by the AbstractIndex will have normal priority, which means they will be executed asap.
          Plus the global executor now benefits from a few dedicated threads that can pickup the work load immediately (the low priority tasks occupy up to 75% of the threads OOTB).

          Marcel, do you think the current tests cover this indexing issue, or would you like to investigate some more before applying the patch?

          Show
          Alex Parvulescu added a comment - I have updated the issue's description and removed the references to low priority tasks. Tasks submitted by the AbstractIndex will have normal priority, which means they will be executed asap. Plus the global executor now benefits from a few dedicated threads that can pickup the work load immediately (the low priority tasks occupy up to 75% of the threads OOTB). Marcel, do you think the current tests cover this indexing issue, or would you like to investigate some more before applying the patch?
          Alex Parvulescu made changes -
          Description The AbstractIndex still uses its own executor for indexing tasks, it should switch to the global repository executor.

          It should also mark the tasks as 'low priotiry' as soon as JCR-3146 gets into the trunk.
          The AbstractIndex still uses its own executor for indexing tasks, it should switch to the global repository executor.
          Hide
          Marcel Reutegger added a comment -

          > should we reconsider the patch now?

          The indexing tasks in this issue should not be low priority. I'm OK with a solution that uses the repository thread pool, but uses the current thread if the pool is busy.

          Show
          Marcel Reutegger added a comment - > should we reconsider the patch now? The indexing tasks in this issue should not be low priority. I'm OK with a solution that uses the repository thread pool, but uses the current thread if the pool is busy.
          Hide
          Marcel Reutegger added a comment -

          > can (and should) also be used for immediate execution of parallel tasks

          I agree that it should but currently it can't, because all tasks are treated equally.

          I think an important feature that is missing currently, is the ability to specify a 'queue policy'. A client should be able to specify if it is OK to queue the task and execute it later when all the threads in the pool are busy. This would be similar to the RejectedExecutionHandler but would already kick in when the the pool is busy.

          Show
          Marcel Reutegger added a comment - > can (and should) also be used for immediate execution of parallel tasks I agree that it should but currently it can't, because all tasks are treated equally. I think an important feature that is missing currently, is the ability to specify a 'queue policy'. A client should be able to specify if it is OK to queue the task and execute it later when all the threads in the pool are busy. This would be similar to the RejectedExecutionHandler but would already kick in when the the pool is busy.
          Hide
          Alex Parvulescu added a comment -

          thanks for joining Jukka

          I'm with Jukka on this one. Having a global executor will simplify things and I think the work on JCR-3146 should address the priority tasks concern.

          should we reconsider the patch now?

          Show
          Alex Parvulescu added a comment - thanks for joining Jukka I'm with Jukka on this one. Having a global executor will simplify things and I think the work on JCR-3146 should address the priority tasks concern. should we reconsider the patch now?
          Hide
          Jukka Zitting added a comment -

          > parallelize a number of tasks given the number of available cores

          That's also what the repository thread pool does. In addition it supports scheduling of tasks for execution in the future it can (and should) also be used for immediate execution of parallel tasks.

          Having just a single thread pool for the repository simplifies things (even after the added concern of keeping some threads available for high-priority tasks), which is why IMHO we should do this.

          Show
          Jukka Zitting added a comment - > parallelize a number of tasks given the number of available cores That's also what the repository thread pool does. In addition it supports scheduling of tasks for execution in the future it can (and should) also be used for immediate execution of parallel tasks. Having just a single thread pool for the repository simplifies things (even after the added concern of keeping some threads available for high-priority tasks), which is why IMHO we should do this.
          Alex Parvulescu made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Hide
          Alex Parvulescu added a comment -

          cool, now I understand. I'll close this issue.

          Show
          Alex Parvulescu added a comment - cool, now I understand. I'll close this issue.
          Hide
          Marcel Reutegger added a comment -

          The usage of DynamicPooledExecutor in AbstractIndex is a bit special because it is not meant to execute tasks at some point in the future, but just parallelize a number of tasks given the number of available cores. It is important that the tasks are scheduled and executed rather immediately. Otherwise the current transaction will be delayed unnecessarily.

          Show
          Marcel Reutegger added a comment - The usage of DynamicPooledExecutor in AbstractIndex is a bit special because it is not meant to execute tasks at some point in the future, but just parallelize a number of tasks given the number of available cores. It is important that the tasks are scheduled and executed rather immediately. Otherwise the current transaction will be delayed unnecessarily.
          Alex Parvulescu made changes -
          Attachment JCR-3147.patch [ 12503623 ]
          Hide
          Alex Parvulescu added a comment -

          attached proposed patch

          Show
          Alex Parvulescu added a comment - attached proposed patch
          Hide
          Alex Parvulescu added a comment - - edited

          This is only a soft link, the patch is pretty much standalone.

          Show
          Alex Parvulescu added a comment - - edited This is only a soft link, the patch is pretty much standalone.
          Alex Parvulescu made changes -
          Field Original Value New Value
          Link This issue requires JCR-3146 [ JCR-3146 ]
          Alex Parvulescu created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Alex Parvulescu
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development