  1. Jackrabbit Oak
  2. OAK-10590

Indexing job downloads and creates FFS with full node store if includedPaths is specified as a string instead of array of strings

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 1.62.0
    • Component/s: indexing
    • Labels: None

    Description

      The includedPaths property of an index definition should be an array of strings.

      If it is instead specified as a String, like in this example:

              "includedPaths": "/a/b", 

      the indexing job falls back to / as the value of includedPaths. It therefore downloads the full node store and creates an FFS (flat file store) containing everything except the hidden paths. The logic that handles this case is here:

      https://github.com/apache/jackrabbit-oak/blob/0b8f4ab2e736c6561ae745a5fe6040a59911eeb3/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/spi/filter/PathFilter.java#L95-L103
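
      The fallback described above can be modeled with a small, self-contained sketch. This is not the code from PathFilter; the class name LegacyPathsSketch, the method getStrings and the use of a plain Object for the property value are assumptions made for illustration. It only shows how a reader that accepts nothing but a multi-valued property ends up returning the root path when given a single String.

```java
import java.util.List;

// Hypothetical, simplified model of the behavior described in this issue.
// It does not use the Oak API; the property value is modeled as a plain Object.
public class LegacyPathsSketch {

    // Default used when the property cannot be read as a list of strings.
    static final List<String> INCLUDE_ROOT = List.of("/");

    // Accepts only a list; any other shape (including a single String)
    // silently falls through to the default values.
    @SuppressWarnings("unchecked")
    static List<String> getStrings(Object value, List<String> defaults) {
        if (value instanceof List) {
            return (List<String>) value;
        }
        return defaults; // a single String such as "/a/b" lands here
    }

    public static void main(String[] args) {
        // Array form: the configured path is honored.
        System.out.println(getStrings(List.of("/a/b"), INCLUDE_ROOT)); // prints [/a/b]
        // String form: the configured path is ignored and "/" is used.
        System.out.println(getStrings("/a/b", INCLUDE_ROOT));          // prints [/]
    }
}
```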

      This significantly slows down indexing, because it negates any benefit of regex filtering. Even if regex filtering is not enabled or cannot be used, using / as includedPaths results in the FFS containing more nodes than it should, which again slows down the indexing job.

      Suggested fix: if includedPaths is a String, treat it as a one-element array and log a warning.

      Additionally, apply the same fix to all path properties in the index definition:

      • excludedPaths
      • includedPaths
      • queryPaths
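
      A minimal sketch of the suggested fix, under the same simplifying assumptions: the property value is modeled as a plain Object rather than an Oak PropertyState, and the names LenientPathsSketch and readPaths are hypothetical. The actual change in Oak may be structured differently.

```java
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of the suggested fix; not the Oak implementation.
public class LenientPathsSketch {

    private static final Logger LOG =
            Logger.getLogger(LenientPathsSketch.class.getName());

    // Reads a path property (for example includedPaths, excludedPaths or
    // queryPaths). A single String is wrapped into a one-element list and
    // a warning is logged, instead of silently using the defaults.
    @SuppressWarnings("unchecked")
    static List<String> readPaths(String propertyName, Object value, List<String> defaults) {
        if (value instanceof List) {
            return (List<String>) value;
        }
        if (value instanceof String) {
            LOG.warning("Property '" + propertyName + "' is a String but should be"
                    + " an array of strings; treating it as a one-element array: " + value);
            return Collections.singletonList((String) value);
        }
        return defaults; // property missing or of an unsupported type
    }

    public static void main(String[] args) {
        List<String> root = List.of("/");
        // String form is now honored (and a warning is logged).
        System.out.println(readPaths("includedPaths", "/a/b", root)); // prints [/a/b]
        // A missing property still falls back to the defaults.
        System.out.println(readPaths("includedPaths", null, root));   // prints [/]
    }
}
```

      Passing the property name into the helper keeps the warning specific, so the same method can serve all three properties listed above.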

            People

              Assignee: Unassigned
              Reporter: Nuno Santos (nuno.santos)