Jackrabbit Oak / OAK-8523

Best Practices - Property Value Length Limit



    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core, jcr
    • Labels:


      Oak currently supports very large property values (e.g. of type String). However, property values of 1 MB or more are problematic in several areas, such as indexing. Limiting them matters most for software-as-a-service, where we need to guarantee SLOs, but it also helps in other cases. So we should:

      • (1) Document best practices, e.g. "Property values should be smaller than 100 KB".
      • (2) Introduce "softLimit" and "hardLimit", where softLimit is e.g. 100 KB and hardLimit is configurable and (initially) defaults to Integer.MAX_VALUE. Setting the hard limit to a lower value by default is problematic, because it can break existing applications. With an effectively unlimited default, customers can set lower limits in tests first and, once they are happy, in production as well.
      • (3) Log a warning if a property is larger than "softLimit". To avoid logging many warnings (if there are many such properties), we then set softLimit = softLimit * 1.1 (reset to 100 KB on the next repository start). Logging is needed to find out what exactly is broken (path, stack trace of the actual usage, ...).
      • (4) Add a metric (monitoring) for detected large properties. Just logging warnings might not be enough.
      • (5) Throttling: we could add flow control (pauses; Thread.sleep) after violations, to improve isolation (to prevent affecting other threads that don't violate the contract).
      • (6) We could expose the violation info in the session, so a framework could check that data after executing custom code, and add more info (e.g. log).
      • (7) If a property is larger than the configurable hardLimit, fail the commit or reject setProperty (throw an exception).
      • (8) At some point, in a new Oak version, change the default value for hardLimit to some reasonable number, e.g. 1 MB.
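
      Items (2), (3), (4) and (7) could be sketched roughly as follows. This is only an illustration, not existing Oak code: the class PropertySizeLimiter, its method names, the use of java.util.logging and the choice of IllegalArgumentException are all assumptions made for the example. It also assumes that the metric counts every value above the initial soft limit, even when the raised soft limit suppresses the warning.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of soft/hard property size limits; not part of Oak. */
final class PropertySizeLimiter {

    private static final Logger LOG =
            Logger.getLogger(PropertySizeLimiter.class.getName());

    private final long initialSoftLimit;
    private final long hardLimit;
    private long softLimit;
    // Item (4): simple counter that a monitoring system could poll.
    private final AtomicLong violations = new AtomicLong();

    PropertySizeLimiter(long softLimit, long hardLimit) {
        this.initialSoftLimit = softLimit;
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    /** Default proposed in item (2): soft limit 100 KB, hard limit effectively unlimited. */
    static PropertySizeLimiter withDefaults() {
        return new PropertySizeLimiter(100 * 1024, Integer.MAX_VALUE);
    }

    /**
     * Checks one property value.
     *
     * @return true if a warning was logged for this value
     * @throws IllegalArgumentException if the value exceeds the hard limit (item 7)
     */
    synchronized boolean check(String path, long sizeInBytes) {
        if (sizeInBytes > hardLimit) {
            throw new IllegalArgumentException("Property " + path + " has "
                    + sizeInBytes + " bytes, hard limit is " + hardLimit);
        }
        if (sizeInBytes > initialSoftLimit) {
            violations.incrementAndGet();
        }
        if (sizeInBytes <= softLimit) {
            return false;
        }
        // Item (3): log the path and a stack trace of the caller.
        LOG.log(Level.WARNING, "Large property " + path + ": " + sizeInBytes
                + " bytes, soft limit is " + softLimit,
                new Exception("call stack"));
        // Raise the soft limit by about 10% (integer arithmetic) to limit log volume.
        softLimit += Math.max(1, softLimit / 10);
        return true;
    }

    /** Simulates a repository restart: the soft limit returns to its initial value. */
    synchronized void reset() {
        softLimit = initialSoftLimit;
    }

    synchronized long getSoftLimit() {
        return softLimit;
    }

    long getViolationCount() {
        return violations.get();
    }
}
```

      With softLimit = 100 and hardLimit = 1000, a value of 150 bytes would log a warning and raise the soft limit to 110. A later value of 105 bytes would then be counted but not logged, and a value of 1001 bytes would be rejected.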

      The "property length" is just one case. There are other candidates for similar limits:

      • Number of properties for a node
      • Number of elements for multi-valued properties
      • Total size of a node (including inlined properties)
      • Number of direct child nodes for orderable child nodes
      • Number of direct child nodes for non-orderable child nodes
      • Size of transaction
      • Adding observation listeners that listen for all changes (global listeners)

      For those cases, new Jira issues should be created.




            • Assignee:
              thomasm Thomas Mueller


              • Created: