  1. Solr
  2. SOLR-5200

Add REST support for reading and modifying Solr configuration

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1
    • Component/s: None
    • Labels: None

    Description

      There should be a REST API to allow full read access to, and write access to some elements of, Solr's per-core and per-node configuration not already covered by the Schema REST API: solrconfig.xml/core.properties/solrcore.properties and solr.xml/solr.properties (SOLR-4718 discusses addition of solr.properties).

      Use cases for runtime configuration modification include scripted setup, troubleshooting, and tuning.

      Tentative rules-of-thumb about configuration items that should not be modifiable at runtime:

      1. Startup-only items, e.g. where to start core discovery
      2. Items that are deprecated in 4.X and will be removed in 5.0
      3. Items that if modified should be followed by a full re-index

      Some issues to consider:

      Persistence: How (and even whether) to handle persistence for configuration modifications via REST API is not clear - e.g. persisting the entire config file or having one or more sidecar config files that get persisted. The extent of what should be modifiable will likely affect how persistence is implemented. For example, if the only solrconfig.xml modifiable items turn out to be plugin configurations, an alternative to full-solrconfig.xml persistence could be individual plugin registration of runtime config modifiable items, along with per-plugin sidecar config persistence.
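
To make the sidecar option concrete: the original config file stays untouched, and only runtime modifications are persisted in a separate file that is merged over the base values on read. This is a minimal sketch under stated assumptions, not a proposed implementation; the JSON format, the file name config-overrides.json, and all function names are hypothetical.

```python
import json
import os
import tempfile

def load_overrides(path):
    """Return the persisted overrides, or an empty dict if none exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def save_override(path, key, value):
    """Persist a single runtime modification without touching the base config."""
    overrides = load_overrides(path)
    overrides[key] = value
    # Write to a temp file and rename, so a crash cannot leave a half-written sidecar.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(overrides, f, indent=2, sort_keys=True)
    os.replace(tmp, path)

def effective_config(base, path):
    """Base values (as parsed from the original file) with sidecar overrides on top."""
    merged = dict(base)
    merged.update(load_overrides(path))
    return merged

# Illustrative values only
base = {"query/maxBooleanClauses": 1024, "query/enableLazyFieldLoading": True}
sidecar = os.path.join(tempfile.mkdtemp(), "config-overrides.json")  # hypothetical name
save_override(sidecar, "query/maxBooleanClauses", 2048)
config = effective_config(base, sidecar)
```

Because the base file is never rewritten, comments and includes in it are left alone; the cost is that readers of the file on disk no longer see the effective values.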

      "Live" reload: Most (if not all) per-core configuration modifications will require core reload, though it will be a "live" reload, so some things won't be modifiable, e.g. <dataDir> and IndexWriter related settings in <indexConfig> - see SOLR-3592. (Should a full reload be supported to handle changes in these places?)

      Interpolation aka property substitution: I think it would be useful on read access to optionally return raw values in addition to the interpolated values, e.g. solr.xml hostPort raw value ${jetty.port:8983} vs. interpolated value 8983. Modification requests will accept raw values - property interpolation will be applied. At present interpolation is done once, at parsing time, but if property value modification is supported via the REST API, an alternative could be to delay interpolation until values are requested; in this way, property value modification would not trigger re-parsing the affected configuration source.
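
The raw-versus-interpolated distinction and the delayed-interpolation alternative can be sketched as follows. This is an illustration only: the regular expression handles just the simple ${name} and ${name:default} forms, and the class and function names are hypothetical rather than Solr APIs.

```python
import re

# Matches ${name} or ${name:default}
_PROP = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")

def interpolate(raw, props):
    """Substitute property references in raw using props, falling back to defaults."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return str(props[name])
        if default is not None:
            return default
        raise KeyError(f"no value or default for property {name!r}")
    return _PROP.sub(replace, raw)

class LazyValue:
    """Stores the raw value; interpolation is deferred until the value is read."""
    def __init__(self, raw):
        self.raw = raw

    def resolve(self, props):
        return interpolate(self.raw, props)

host_port = LazyValue("${jetty.port:8983}")
props = {}

# A read request can report both forms
default_view = {"raw": host_port.raw, "value": host_port.resolve(props)}

# A property modification takes effect on the next read, with no re-parse
props["jetty.port"] = 7574
updated_value = host_port.resolve(props)
```

Here default_view holds the raw string alongside the interpolated "8983", and updated_value becomes "7574" once the property is set.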

      Response format: Similarly to the schema REST API, results could be returned in XML, JSON, or any other response writer's output format.
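
As a rough illustration of format-independent results, the same result structure can be handed to different writers selected by a wt request parameter. The XML layout below is a simplified stand-in rather than Solr's actual XML response writer output, and the function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

def to_xml(name, value):
    """Render a nested dict as named elements (illustrative layout only)."""
    if isinstance(value, dict):
        elem = ET.Element("lst", {"name": name})
        for key, child in value.items():
            elem.append(to_xml(key, child))
        return elem
    elem = ET.Element("str", {"name": name})
    elem.text = str(value)
    return elem

def render(result, wt="json"):
    """Serialize one result structure in the requested response format."""
    if wt == "json":
        return json.dumps(result, sort_keys=True)
    if wt == "xml":
        return ET.tostring(to_xml("config", result), encoding="unicode")
    raise ValueError(f"unsupported response format: {wt}")

result = {"query": {"maxBooleanClauses": 1024}}
as_json = render(result, "json")
as_xml = render(result, "xml")
```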

      Transient cores: How should non-loaded transient cores be handled? Simplest thing would be to load the transient core before handling the request, just like other requests.
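
The "load before handling" approach, together with the immediate unloading that a reduced transientCacheSize would trigger, can be sketched with a small LRU cache. This is a toy model under stated assumptions: cores are plain dicts, and the class, loader, and handler names are hypothetical.

```python
from collections import OrderedDict

class TransientCoreCache:
    """LRU cache of loaded transient cores, bounded by max_size."""

    def __init__(self, max_size, loader):
        self.max_size = max_size
        self.loader = loader          # callable: core name -> loaded core
        self._cores = OrderedDict()   # least recently used first
        self.unloaded = []            # names evicted so far, for illustration

    def get(self, name):
        core = self._cores.get(name)
        if core is None:
            core = self.loader(name)  # load on demand, like any other request
            self._cores[name] = core
            self._evict()
        else:
            self._cores.move_to_end(name)
        return core

    def resize(self, max_size):
        """Shrinking the cache unloads surplus cores immediately."""
        self.max_size = max_size
        self._evict()

    def _evict(self):
        while len(self._cores) > self.max_size:
            name, _core = self._cores.popitem(last=False)
            self.unloaded.append(name)

    def loaded(self):
        return list(self._cores)

def handle_config_request(cache, core_name, xpath):
    core = cache.get(core_name)       # loads the core if it is not loaded
    return core["config"].get(xpath)

def fake_loader(name):
    return {"name": name, "config": {"query/maxBooleanClauses": 1024}}

cache = TransientCoreCache(2, fake_loader)
cache.get("a")
cache.get("b")
value = handle_config_request(cache, "c", "query/maxBooleanClauses")  # evicts "a"
cache.resize(1)                                                      # evicts "b"
```

One consequence visible in the sketch: a configuration request for a non-loaded core can push another core out of the cache.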

      Below I provide an exhaustive list of configuration items in the files in question and indicate which ones I think could be modifiable at runtime. I don't mean to imply that these must all be made modifiable, or for those that are made modifiable, that they must be made so at once - a piecemeal approach will very likely be more appropriate.

      solrconfig.xml

      Note that XIncludes and includes via Document Entities won't survive a modification request (assuming persistence is via overwriting the original file).
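
The effect can be reproduced in miniature: once includes have been resolved at parse time, serializing the in-memory document writes the included content inline and the include directive is gone. The sketch below uses Python's ElementTree as a stand-in for Solr's parser; the file name caches.xml and its content are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

ORIGINAL = """<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="caches.xml"/>
</config>"""

# Hypothetical included file, served from memory instead of the filesystem
INCLUDED = {
    "caches.xml": "<query><maxBooleanClauses>1024</maxBooleanClauses></query>",
}

def loader(href, parse, encoding=None):
    return ET.fromstring(INCLUDED[href])

root = ET.fromstring(ORIGINAL)
ElementInclude.include(root, loader=loader)         # resolve includes in place

root.find("query/maxBooleanClauses").text = "2048"  # the modification request
persisted = ET.tostring(root, encoding="unicode")   # what would overwrite the file
```

The persisted text contains the inlined query element with the new value, but no reference to caches.xml.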

      XPath under /config/ | Should be modifiable via REST API? | Rationale | Description
      luceneMatchVersion | No | Modifying this should be followed by a full re-index | Controls what version of Lucene various components of Solr adhere to
      lib | Yes | Required for adding plugins at runtime | Contained jars available via classloader for solrconfig.xml and schema.xml
      dataDir | No | Not supported by "live" RELOAD | Holds all index data
      directoryFactory | No | Not supported by "live" RELOAD | Index directory factory
      codecFactory | No | Modifying this should be followed by a full re-index | Index codec factory, per-field SchemaCodecFactory by default
      schemaFactory | Partial | Although the class shouldn't be modifiable, it should be possible to modify an already Managed schema's mutability | Managed or Classic (non-mutable) schema factory
      indexConfig | No | IndexWriter-related settings not supported by "live" RELOAD | Low-level indexing behavior
      jmx | Yes | | Enables JMX if an MBeanServer is found
      updateHandler@class | No | | Defaults to DirectUpdateHandler2
      updateHandler/updateLog | No | | Enables a transaction log, configures its directory and synchronization
      updateHandler/autoCommit | Yes | | Durability: enables hard autocommit, configures max interval and whether to open a searcher afterward
      updateHandler/autoSoftCommit | Yes | | Visibility: enables soft autocommit, configures max interval
      updateHandler/commitWithin/softCommit | Yes | | Whether commitWithin update request param should trigger a soft commit instead of hard commit
      updateHandler/listener | Yes | | Update-related event listeners, e.g. snapshooter
      indexReaderFactory | No | | Specify custom index reader factory (default StandardIndexReaderFactory)
      query/maxBooleanClauses | Yes | | Maximum boolean clauses allowed in a query
      query/filterCache | Yes | | Enables the filter cache - unordered docsets, configures class, initial size, max size, and entries to pull from an old cache
      query/queryResultCache | Yes | | Enables the query result cache - ordered docid lists, configures class, initial size, max size, and entries to pull from an old cache
      query/documentCache | Yes | | Enables the document cache - document stored fields, configures class, initial size, and max size
      query/fieldValueCache | Yes | | Enables the field value cache - field values by docid, created by default, configures class, size, # entries to report stats for (showItems)
      query/cache | Yes | | Enables a custom cache, configures name, class, initial size, max size, regenerator class, and entries to pull from an old cache
      query/enableLazyFieldLoading | Yes | | Whether to enable lazy field loading
      query/useFilterForSortedQuery | Yes | | Whether to use a filter for a sorted non-scoring search
      query/queryResultWindowSize | Yes | | Cached result window size
      query/queryResultMaxDocsCached | Yes | | Maximum number of documents to cache for any entry in the queryResultCache
      query/listener | Yes | | Query-related event listener, configures event type, class, and queries, e.g. newSearcher and firstSearcher events with solr.QuerySenderListener
      query/useColdSearcher | Yes | | Whether to interrupt searcher warming to service a query request if there are no registered searchers
      query/maxWarmingSearchers | Yes | | Max searchers to warm
      requestDispatcher | Yes | | Configures SolrDispatchFilter behavior, including requestParsers and httpCaching
      requestHandler | Yes | | Configures request handlers, including SearchHandler, RealTimeGetHandler, UpdateRequestHandler, ReplicationHandler, etc., and their URL path mapping (name)
      searchComponent | Yes | | Configures search components available to SearchHandlers
      updateRequestProcessorChain | Yes | | Configures named update request processor chains usable by UpdateRequestHandler
      queryResponseWriter | Yes | | Configures named response writers
      queryParser | Yes | | Configures query parser plugins
      valueSourceParser | Yes | | Configures named function parsers, usable by the "func" QParser
      transformer | Yes | | Configures named document transformers, which transform documents returned to the user, e.g. adding fields - defaults are explain, value, shard, docid
      admin/defaultQuery | No | | Legacy config for the admin UI
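
A modification endpoint would need to consult a policy like the table above before applying a change. The sketch below encodes a few of the rows and resolves a requested path to the nearest listed ancestor. Rejecting unlisted paths by default is an assumption of this example, as are the names MODIFIABLE and check_modifiable.

```python
# A subset of the solrconfig.xml table: XPath under /config/ -> modifiable?
MODIFIABLE = {
    "luceneMatchVersion": "No",
    "lib": "Yes",
    "dataDir": "No",
    "schemaFactory": "Partial",
    "updateHandler/autoCommit": "Yes",
    "query/maxBooleanClauses": "Yes",
    "requestHandler": "Yes",
    "admin/defaultQuery": "No",
}

def check_modifiable(xpath):
    """Return the table entry governing xpath, matching the longest listed prefix."""
    parts = xpath.split("/")
    for end in range(len(parts), 0, -1):
        prefix = "/".join(parts[:end])
        if prefix in MODIFIABLE:
            return MODIFIABLE[prefix]
    return "No"  # assumption: anything not listed is rejected

nested = check_modifiable("updateHandler/autoCommit/maxTime")
blocked = check_modifiable("dataDir")
unlisted = check_modifiable("somethingElse")
```

With these rows, a request against updateHandler/autoCommit/maxTime inherits "Yes" from its parent entry, while dataDir and unlisted paths are refused.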

      core.properties

      core.properties marks a core directory. Each core will parse its solrconfig.xml using these properties.

      I don't think any of the Solr-internal properties in this file should be modifiable at runtime: "name", "config", "instanceDir", "absoluteInstDir", "dataDir", "ulogDir", "schema", "shard", "collection", "roles", "properties", "loadOnStartup", "transient", "coreNodeName". But it would be useful to allow for addition/modification of user-defined properties here.

      Read/write access will be provided, both for individual properties and in bulk. solrconfig.xml will need to be re-parsed using new property values; alternatively, interpolation could be delayed until values are accessed. Problem: changing properties that aren't valid in a "live" RELOAD - see SOLR-3592.
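
A sketch of what individual and bulk property access with protected Solr-internal names might look like. The class and method names are hypothetical, and the all-or-nothing behavior of bulk updates is a design choice of this example, not something the proposal specifies.

```python
# Solr-internal core.properties names that should stay read-only at runtime
RESERVED = frozenset({
    "name", "config", "instanceDir", "absoluteInstDir", "dataDir", "ulogDir",
    "schema", "shard", "collection", "roles", "properties", "loadOnStartup",
    "transient", "coreNodeName",
})

class CoreProperties:
    def __init__(self, initial):
        self._props = dict(initial)

    def get(self, name):
        return self._props[name]

    def get_all(self):
        return dict(self._props)

    def set(self, name, value):
        self.set_many({name: value})

    def set_many(self, updates):
        """Apply all updates or none: validate every name before changing anything."""
        rejected = sorted(RESERVED.intersection(updates))
        if rejected:
            raise PermissionError(
                "not modifiable at runtime: " + ", ".join(rejected))
        self._props.update(updates)

props = CoreProperties({"name": "core1", "my.cache.size": "512"})
props.set("my.cache.size", "1024")          # user-defined property: allowed

try:
    props.set_many({"my.flag": "true", "dataDir": "/tmp/x"})
    bulk_rejected = False
except PermissionError:
    bulk_rejected = True                    # whole batch refused, nothing applied
```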

      solrcore.properties

      solrcore.properties is a per-config-set properties map used to interpolate property values when parsing solrconfig.xml.

      Read/write access will be provided, both for individual properties and in bulk. solrconfig.xml will need to be re-parsed using new property values; alternatively, interpolation could be delayed until values are accessed. Problem: changing properties that aren't valid in a "live" RELOAD - see SOLR-3592.

      solr.xml

      solr.xml is used to configure multi-core and SolrCloud features.

      Most of the configuration items in this file are related to startup-only operations, and so shouldn't be changed at runtime.

      XPath under /solr/ (4.X old-style) | XPath under /solr/ (5.0 and 4.4+ core discovery style) | Should be modifiable via REST API? | Description/rationale
      @persistent | N/A | No | Deprecated in 4.X old-style, removed in 5.0 and 4.4+ core discovery style
      cores/@defaultCoreName | N/A | No | Deprecated in 4.X old-style, removed in 5.0 and 4.4+ core discovery style
      cores/@adminPath | N/A | No | Removed in 5.0, where it's always /admin/cores
      N/A | str[@name='coreRootDirectory'] | No | The root of the core discovery tree, defaults to the solrhome
      @coreLoadThreads | int[@name='coreLoadThreads'] | Yes | Core loading fixed thread pool size
      @sharedLib | str[@name='sharedLib'] | No | Lib directory used by all cores on the same node
      cores/@adminHandler | str[@name='adminHandler'] | No | Admin handler class, CoreAdminHandler by default
      cores/@managementPath | str[@name='managementPath'] | No | Request URL path prefix that gets stripped by SolrDispatchFilter
      cores/@shareSchema | str[@name='shareSchema'] | No | Whether to cache and share schema object among cores on the same node
      cores/@transientCacheSize | int[@name='transientCacheSize'] | Yes | Max active transient cores; reducing this would trigger immediate unloading
      cores/shardHandlerFactory | shardHandlerFactory | No | Shard handler factory class and configuration
      logging/@class | logging/str[@name='class'] | No | Logging class
      logging/@enabled | logging/str[@name='enabled'] | Yes | Whether to enable logging
      logging/watcher/@size | logging/watcher/int[@name='size'] | Yes | Max log history entries
      logging/watcher/@threshold | logging/watcher/int[@name='threshold'] | No | Root logger level; per-logger level setting already available through LoggingHandler via the /admin/logging endpoint
      @zkHost | solrcloud/str[@name='zkHost'] | No | SolrCloud: ZooKeeper host holding cluster state
      cores/@distribUpdateConnTimeout | solrcloud/int[@name='distribUpdateConnTimeout'] | No | SolrCloud: initial distributed update connection timeout
      cores/@distribUpdateSoTimeout | solrcloud/int[@name='distribUpdateSoTimeout'] | No | SolrCloud: distributed update socket read timeout
      cores/@host | solrcloud/str[@name='host'] | No | SolrCloud: Local Solr host name
      cores/@hostContext | solrcloud/str[@name='hostContext'] | No | SolrCloud: Local Solr servlet context path
      cores/@hostPort | solrcloud/int[@name='hostPort'] | No | SolrCloud: Local Solr host port
      cores/@leaderVoteWait | solrcloud/int[@name='leaderVoteWait'] | No | SolrCloud: Leader vote wait time (ms)
      cores/@genericCoreNodeNames | solrcloud/bool[@name='genericCoreNodeNames'] | No | SolrCloud: If true, don't base core node names on the node address
      cores/@zkClientTimeout | solrcloud/int[@name='zkClientTimeout'] | No | SolrCloud: ZooKeeper connection timeout
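
Since the same setting lives at different locations in the two solr.xml styles, read access has to check both. The sketch below looks up three settings from the table using Python's ElementTree. The sample documents are trimmed-down illustrations, and LOCATIONS and read_raw are hypothetical names.

```python
import xml.etree.ElementTree as ET

OLD_STYLE = """<solr coreLoadThreads="4">
  <cores transientCacheSize="8" hostPort="${jetty.port:8983}"/>
</solr>"""

NEW_STYLE = """<solr>
  <int name="coreLoadThreads">4</int>
  <int name="transientCacheSize">8</int>
  <solrcloud>
    <int name="hostPort">${jetty.port:8983}</int>
  </solrcloud>
</solr>"""

# setting -> (old-style location, new-style location), both relative to /solr/
LOCATIONS = {
    "coreLoadThreads": ("@coreLoadThreads", "int[@name='coreLoadThreads']"),
    "transientCacheSize": ("cores/@transientCacheSize",
                           "int[@name='transientCacheSize']"),
    "hostPort": ("cores/@hostPort", "solrcloud/int[@name='hostPort']"),
}

def read_raw(root, setting):
    """Return the raw (uninterpolated) value of setting from either style."""
    old_path, new_path = LOCATIONS[setting]
    elem = root.find(new_path)
    if elem is not None:
        return elem.text
    # ElementTree cannot select attributes directly, so split "path/@attr" by hand
    elem_path, _, attr = old_path.rpartition("@")
    elem_path = elem_path.rstrip("/")
    owner = root.find(elem_path) if elem_path else root
    return owner.get(attr) if owner is not None else None

old_root = ET.fromstring(OLD_STYLE)
new_root = ET.fromstring(NEW_STYLE)
old_port = read_raw(old_root, "hostPort")
new_port = read_raw(new_root, "hostPort")
```

Both lookups return the raw string ${jetty.port:8983}; interpolation would be applied as a separate step.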

      solr.properties

      Contains local per-node (not in ZooKeeper) properties used to parse solr.xml.

      Read/write access will be provided, both for individual properties and in bulk. solr.xml will need to be re-parsed using new property values.

            People

              noble.paul Noble Paul
              sarowe Steven Rowe
              Votes: 4
              Watchers: 13