[SOLR-1787] Add ability to configure behavior of cache miss to CachedSqlEntityProcessor - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 1.5
Fix Version/s: None
Component/s: contrib - DataImportHandler
Labels: dih
Environment: JDK 1.6.x, Windows XP, Tomcat 6.x

Description

The CachedSqlEntityProcessor currently builds its cache incrementally from the rows it sees, so later requests for the same key can be served from data that has already been fetched. The primary query can instead be written to fetch all possible rows, which are then loaded into the cache on the first request for a row. In that case the database receives another query only when there is a cache miss. However, the query executed on a miss is the same one that pulls all rows, which negates any performance gain.

This patch adds the ability to configure the behavior on a cache miss through an "onCacheMiss" attribute on an "entity" tag in the data-config.xml file. The current behavior remains the default and corresponds to the setting onCacheMiss="fill". Any other value explicitly given for onCacheMiss causes cache misses to be ignored: no query is made to the database to fulfill them.
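As a rough sketch of how such an entity might be configured, assuming the attached patch is applied: the data source, table names, column names, and the value "ignore" are all hypothetical (the description only says that any value other than "fill" suppresses the extra query).

```xml
<dataConfig>
  <!-- Hypothetical JDBC connection; driver, URL, and user are placeholders. -->
  <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/tmp/example/ex" user="sa"/>
  <document>
    <!-- Hypothetical parent entity whose rows reference a category id. -->
    <entity name="item" query="select id, name, category_id from item">
      <!-- The query has no per-row filter, so the first lookup caches every category.
           onCacheMiss is the attribute proposed by this patch; "ignore" is an
           assumed value standing in for anything other than "fill". -->
      <entity name="category"
              processor="CachedSqlEntityProcessor"
              query="select id, name as category_name from category"
              where="id=item.category_id"
              onCacheMiss="ignore"/>
    </entity>
  </document>
</dataConfig>
```

With onCacheMiss="fill", or with the attribute omitted, a category id missing from the cache would trigger the query again, as it does today.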

I've encountered two cases where this capability is useful:

1. Relatively small datasets, such as category id -> category name mappings, which will not change during the course of indexing.
2. Queries that consume a lot of database resources each time they run, particularly when the query for an individual record is slow and can't easily be fixed on the database side.

Attachments

solr-1787.patch (2 kB), 22/Feb/10 23:09, Michael Henson

People

Assignee: Unassigned
Reporter: Michael Henson
Votes: 0
Watchers: 1

Dates

Created: 22/Feb/10 23:08
Updated: 04/Sep/12 14:33
Resolved: 04/Sep/12 14:33