Attached is a prototype of another attempt at implementing auto-update
of Derby index statistics. First I'll describe the patch briefly, then
I'll note some potential improvements and ideas.
I've omitted lots of details; feel free to ask questions and to comment
on the suggested improvements etc. They need a lot more work...
The code is nowhere near complete; its primary purpose is to spur
discussion and hopefully guide us in the right direction.
The prototype checks during statement compilation whether the index
statistics are stale, as Mamta did in her earlier work. If
the statistics are considered stale, a job to update all indexes
for the base table is scheduled with a "daemon". The daemon keeps track
of scheduled update jobs, and will execute them in a separate thread.
Only one job is processed at a time, and if there are too many
jobs, new jobs are discarded; such jobs will eventually be rescheduled
(by a later staleness check) once a slot frees up in the work queue.
If there are no statistics, creating them is scheduled (the daemon
doesn't distinguish between creating and updating stats). When a job is
scheduled for a base table, this is recorded in the associated index
descriptors (transient state) to avoid having to query the daemon too often.
As mentioned, the work is carried out in a separate thread, created as
required (there is no permanent background thread, it dies if the queue
is emptied). This seems appropriate as statistics update should be
rather infrequent compared to other operations in a database system.
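To make the intended scheduling behavior concrete, here is a rough sketch
in plain Java. All names are mine, not the patch's: jobs are deduplicated
per base table, the work queue is bounded (excess jobs are dropped), and
the worker thread is created on demand and dies when the queue empties.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the daemon described above; not Derby's code.
public class IndexStatsDaemon {
    private static final int MAX_QUEUE = 20;   // assumed queue limit
    private final BlockingQueue<UUID> queue = new ArrayBlockingQueue<>(MAX_QUEUE);
    private final Map<UUID, Boolean> scheduled = new ConcurrentHashMap<>();
    private Thread worker;                     // created on demand, not permanent

    /** Returns true if the job was accepted, false if duplicate or dropped. */
    public synchronized boolean schedule(UUID baseTable) {
        if (scheduled.putIfAbsent(baseTable, Boolean.TRUE) != null) {
            return false;                      // already queued for this table
        }
        if (!queue.offer(baseTable)) {         // queue full: discard the job
            scheduled.remove(baseTable);
            return false;
        }
        if (worker == null || !worker.isAlive()) {
            worker = new Thread(this::drain, "index-stats-daemon");
            worker.setDaemon(true);
            worker.start();
        }
        return true;
    }

    private void drain() {
        UUID table;
        while ((table = queue.poll()) != null) {  // thread dies when queue is empty
            try {
                updateStatsFor(table);            // one job at a time
            } finally {
                scheduled.remove(table);
            }
        }
    }

    protected void updateStatsFor(UUID baseTable) {
        // Placeholder: scan the table's indexes and recompute statistics.
    }
}
```

The transient "already scheduled" flag lives in the daemon here; in the
prototype it is recorded in the index descriptors instead.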
When new statistics are computed for the indexes of a table, they are
stored in the daemon. They require little memory (the table identifier
and, per index, the index identifier, two longs and one int).
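A holder for a completed result, mirroring the footprint described above,
might look like the following. The field names are my guesses at what the
two longs and the int represent; the patch may use them differently.

```java
import java.util.List;
import java.util.UUID;

// Minimal in-memory result: table id plus, per index, the index id,
// two longs and one int. Field meanings below are assumptions.
public class CompletedStats {
    public final UUID tableId;
    public final List<IndexStat> indexStats;

    public static final class IndexStat {
        public final UUID indexId;
        public final long rowCount;      // assumed meaning of the first long
        public final long cardinality;   // assumed meaning of the second long
        public final int  columnCount;   // assumed meaning of the int

        public IndexStat(UUID indexId, long rowCount,
                         long cardinality, int columnCount) {
            this.indexId = indexId;
            this.rowCount = rowCount;
            this.cardinality = cardinality;
            this.columnCount = columnCount;
        }
    }

    public CompletedStats(UUID tableId, List<IndexStat> indexStats) {
        this.tableId = tableId;
        this.indexStats = List.copyOf(indexStats);  // defensive copy
    }
}
```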
As a statement is compiled, the optimizer will consider the available
indexes. At this point the index statistics are checked, and if an
update has been scheduled, we check again later in the compilation
process whether it has completed. If we find new
statistics for the query being compiled, we also write any other
completed statistics to the data dictionary. Writing to the data
dictionary is currently done with a nested read-write transaction
inside the user transaction (during statement compilation), mainly to
avoid holding locks for an extended period of time.
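The hand-off could be sketched roughly as below: the daemon parks completed
statistics, and a compiling statement later drains all of them into the
data dictionary in one go. Class and method names are illustrative, not
Derby's, and the long[] payload stands in for the real statistics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the decoupled data dictionary update described above.
public class CompletedStatsBuffer {
    private final ConcurrentMap<UUID, long[]> completed = new ConcurrentHashMap<>();

    /** Called by the daemon thread when a job finishes. */
    public void publish(UUID tableId, long[] stats) {
        completed.put(tableId, stats);
    }

    /**
     * Called during statement compilation: if stats for the table being
     * compiled are ready, drain every completed entry and write them all,
     * piggy-backing on this compilation's nested transaction.
     */
    public List<UUID> drainIfReady(UUID compilingTable) {
        if (!completed.containsKey(compilingTable)) {
            return List.of();                    // nothing new for this query
        }
        List<UUID> written = new ArrayList<>(completed.keySet());
        for (UUID t : written) {
            long[] stats = completed.remove(t);
            if (stats != null) {
                writeToDataDictionary(t, stats); // nested r/w transaction here
            }
        }
        return written;
    }

    protected void writeToDataDictionary(UUID tableId, long[] stats) {
        // Placeholder for the actual data dictionary write.
    }
}
```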
To be clear: statement compilation/execution will not wait for new
statistics to be generated. In the case of large tables, it could take
hours to generate new stats.
Known weaknesses / open questions:
o code organization (I don't know the code well) - choices were made
based on what worked and on reducing overhead (e.g., checking indexes
when we have already obtained handles to them)
o the async/decoupled data dictionary update - done to avoid having to
create a LanguageConnectionContext (lcc).
o logic/thresholds for determining when stats are stale
o the row estimate logic also has weaknesses (for instance, when mixing
setting absolute values with updating the estimate based on deltas)
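For reference, the kind of staleness heuristic the prototype needs could
look something like this. The thresholds and the formula are my guesses,
not the patch's actual values, and as noted the check is only as good as
the row estimates it is fed.

```java
// Hypothetical staleness check based on row-count drift.
public class StalenessCheck {
    private static final long MIN_ROWS = 100;        // assumed lower limit
    private static final double CHANGE_RATIO = 1.5;  // assumed: 50% growth/shrink

    /**
     * @param estimatedRows   current row-count estimate for the base table
     * @param rowsAtLastStats row count when statistics were last computed
     */
    public static boolean isStale(long estimatedRows, long rowsAtLastStats) {
        if (estimatedRows < MIN_ROWS) {
            return false;                    // don't generate for tiny tables
        }
        if (rowsAtLastStats == 0) {
            return true;                     // no stats yet: create them
        }
        double ratio = (double) estimatedRows / rowsAtLastStats;
        return ratio > CHANGE_RATIO || ratio < 1.0 / CHANGE_RATIO;
    }
}
```

Note that a pure row-count heuristic like this cannot detect drift in the
number of unique values, which is exactly the gap discussed further down.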
Other notes/characteristics of the prototype:
o stats not generated/updated for system tables (caused locking problems)
o lower limit on the row estimate (don't generate for tables with few rows)
o I considered exposing the NO_WAIT option in the call that adds new
descriptors to the data dictionary. I don't know if this is needed; if
we update stats in a separate transaction from the daemon, we can
either use TransactionControl.setNoLockWait() or maybe even just wait.
o current staleness code is dependent on reasonable row estimates
o the "unit of work" is currently a base table - when scheduled, all
associated index statistics will be regenerated.
o I suspect that most tests in suites.All run with the DBO as the user,
and I haven't done anything specific to handle missing privileges.
Runs suites.All and derby.all with only four failures, all in
OrderByAndSortAvoidance. The tests fail on an assert about whether a
table scan is performed. It looks like the new stats make the
compiler/optimizer choose a different plan (not necessarily better in
terms of pages visited, but that's a DBA/optimizer issue).
Currently, two flags control the prototype's behavior.
If you grep for 'istat' in derby.log, you should get all the lines
relevant to automatic index statistics update.
Remaining work includes:
o update the data dictionary from the daemon thread
(we must then be able to create an appropriate lcc)
o drift in the number of unique values isn't handled.
Some potential remedies (raw ideas):
  Mechanism                        Distinct value drift   Row count change
  (a) compilation check                     N                    Y
  (b) timed check                           N                    Y
  (c) timed unconditional update            Y                    Y
  (d) UPDATE table SET ...                  Y                    N
(a) creates statistics when none exist and kicks off the update job
as soon as the stats are considered stale, i.e., when we believe we
should have had better ones. (b) helps
systems which are in a steady state (all statements compiled and
reused) - would typically check all user tables with indexes and
perform the staleness check from (a). (c) would help against
"anything" - but potentially with a large delay. Only useful for
applications where the database is up for very long periods of time
(days, weeks, months). Intervals for (b) and (c) would have to be
configurable. Mechanism (d) would help for updates changing a large
percentage of the rows, but would not catch many small updates
changing the selectivity of an index.
It may be possible to reuse BasicDaemon for the timed checks
(scheduling only, work would still be performed in a separate thread).
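Whatever ends up driving the schedule, the sweep for mechanism (b) itself
is simple. A sketch, using a java.util.concurrent timer in place of
BasicDaemon purely for illustration (all names are mine):

```java
import java.util.List;
import java.util.UUID;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical timed check: periodically apply the same staleness test
// as mechanism (a) to every user table with indexes.
public class TimedStatsCheck {
    /** One sweep; returns the number of update jobs handed to the daemon. */
    public static int sweep(List<UUID> tablesWithIndexes,
                            Predicate<UUID> isStale,
                            Consumer<UUID> scheduleUpdate) {
        int scheduled = 0;
        for (UUID table : tablesWithIndexes) {
            if (isStale.test(table)) {
                scheduleUpdate.accept(table);   // hand off; work runs elsewhere
                scheduled++;
            }
        }
        return scheduled;
    }

    /** Run the sweep on a configurable interval. */
    public static ScheduledExecutorService start(long intervalSeconds,
                                                 Supplier<List<UUID>> tables,
                                                 Predicate<UUID> isStale,
                                                 Consumer<UUID> schedule) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(
                () -> sweep(tables.get(), isStale, schedule),
                intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

The timer only schedules; the actual statistics generation would still be
performed by the daemon's worker thread, as described earlier.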
o do we need to throttle (i) the index scans, or (ii) the processing
rate of the scheduled jobs?
(I started playing with a crude utilization rate)
o almost as above, but we should take care to avoid "infinite loops"
o at which point is a change in either the number of rows or the field
values big enough to warrant a recalculation of the stats?
What's more costly; a sub-optimal plan or reading all the data?
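The "crude utilization rate" mentioned above could be as simple as the
following: after doing some amount of work, pause long enough that the
worker only consumes a target fraction of wall-clock time. This is a
sketch of the idea, not what the patch does.

```java
// Hypothetical utilization throttle for the stats worker thread.
public class Throttle {
    /**
     * Nanoseconds to pause after workNanos of work so that
     * work / (work + pause) is roughly the target utilization (0..1].
     */
    public static long pauseNanos(long workNanos, double utilization) {
        if (utilization >= 1.0) {
            return 0;                       // unthrottled
        }
        return (long) (workNanos * (1.0 - utilization) / utilization);
    }
}
```

For example, between batches of an index scan the worker would sleep for
pauseNanos(elapsed, 0.25) to stay at roughly 25% utilization.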
I'll be away for some weeks, but plan to return to this issue when I'm back.
My next steps depend on the feedback I get, but one way forward may be
to try to do the data dictionary update from the daemon itself. Once we get
the core framework in place, we can start working on all the various issues
that have to be addressed.