Wait on the lock or bump on the retry are the only "easy" bug fixes I can
think of. There are downsides to both.
wait on lock:
Currently we only have a single background thread to do these kinds of tasks,
it is single threaded through each task. So if you wait on lock, then every
background task waits until that can get finished. Stuff like checkpoints,
and other space reclamation tasks will wait. The queue may grow and become
Could lead to just cpu spinning and no actual guarantee of getting the resource
as it could keep getting unlucky and lose a cycle schedule when the resource
is actually available.
Bigger feature projects could do resolve these issues.
1) Enhance the background daemon schedule utility to support more than one
o I think it is important that we don't just randomly grow the number
of background threads, as we already have complaints about the one
background thread per db. But it would be fine to short term allocate
a thread and then destroy it.
o not sure exactly what is best, but in this case it would be nice to
be able to REQUE the post commit task to the daemon saying add it to the
WAIT allowed queue. At that point maybe the daemon starts a new thread
for each one or maybe for each N or something else. Maybe it is
o There are already other problems for which this may be the solution:
1) As number of cores/cpu's grow it becomes obvious that one background
thread to N possible concurrent user thread which could each generate
work is not correct.
2) There are some tasks that are even more critical than others that could
benefit from better priority. Things like checkpoint may want their
own thread rather than share with others.
2) The problem with space being lost "forever" (which really means until an
offline compress, in part stems from the row format of the overflow pieces.
o we could change the overflow pieces to have back pointers to the main
head page which would make it much easier to figure out a stranded piece
during online compress.
o We could write some sort of brute force search which could come up with
the stranded pieces and call reclaim on them. It is actually not that
hard of code to write if one doesn't worry about memory. It would go
something like - probably many ways to optimize it. :
for (every page at raw store level)
if (main page)
for every row on page
if (has an overflow piece)
add main row handle to main hash table, and overflow handle
if (overflow page)
for every row on page
add main row handle, and next overflow handle
for (every row in main hash table)
delete complete overflow chain from overflow hash table
I believe this leaves the disconnected chains in the overflow hash table,
with some graph algorithm to determine the head of the chains.
The above works for X lock on the whole table, I am not sure if row locking
works. If you have to get X lock then an offline compress is not that much
worse which is why it never got implemented.
3) This post commit work is all still really at the first implmentation level,
when most choices were what was simple and mostly work. Not much has
happened since then. Also in first implementation big CLOB's and BLOB's were
not even implemented so the downside of losing some stuff was not much.
As Kathey pointed out I think there are holes in a lot of the post commit
stuff where a crash can lose them. It seems the only way to make sure we
don't lose them is to somehow store them transactionally in the db, but at
what cost? We could implement some sort of internal table, and then store
rows with the post commit work.
Personally I think 1 and 2 are better paths than 3. It would be interesting
to know what other db's do. I am pretter sure postgress for instance rely's
on scheduling a "scrubber" to do a lot of what we do in post commit. I would
rather see us make changes that would allow online compress table to handle
these overflow losses and then enhance the post commit deamon to better handle
them asap without ever needing the compress table.