accesses and perhaps to make it go faster.
I will have a look at it. I also see, both in this test and in the global
profiling, that a lot of time is spent on it.
There are two iterators in the class (kvsetIt and snapshotIt), and getLowest
compares the two to return the lowest. However, in this test, one of the lists
is empty, so the value is null, and hence the real comparison on byte is
On this subject, there is a possible optimisation on the function "peek",
which repeats the comparison on each call: if peek is called multiple times,
or if we often have peek() then next(), we can save the redundant comparisons. To me,
it makes sense to precalculate the value returned by "peek", and reuse it in
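
To illustrate the precalculation idea, here is a minimal sketch. It is not
the actual HBase scanner: the class CachingMergeScanner, its fields and the
comparisons counter are hypothetical names made up for the example, and it
assumes that null is never a valid element. The lowest head is computed once
per advance and stored in theNext, so peek() never compares anything.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;

/** Hypothetical two-way merging scanner; an illustration, not HBase code. */
final class CachingMergeScanner<T> {
    private final Iterator<T> first;
    private final Iterator<T> second;
    private final Comparator<? super T> comparator;

    private T firstHead;   // next unread element of the first source, or null
    private T secondHead;  // next unread element of the second source, or null
    private T theNext;     // cached result for peek() and next()
    int comparisons = 0;   // counts real comparisons, for demonstration only

    CachingMergeScanner(Iterator<T> first, Iterator<T> second,
                        Comparator<? super T> comparator) {
        this.first = first;
        this.second = second;
        this.comparator = comparator;
        this.firstHead = advance(first);
        this.secondHead = advance(second);
        this.theNext = lowest();
    }

    private static <E> E advance(Iterator<E> it) {
        return it.hasNext() ? it.next() : null;
    }

    /** Compares the two heads only when both sources still have data. */
    private T lowest() {
        if (firstHead == null) return secondHead;
        if (secondHead == null) return firstHead;
        comparisons++;
        return comparator.compare(firstHead, secondHead) <= 0 ? firstHead : secondHead;
    }

    /** Returns the cached value; no comparison is done here. */
    T peek() {
        return theNext;
    }

    /** Returns the cached value, advances its source, and refreshes the cache. */
    T next() {
        T result = theNext;
        if (result == null) return null;
        if (result == firstHead) {
            firstHead = advance(first);
        } else {
            secondHead = advance(second);
        }
        theNext = lowest();
        return result;
    }

    public static void main(String[] args) {
        CachingMergeScanner<Integer> scanner = new CachingMergeScanner<>(
            Arrays.asList(1, 5).iterator(),
            Arrays.asList(2, 3).iterator(),
            Comparator.<Integer>naturalOrder());
        scanner.peek();
        scanner.peek(); // repeated peeks reuse the cached value
        Integer value;
        while ((value = scanner.next()) != null) {
            System.out.println(value);
        }
        System.out.println("comparisons: " + scanner.comparisons);
    }
}
```

With this shape, a peek() followed by a next() costs one comparison at most,
and none at all once one of the two sources is exhausted.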
The profiling (method: sampling, Java inlining disabled) says something
Name; total time spent
So we're spending 26% of the time on this:
And in this getThreadReadPoint(), the actual time is spent in:
It's a TLS, so we can expect a system call to get the thread id. It would be
great to save this system call in a next().
There is at least one improvement for the case when one of the lists is
exhausted: skip the call to getThreadReadPoint(). That would not change the
behaviour at all, but would already be interesting (maybe 10% in this test).
Another option is to share the getThreadReadPoint() value for the two iterators,
i.e. read the value in the next() function, and give it as a parameter to
getNext(). In fact, as this value seems to be a TLS, I don't see how it
could change during the execution of next(). What do you think?
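
To make the second option concrete, here is a minimal sketch. It is not the
real scanner: ReadPointScanner, Entry, READ_POINT and readPointLookups are
hypothetical names, and the visibility rule (an entry is visible when its
write number is not greater than the read point) is an assumption made for
the example. The thread-local value is read once in next() and passed to
getNext(), instead of being looked up for every candidate entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical scanner reading the thread-local read point once per next(). */
final class ReadPointScanner {
    /** Hypothetical entry: a key plus the write number that created it. */
    static final class Entry {
        final String key;
        final long writeNumber;
        Entry(String key, long writeNumber) {
            this.key = key;
            this.writeNumber = writeNumber;
        }
    }

    // Stand-in for the per-thread read point (assumed to be a ThreadLocal).
    static final ThreadLocal<Long> READ_POINT = ThreadLocal.withInitial(() -> Long.MAX_VALUE);
    static int readPointLookups = 0; // counts lookups, for demonstration only

    static long getThreadReadPoint() {
        readPointLookups++;
        return READ_POINT.get();
    }

    private final Iterator<Entry> kvsetIt;
    private final Iterator<Entry> snapshotIt;
    private Entry kvsetNext;
    private Entry snapshotNext;

    ReadPointScanner(List<Entry> kvset, List<Entry> snapshot) {
        this.kvsetIt = kvset.iterator();
        this.snapshotIt = snapshot.iterator();
        long readPoint = getThreadReadPoint(); // one lookup for both iterators
        this.kvsetNext = getNext(kvsetIt, readPoint);
        this.snapshotNext = getNext(snapshotIt, readPoint);
    }

    /** Skips entries not visible at readPoint; does no thread-local lookup. */
    private static Entry getNext(Iterator<Entry> it, long readPoint) {
        while (it.hasNext()) {
            Entry candidate = it.next();
            if (candidate.writeNumber <= readPoint) return candidate;
        }
        return null;
    }

    private Entry getLowest() {
        if (kvsetNext == null) return snapshotNext;
        if (snapshotNext == null) return kvsetNext;
        return kvsetNext.key.compareTo(snapshotNext.key) <= 0 ? kvsetNext : snapshotNext;
    }

    Entry next() {
        Entry lowest = getLowest();
        if (lowest == null) return null; // both lists exhausted: no lookup at all
        long readPoint = getThreadReadPoint(); // read once, passed as a parameter
        if (lowest == kvsetNext) {
            kvsetNext = getNext(kvsetIt, readPoint);
        } else {
            snapshotNext = getNext(snapshotIt, readPoint);
        }
        return lowest;
    }

    public static void main(String[] args) {
        READ_POINT.set(5L);
        ReadPointScanner scanner = new ReadPointScanner(
            Arrays.asList(new Entry("a", 1), new Entry("b", 9), new Entry("c", 2)),
            Collections.<Entry>emptyList());
        Entry entry;
        while ((entry = scanner.next()) != null) {
            System.out.println(entry.key);
        }
        System.out.println("lookups: " + readPointLookups);
    }
}
```

This is only valid if the read point cannot change while next() is running,
which is exactly the question above.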
Last question on this: what is the use case in which the value returned by
getThreadReadPoint() changes during a scan (i.e. between calls to next())?
Most of the public methods (except reseek) are "synchronized". Does that imply
that the scanner can be shared between threads?
In the end, it seems that there are 3 possible things to do:
1) Replacement of KeyValue lowest = getLowest();
2) theNext precalculation for peek() and next()
3) Depending on your feedback, one of the options above on
This should give a 5 to 15% increase in performance: not a "problem solved"
kind of change, but it could justify a first patch. I can do it (with the hbase
On Sun, Jul 24, 2011 at 12:23 AM, stack (JIRA) <email@example.com> wrote: