SA Bugzilla – Bug 5661
[review] speed up SQL queries by utilizing indexes
Last modified: 2007-12-16 13:30:16 UTC
The SQL query in /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 243 calculates the expire delta, but in a way that can't use an index: my $sql = "SELECT count(*) FROM bayes_token WHERE id = ? AND (? - atime) > ?"; Changing this as follows would make it so it can utilize the index, thus examining less rows (which could be locked). Mysql can't look up $newest_atime - atime from the index because that value is made up, so it uses the "id" portion of the index on (id, atime)... if it doesn't have to calculate the value for $newest_atime + $something for every row and only needs to check the index then it would be much faster. Index: lib/Mail/SpamAssassin/BayesStore/SQL.pm =================================================================== --- lib/Mail/SpamAssassin/BayesStore/SQL.pm (revision 579950) +++ lib/Mail/SpamAssassin/BayesStore/SQL.pm (working copy) @@ -241,7 +241,7 @@ my $sql = "SELECT count(*) FROM bayes_token WHERE id = ? - AND (? - atime) > ?"; + AND atime < ?"; my $sth = $self->{_dbh}->prepare_cached($sql); @@ -251,7 +251,7 @@ } for (my $i = 1; $i <= $max_expire_mult; $i<<=1) { - my $rc = $sth->execute($self->{_userid}, $newest_atime, $start * $i); + my $rc = $sth->execute($self->{_userid}, $newest_atime - $start * $i); unless ($rc) { dbg("bayes: calculate_expire_delta: SQL error: ".$self->{_dbh}->errstr()); Thanks to Mark Martinec and Nils for the patch and analysis.
Committed to trunk: $ svn -m 'simplified SELECT FROM bayes_token to be able to use index, see bug 5661' ci Sending lib/Mail/SpamAssassin/BayesStore/SQL.pm Committed revision 580455. This seems simple enough, tentatively setting target to 3.2.4. Time for a review?
Mark, could you create a patch against 3.2.4, and attach it? I'm guessing it's the one that's inline in the first comment, but it's better to be explicit.
Created attachment 4141 [details] proposed patch to 3.2.3
+1
committed to 3.2.x: r604713