SA Bugzilla – Full Text Bug Listing
| Summary: | RFE: SQL efficiency: keep persistent DB connections (in code with a good license) | | |
|---|---|---|---|
| Product: | Spamassassin | Reporter: | Preston A. Elder <prez> |
| Component: | spamc/spamd | Assignee: | SpamAssassin Developer Mailing List <dev> |
| Status: | RESOLVED WONTFIX | | |
| Severity: | enhancement | CC: | apache, dev, parkerm |
| Priority: | P5 | | |
| Version: | unspecified | | |
| Target Milestone: | Future | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| Bug Depends on: | 3097 | | |
| Bug Blocks: | 4560 | | |
| Attachments: | My Persistent Database connection plugin; the plugin code has been updated | | |
Description
Preston A. Elder
2003-06-09 00:32:40 UTC
Just adding a comment before I forget about it. In the absence of a pre-fork server that can have persistent database connections, DBD::Proxy might be an option. I did a very small amount of playing with it today and it at least worked, so it might be fairly easy to get it to cache some DB connections for us. It has a pre-fork/threaded server mode already built in and knows how to handle connect_cached calls.

spamd now uses a prefork model (as of r10000). :)

I noticed that in the new 3.0.0 version there are two 'SQL conf' files, namely:
http://spamassassin.rediris.es/full/3.0.x/dist/lib/Mail/SpamAssassin/ConfSourceSQL.pm
and
http://spamassassin.rediris.es/full/3.0.x/dist/lib/Mail/SpamAssassin/Conf/SQL.pm

What I find interesting is that neither stores the DB handle in something like $self->{dbh} and then later checks to see if it's 'active', whereas the SQL support for the Bayes stuff here:
http://spamassassin.rediris.es/full/3.0.x/dist/lib/Mail/SpamAssassin/BayesStore/SQL.pm
seems to do precisely that.

All it would conceivably take would be to replace:

    my $dbh = DBI->connect($dsn, $dbuser, $dbpass, {'PrintError' => 0});

with:

    if (!defined($self->{dbh})) {
        $self->{dbh} = DBI->connect($dsn, $dbuser, $dbpass, {'PrintError' => 0});
    }

and then replace any reference to $dbh with $self->{dbh}, and of course remove the disconnect.

Even better would be to decouple the DB connection and disconnection routines altogether (as is done in the Bayes stuff).

Even better STILL would be to have all DB functions in the same 'pre-forked process' use the same DB handle (regardless of whether it's for Bayes, configuration, or other purposes), i.e. put the dbh in the 'main' section (which is where the SQL config stuff is read from anyway).

But the bare minimum of stopping the continual re-connects would be a huge benefit anyway.
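The change proposed above (store the handle on the object, connect only when it is missing, and check that it is still 'active') can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Mail::SpamAssassin::Conf::SQL code: the ConfSQLSketch and FakeHandle packages, the injectable `connect` argument, and the DSN are all hypothetical, and a fake handle is used so the sketch runs without a database. The only DBI calls assumed are `DBI->connect` and the handle's `ping` method.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a config-loading class; NOT the real
# Mail::SpamAssassin::Conf::SQL implementation.
package ConfSQLSketch;

sub new {
    my ($class, %args) = @_;
    my $self = {
        dsn    => $args{dsn},
        dbuser => $args{dbuser},
        dbpass => $args{dbpass},
        # Assumption: the connect step is injectable so the sketch can run
        # without a database. The default mirrors the call quoted above.
        connect => $args{connect} || sub {
            require DBI;
            return DBI->connect(@_, { PrintError => 0 });
        },
        dbh => undef,
    };
    return bless $self, $class;
}

# Return the cached handle; connect only when there is no handle yet or
# the existing one no longer answers ping().
sub dbh {
    my ($self) = @_;
    if (!defined $self->{dbh} || !$self->{dbh}->ping) {
        $self->{dbh} = $self->{connect}->(
            $self->{dsn}, $self->{dbuser}, $self->{dbpass});
    }
    return $self->{dbh};
}

# Hypothetical fake handle that only implements ping().
package FakeHandle;

sub new {
    my ($class) = @_;
    return bless { alive => 1 }, $class;
}

sub ping { return $_[0]{alive} }

package main;

my $connects = 0;
my $conf = ConfSQLSketch->new(
    dsn     => 'DBI:mysql:spamassassin:localhost',    # placeholder DSN
    dbuser  => 'user',
    dbpass  => 'pass',
    connect => sub { $connects++; return FakeHandle->new },
);

$conf->dbh for 1 .. 3;    # three lookups share one connection
print "connects after 3 lookups: $connects\n";

$conf->dbh->{alive} = 0;  # simulate a dropped connection
$conf->dbh;               # ping fails, so this call reconnects
print "connects after drop: $connects\n";
```

In a pre-forking server the handle would need to be created in each child after the fork rather than inherited from the parent, since DBI handles are not safe to share across processes.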
moving to 3.1

sql -> assigning to michael

move bug to Future milestone (previously set to Future -- I hope)

I've written a plugin that handles persistent database connections, and it can be found here: http://wiki.apache.org/spamassassin/DBIPlugin
Please note that it requires SpamAssassin 3.1+ to work correctly.

Closing; this will hopefully eventually make it into the main SpamAssassin codebase.

reopening -- we could still do with a plugin that doesn't use DBI due to its licensing problems.

This was a suggested idea for the Google Summer of Code 2006; I'm adding it to the bugzilla for future use, and in case anyone feels like implementing it.
Subject ID: spamassassin-persistent-db-conns
Keywords: perl, databases, sql
Description: http://issues.apache.org/SpamAssassin/show_bug.cgi?id=2037 : persistent database connections for SpamAssassin's Bayes subsystem. Michael: 'This exists, but is not an ASL friendly license. So a "clean room" implementation might be cool.'
Possible Mentors: Michael Parker (parkerm -at- pobox.com)

Back to dev

Created attachment 4105
My Persistent Database connection plugin

This is a plugin written during Google Summer of Code 2007. I have benchmarked it with MySQL and PostgreSQL.
The only real thing this needs is to turn the $CONNECT variable into a hash so that it can handle multiple connection strings, but that is a very minor change.

Created attachment 4111
the plugin code has been updated
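The suggestion to turn $CONNECT into a hash so that multiple connection strings are handled might look something like the sketch below. This is not the attached plugin's code: `%CONNECT` is reused here only as an illustrative name, and `get_handle`, `FakeHandle`, the cache key format, and the DSNs are assumptions. A fake connector is passed in so the example runs without a database; in real use the coderef would wrap `DBI->connect`.

```perl
use strict;
use warnings;

# Hypothetical sketch: one cached handle per (DSN, user) pair.
my %CONNECT;          # cache key "dsn\0user" => handle
my $connects = 0;     # counts how often the connector is really called

# Assumption: $connector is any coderef that returns an object with a
# ping() method, e.g. sub { DBI->connect(@_) } in real use.
sub get_handle {
    my ($connector, $dsn, $user, $pass) = @_;
    my $key = join "\0", $dsn, (defined $user ? $user : '');
    my $dbh = $CONNECT{$key};
    unless ($dbh && $dbh->ping) {
        # No usable handle for this connection string yet: open one.
        $dbh = $CONNECT{$key} = $connector->($dsn, $user, $pass);
    }
    return $dbh;
}

# Hypothetical fake handle that only implements ping().
package FakeHandle;

sub new {
    my ($class) = @_;
    return bless { alive => 1 }, $class;
}

sub ping { return $_[0]{alive} }

package main;

my $fake = sub { $connects++; return FakeHandle->new };

my $h1 = get_handle($fake, 'DBI:mysql:sa_bayes',     'sa', 'secret');
my $h2 = get_handle($fake, 'DBI:mysql:sa_bayes',     'sa', 'secret');
my $h3 = get_handle($fake, 'DBI:Pg:dbname=sa_prefs', 'sa', 'secret');

print "connections opened: $connects\n";
print "same handle reused: ", ($h1 == $h2 ? "yes" : "no"), "\n";
```

DBI's own `connect_cached` method provides similar per-parameter caching, so a real implementation might simply delegate to it instead of keeping its own hash.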
Closing old stale bug. Not sure if SQL is already persistent to some degree, but Redis should be used for anything high volume anyway.