Solr
  1. Solr
  2. SOLR-2233

DataImportHandler - JdbcDataSource is not thread safe

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.4, 1.4.1, 1.5, 3.1, 3.2
    • Fix Version/s: None
    • Labels:
      None

      Description

      Whenever Thread A spends more than 10 seconds on a Connection (by retrieving records in a batch), Thread B will close connection.
      Related exceptions happen when we use "threads=" attribute for entity; usually exception stack contains message "connection already closed"
      It shouldn't happen with some JNDI data source, where Connection.close() simply returns Connection to a pool of available connections, but we might get different errors.

      1. SOLR-2233.patch
        4 kB
        Fuad Efendi
      2. SOLR-2233.patch
        1 kB
        Frank Wesemann
      3. SOLR-2233-001.patch
        4 kB
        Fuad Efendi
      4. SOLR-2233-JdbcDataSource.patch
        13 kB
        Fuad Efendi

        Activity

        Hide
        Fuad Efendi added a comment -

        I need to test it; but changes are obvious.
        JDBC API says

        • <strong>Note:</strong> Support for the <code>isLast</code> method
        • is optional for <code>ResultSet</code>s with a result
        • set type of <code>TYPE_FORWARD_ONLY</code>
        • but I am sure everyone supports this.
        Show
        Fuad Efendi added a comment - I need to test it; but changes are obvious. JDBC API says <strong>Note:</strong> Support for the <code>isLast</code> method is optional for <code>ResultSet</code>s with a result set type of <code>TYPE_FORWARD_ONLY</code> but I am sure everyone supports this.
        Hide
        Fuad Efendi added a comment -

        Performance Tuning

        I have extremely sophisticated SQL; root entity runs 10-15 subqueries, and I am unable to use CachedSqlEntityProcessor. That's why I am looking into multithreading.

        Unfortunately, with existing approach connection will be closed after each use. And for most databases creating a connection (authentication, resource allocation) is extremely expensive.

        The best approach is to use container resource (JNDI, connection pooling), but I'll try to find what else can be improved.

        Show
        Fuad Efendi added a comment - Performance Tuning I have extremely sophisticated SQL; root entity runs 10-15 subqueries, and I am unable to use CachedSqlEntityProcessor . That's why I am looking into multithreading. Unfortunately, with existing approach connection will be closed after each use. And for most databases creating a connection (authentication, resource allocation) is extremely expensive . The best approach is to use container resource (JNDI, connection pooling), but I'll try to find what else can be improved.
        Hide
        Fuad Efendi added a comment -

        Connection moved to top-level class
        DataSource should be used in a thread-safe manner; multiple threads can use multiple DataSource (per Item)
        Connection should be closed at the end of import in any case...

        Show
        Fuad Efendi added a comment - Connection moved to top-level class DataSource should be used in a thread-safe manner; multiple threads can use multiple DataSource (per Item) Connection should be closed at the end of import in any case...
        Hide
        Fuad Efendi added a comment -

        This is exception I was talking about, threads="16", 12 sub-entities, with existing trunk version, note The connection is closed

        org.apache.solr.handler.dataimport.DataImportHandlerException: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
                at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
                at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:337)
                at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$600(JdbcDataSource.java:226)
                at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.hasNext(JdbcDataSource.java:260)
                at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:75)
                at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
                at org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper.nextRow(ThreadedEntityProcessorWrapper.java:84)
                at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:433)
                at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:386)
                at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:453)
                at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.access$000(DocBuilder.java:340)
                at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner$1.run(DocBuilder.java:393)
                at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                at java.lang.Thread.run(Thread.java:619)
        Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
                at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:171)
                at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:319)
                at com.microsoft.sqlserver.jdbc.SQLServerStatement.checkClosed(SQLServerStatement.java:956)
                at com.microsoft.sqlserver.jdbc.SQLServerResultSet.checkClosed(SQLServerResultSet.java:348)
                at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:915)
                at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:329)
                ... 13 more
        
        Show
        Fuad Efendi added a comment - This is exception I was talking about, threads="16", 12 sub-entities, with existing trunk version, note The connection is closed org.apache.solr.handler.dataimport.DataImportHandlerException: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:337) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$600(JdbcDataSource.java:226) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.hasNext(JdbcDataSource.java:260) at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:75) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73) at org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper.nextRow(ThreadedEntityProcessorWrapper.java:84) at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:433) at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:386) at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:453) at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.access$000(DocBuilder.java:340) at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner$1.run(DocBuilder.java:393) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang. Thread .run( Thread .java:619) Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:171) at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:319) at com.microsoft.sqlserver.jdbc.SQLServerStatement.checkClosed(SQLServerStatement.java:956) at com.microsoft.sqlserver.jdbc.SQLServerResultSet.checkClosed(SQLServerResultSet.java:348) at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:915) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:329) ... 13 more
        Hide
        Fuad Efendi added a comment -

        resultSet.next() - Microsoft JDBC doesn't support isLast() for FORWARD_ONLY

        Show
        Fuad Efendi added a comment - resultSet.next() - Microsoft JDBC doesn't support isLast() for FORWARD_ONLY
        Hide
        Fuad Efendi added a comment -

        And some real-life test, root entity contains 10 subentities, 16 threads allocated,

        befor

          <str name="Time Elapsed">0:1:0.322</str> 
          <str name="Total Requests made to DataSource">7296</str> 
          <str name="Total Rows Fetched">8061</str> 
          <str name="Total Documents Processed">729</str> 
        

        after

          <str name="Time Elapsed">0:1:1.184</str> 
          <str name="Total Requests made to DataSource">0</str> 
          <str name="Total Rows Fetched">29247</str> 
          <str name="Total Documents Processed">2639</str> 
        

        Look at it, it seems we don't unnecessarily close connection!
        Total Requests made to DataSource: 0

        Show
        Fuad Efendi added a comment - And some real-life test, root entity contains 10 subentities, 16 threads allocated, befor <str name= "Time Elapsed" >0:1:0.322</str> <str name= "Total Requests made to DataSource" >7296</str> <str name= "Total Rows Fetched" >8061</str> <str name= "Total Documents Processed" >729</str> after <str name= "Time Elapsed" >0:1:1.184</str> <str name= "Total Requests made to DataSource" >0</str> <str name= "Total Rows Fetched" >29247</str> <str name= "Total Documents Processed" >2639</str> Look at it, it seems we don't unnecessarily close connection! Total Requests made to DataSource: 0
        Hide
        Fuad Efendi added a comment -

        The only remaining problem is what to do if Database Server closed/dropped connection or something like that (for instance, due to timeout settings on a database, or due to heavy load, or network problem). The more time required to index data, the more frequent problems.

        Even connection pool (accessed via JNDI) won't help because existing (and new) code tries to keep the same connection for a long time, without any logic to check that connection is still alive. What to do if we are in the middle of RecordSet and database dropped connection?

        Show
        Fuad Efendi added a comment - The only remaining problem is what to do if Database Server closed/dropped connection or something like that (for instance, due to timeout settings on a database, or due to heavy load, or network problem). The more time required to index data, the more frequent problems. Even connection pool (accessed via JNDI) won't help because existing (and new) code tries to keep the same connection for a long time, without any logic to check that connection is still alive. What to do if we are in the middle of RecordSet and database dropped connection?
        Hide
        cmd added a comment -

        It seems more slowly than before in the test .why ?

        Show
        cmd added a comment - It seems more slowly than before in the test .why ?
        Hide
        Fuad Efendi added a comment -

        It is 3 times faster "after" I applied changes:
        Before: 729 documents/minute
        After: 2639 documents/minute

        In my test, with 10 sub-entities some of them are multi-valued (and hard to use CachedJdbcDataSource for composite PKs). I can't explain it by only threads="16" option (which this patch makes possible). It is probably "Connection Close / Connection Open" issue which is very expensive for SQL-Server (except MySQL JDBC driver which internally uses connection pooling)

        Show
        Fuad Efendi added a comment - It is 3 times faster "after" I applied changes: Before: 729 documents/minute After: 2639 documents/minute In my test, with 10 sub-entities some of them are multi-valued (and hard to use CachedJdbcDataSource for composite PKs). I can't explain it by only threads="16" option (which this patch makes possible). It is probably "Connection Close / Connection Open" issue which is very expensive for SQL-Server (except MySQL JDBC driver which internally uses connection pooling)
        Hide
        Frank Wesemann added a comment -

        Fuad,
        as the patch is not applyable for me ( too much noise about reformated lines etc.)
        Do I see it right that you essentially replaced getConnection() with factory.call() ?

        Show
        Frank Wesemann added a comment - Fuad, as the patch is not applyable for me ( too much noise about reformated lines etc.) Do I see it right that you essentially replaced getConnection() with factory.call() ?
        Hide
        Fuad Efendi added a comment - - edited

        Hi Frank, yes, correct; although it's hard to recall what I did... unfortunately reformatted... I can resubmit (apply patch & format with Lucene style & generate patch); but better to redo it from scratch again. Existing code doesn't run multithreaded; and it is slow even for single-thread (inappropriate JDBC usage)

        I completely removed this code:

          private Connection getConnection() throws Exception {
            long currTime = System.currentTimeMillis();
            if (currTime - connLastUsed > CONN_TIME_OUT) {
              synchronized (this) {
                Connection tmpConn = factory.call();
                closeConnection();
                connLastUsed = System.currentTimeMillis();
                return conn = tmpConn;
            } else {
              connLastUsed = currTime;
              return conn;
            }
          }
        
        Show
        Fuad Efendi added a comment - - edited Hi Frank, yes, correct; although it's hard to recall what I did... unfortunately reformatted... I can resubmit (apply patch & format with Lucene style & generate patch); but better to redo it from scratch again. Existing code doesn't run multithreaded; and it is slow even for single-thread (inappropriate JDBC usage) I completely removed this code: private Connection getConnection() throws Exception { long currTime = System .currentTimeMillis(); if (currTime - connLastUsed > CONN_TIME_OUT) { synchronized ( this ) { Connection tmpConn = factory.call(); closeConnection(); connLastUsed = System .currentTimeMillis(); return conn = tmpConn; } else { connLastUsed = currTime; return conn; } }
        Hide
        Fuad Efendi added a comment -

        Existing implementation uses single Connection during 10 seconds time interval, and even shares this object with other threads (if you try multithreaded)

        So that problem becomes environment & vendor specific: to open new connection to Oracle 10g, for instance, we need to authenticate, and in "dedicated server" it might take a long, plus "dedicated" resources for each connection, - server can get overloaded. MySQL, fro another side, does not closes connection internally (even if you call conn.close() in your code); connection will be simply returned to a pool of connection objects. And what if something goes wrong... (what if MySQL or Oracle internals need additional time for "closing", "opening", ...) - we might even get problems like "too many connections". Modern apps don't see that because they use manageable connection pooling instead of close-open...

        I need to verify this patch; it was quick solution to make "threads=..." attribute working, and it currently works in production system (MS-SQL).

        Show
        Fuad Efendi added a comment - Existing implementation uses single Connection during 10 seconds time interval, and even shares this object with other threads (if you try multithreaded) So that problem becomes environment & vendor specific: to open new connection to Oracle 10g, for instance, we need to authenticate, and in "dedicated server" it might take a long, plus "dedicated" resources for each connection, - server can get overloaded. MySQL, fro another side, does not closes connection internally (even if you call conn.close() in your code); connection will be simply returned to a pool of connection objects. And what if something goes wrong... (what if MySQL or Oracle internals need additional time for "closing", "opening", ...) - we might even get problems like "too many connections". Modern apps don't see that because they use manageable connection pooling instead of close-open... I need to verify this patch; it was quick solution to make "threads=..." attribute working, and it currently works in production system (MS-SQL).
        Hide
        Frank Wesemann added a comment -

        For a less obtrusive solution I came up with the attached patch.

        People with more knowledge may do something useful in the isConnectionValid() method.

        As I had to learn postgresql Driver for jdbc4 does not implement isValid(timeout).

        May be one can trick the connectionpool to validate the connection after a certain amount of time.

        Show
        Frank Wesemann added a comment - For a less obtrusive solution I came up with the attached patch. People with more knowledge may do something useful in the isConnectionValid() method. As I had to learn postgresql Driver for jdbc4 does not implement isValid(timeout) . May be one can trick the connectionpool to validate the connection after a certain amount of time.
        Hide
        Fuad Efendi added a comment -

        Hi Frank, thanks for the patch; unfortunately it is not thread safe... if you don't mind let me continue working on this, I want to use internal connection pool (if JNDI data source is not available)...

        My initial patch already contains too much; and new one will remove ResultSetIterator and make it much simlper to understand (and multithreaded); and code shoulnd't have any dependency on rare optionally supported patterns such as ResultSet.TYPE_FORWARD_ONLY; READ_ONLY should be managed differently (and it is hard to manage if data size is huge and data is concurrently updated while we are importing it)
        Possible solution could be connection.close() after reading each single record (and initial query should return PKs of records) - but it would be next step... I wrote initial patch for a production system where complex 10-query-based documents (about 500k docs) took many hours to import (and now it is about 40 minutes only) (and what happens if we have network problem and we are in the middre of Iterator?)

        Thanks

        Show
        Fuad Efendi added a comment - Hi Frank, thanks for the patch; unfortunately it is not thread safe... if you don't mind let me continue working on this, I want to use internal connection pool (if JNDI data source is not available)... My initial patch already contains too much ; and new one will remove ResultSetIterator and make it much simlper to understand (and multithreaded); and code shoulnd't have any dependency on rare optionally supported patterns such as ResultSet.TYPE_FORWARD_ONLY; READ_ONLY should be managed differently (and it is hard to manage if data size is huge and data is concurrently updated while we are importing it) Possible solution could be connection.close() after reading each single record (and initial query should return PKs of records) - but it would be next step... I wrote initial patch for a production system where complex 10-query-based documents (about 500k docs) took many hours to import (and now it is about 40 minutes only) (and what happens if we have network problem and we are in the middre of Iterator?) Thanks
        Hide
        Fuad Efendi added a comment -

        Revised version of old patch (11-Nov-2010); previous version of patch was hard to read

        Main changes:

        • connection won't close & reopen after timeout
        • connection can't be closed by second thread unexpectedly to first thread (initial bug fixed)

        Please note it works fine with MS-SQL server. However, concurrent statements (in concurrent threads) via the same connection object is tricky, JDBC may or may not implement it (JDBC-ODBC bridge for instance)

        Show
        Fuad Efendi added a comment - Revised version of old patch (11-Nov-2010); previous version of patch was hard to read Main changes: connection won't close & reopen after timeout connection can't be closed by second thread unexpectedly to first thread (initial bug fixed) Please note it works fine with MS-SQL server. However, concurrent statements (in concurrent threads) via the same connection object is tricky, JDBC may or may not implement it (JDBC-ODBC bridge for instance)
        Hide
        Fuad Efendi added a comment -

        Note that with this implementation "connection" is closed only when main instance of main class finalized => connection never closed; so that the code is still naive (server can close connection - how will we know that?) - fortunately it doesn't happen in my specific case already few months of night imports...

        We should use connection pooling - this would be next improvement; conn.close() in this case will return connection to pool (without closing it), and pool is responsible for testing connections for liveness.

        Show
        Fuad Efendi added a comment - Note that with this implementation "connection" is closed only when main instance of main class finalized => connection never closed; so that the code is still naive (server can close connection - how will we know that?) - fortunately it doesn't happen in my specific case already few months of night imports... We should use connection pooling - this would be next improvement; conn.close() in this case will return connection to pool (without closing it), and pool is responsible for testing connections for liveness.
        Hide
        Fuad Efendi added a comment -
        • small bug with closeResources()
        • each ResultSetIterator now has own (separate) instance of Connection - extremely good for performance (multithreading) but it is not transactional (different connections can return different results) - but we are "optimistic"
        Show
        Fuad Efendi added a comment - small bug with closeResources() each ResultSetIterator now has own (separate) instance of Connection - extremely good for performance (multithreading) but it is not transactional (different connections can return different results) - but we are "optimistic"
        Hide
        Fuad Efendi added a comment -

        to avoid mistakes I added version... SOLR-2233-001.patch
        (previous attachment was wrong)

        Show
        Fuad Efendi added a comment - to avoid mistakes I added version... SOLR-2233 -001.patch (previous attachment was wrong)
        Hide
        David Webb added a comment -

        I am getting truly poor performance with my multi-threaded DIH Using the JdbcDataSource. This issue sounds like my problem considering I can only see 1 new active session made on the Oracle Database, even though threads="12". I am running Solr 3.4, but there is no Fix Version associated with this Jira, so does that mean this patch never got commited?

        Any other suggestions for Solr 3.4 DIH Multithreaded?

        Thank you.

        Show
        David Webb added a comment - I am getting truly poor performance with my multi-threaded DIH Using the JdbcDataSource. This issue sounds like my problem considering I can only see 1 new active session made on the Oracle Database, even though threads="12". I am running Solr 3.4, but there is no Fix Version associated with this Jira, so does that mean this patch never got commited? Any other suggestions for Solr 3.4 DIH Multithreaded? Thank you.
        Hide
        Mikhail Khludnev added a comment -

        David,

        Please show your config excerpt, jstack and fullimport logs (DEBUG on o.a.solr.dataimport), your connection pool config is quite appreciate too. I can't say that I know which issue you are talking about.

        I have patches for DIH multi-threadng and CachedSqlDatasource at trunk, 4.0. SOLR-2933 SOLR-2947. Pls vote. Unfortunately it will be too much efforts to do the same at 3.x. So, we need it committed and after that they can be ported with SOLR-2382, or you will be able to try 4.0 DIH jars.


        Mikhail

        Show
        Mikhail Khludnev added a comment - David, Please show your config excerpt, jstack and fullimport logs (DEBUG on o.a.solr.dataimport), your connection pool config is quite appreciate too. I can't say that I know which issue you are talking about. I have patches for DIH multi-threadng and CachedSqlDatasource at trunk, 4.0. SOLR-2933 SOLR-2947 . Pls vote. Unfortunately it will be too much efforts to do the same at 3.x. So, we need it committed and after that they can be ported with SOLR-2382 , or you will be able to try 4.0 DIH jars. – Mikhail
        Hide
        James Dyer added a comment -

        The "threads" feature was removed from DIH in Trunk/4.x (see SOLR-3262). Some "threads" bugs were fixed in version 3.6, the last release in which "threads" is available. (see SOLR-3011).

        Show
        James Dyer added a comment - The "threads" feature was removed from DIH in Trunk/4.x (see SOLR-3262 ). Some "threads" bugs were fixed in version 3.6, the last release in which "threads" is available. (see SOLR-3011 ).

          People

          • Assignee:
            Unassigned
            Reporter:
            Fuad Efendi
          • Votes:
            3 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development