Commons Dbcp / DBCP-377

Dbcp Idle Check Mechanism Doesn't Work

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.4
    • Fix Version/s: None
    • Labels:
      None
    • Environment:

      Linux mysql 5.1.4

      Description

      Using commons-dbcp-1.4.jar.

      We use a distributed data access layer (middleware) for database sharding with MySQL.
      Our web servers use MySQL JDBC driver 1.5.4 to connect to this middleware layer, with an F5 between them for load balancing.
      The middleware layer discards DB connections that have been idle for longer than 1 hour, and our DBCP is configured to close connections that have been idle for
      longer than 30 minutes.

      We encountered the following exception:

          java.net.SocketException
      MESSAGE: Broken pipe
      
      STACKTRACE:
      
      java.net.SocketException: Broken pipe
              at java.net.SocketOutputStream.socketWrite0(Native Method)
              at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
              at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
              at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
              at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
              at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:2637)
              at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1554)
              at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1665)
              at com.mysql.jdbc.Connection.execSQL(Connection.java:3176)
              at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1153)
              at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:794)
              ....
              at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
              at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
      

      It seems that connections obtained from the pool are sometimes already broken. We know that they are forcibly closed by the middleware layer, but DBCP's idle timeout is shorter than the middleware layer's (30 minutes versus 60 minutes). How can this happen, given that the idle connections
      should have been closed by DBCP first? Is there a problem with the DBCP idle check mechanism?

      Our DBCP configuration is as follows:

          <property name="maxActive"><value>20</value></property>
          <property name="initialSize"><value>1</value></property>
          <property name="maxWait"><value>60000</value></property>
          <property name="maxIdle"><value>20</value></property>
          <property name="minIdle"><value>5</value></property>
          <property name="removeAbandoned"><value>true</value></property>
          <property name="removeAbandonedTimeout"><value>180</value></property>
          <property name="connectionProperties"><value>clientEncoding=GBK</value></property>
          <property name="timeBetweenEvictionRunsMillis"><value>60000</value></property>
          <property name="minEvictableIdleTimeMillis"><value>1800000</value></property>
          <property name="testWhileIdle"><value>true</value></property>
          <property name="testOnBorrow"><value>false</value></property>
          <property name="testOnReturn"><value>false</value></property>
          <property name="validationQuery"><value>SELECT @@SQL_MODE</value></property>
          <property name="numTestsPerEvictionRun"><value>32</value></property>
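
The timing implied by the configuration above can be checked with a little arithmetic. The following is a minimal sketch under a simplified model, not DBCP's actual evictor code: it assumes a connection becomes evictable after `minEvictableIdleTimeMillis` and is then removed the next time the evictor examines it. The class and method names (`EvictionTiming`, `worstCaseIdleMillis`, `runsToCoverPool`) are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.TimeUnit;

/** Back-of-the-envelope timing for an idle evictor (simplified model, not DBCP code). */
final class EvictionTiming {

    /** Evictor runs needed to examine every idle connection once. */
    static int runsToCoverPool(int idleConnections, int numTestsPerEvictionRun) {
        if (idleConnections <= 0) {
            return 0;
        }
        // Ceiling division: a partial batch still costs a full run.
        return (idleConnections + numTestsPerEvictionRun - 1) / numTestsPerEvictionRun;
    }

    /**
     * Rough upper bound on how long a connection can sit idle before eviction:
     * it becomes evictable after minEvictableIdleTimeMillis, then waits at most
     * one full sweep of the idle connections for the evictor to reach it.
     */
    static long worstCaseIdleMillis(long minEvictableIdleTimeMillis,
                                    long timeBetweenEvictionRunsMillis,
                                    int idleConnections,
                                    int numTestsPerEvictionRun) {
        int runs = Math.max(1, runsToCoverPool(idleConnections, numTestsPerEvictionRun));
        return minEvictableIdleTimeMillis + runs * timeBetweenEvictionRunsMillis;
    }

    public static void main(String[] args) {
        // Values taken from the reported configuration; 20 idle connections is maxIdle.
        long worst = worstCaseIdleMillis(1_800_000L, 60_000L, 20, 32);
        long middlewareCutoff = TimeUnit.HOURS.toMillis(1);
        System.out.println("worst-case idle before eviction (ms): " + worst);
        System.out.println("middleware cutoff (ms): " + middlewareCutoff);
        System.out.println("pool evicts first: " + (worst < middlewareCutoff));
    }
}
```

Under these assumptions the pool should remove a purely idle connection after roughly 31 minutes, well before the middleware's 1-hour cutoff, which is why the reported failures are surprising.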
      

        Activity

        Mark Thomas added a comment -

        No further information from the OP, so I am closing this: I can't reproduce it, nor can I see a code path that could trigger it. Feel free to re-open this issue if you experience it again and are able to provide the steps to reproduce it.

        yixin he added a comment -

        Hi Mark:

        This can occur one to several times per day under heavy load.
        We use commons-pool 1.5.4.
        OK, we will check whether the app really obtains connections that it never uses, and then adjust some configuration settings for more tests.

        Thanks for your help.

        Dave Oxley added a comment -

        This might be a red herring, but I recently found and fixed a thread-safety issue when validating connections in DBCP. I don't know whether it is relevant to the idle check mechanism, but it might be worth trying out the fix. The patch is attached to issue DBCP-376.

        Shuo QIU added a comment -

        (I am )The broken connection is closed by either the F5 or the middleware: when DBCP connects directly to the middleware rather than through the F5, some connections (fewer than 1 per hour) that stay idle for longer than 1 hour are closed by the middleware (as shown in the middleware's log).

        I suspect:
        1. As Mark said, the app could obtain a connection from the pool but never use it.
        2. Is it possible that DBCP makes GenericObjectPool's lock granularity finer, which causes a concurrency problem, so that the evictor misses some connections when checking?

        Mark Thomas added a comment -

        I'm not aware of any known issues with idle object eviction.

        It shouldn't be the app holding on to an idle connection too long before returning it since the removeAbandoned* settings will handle that after 5 minutes.

        I assume that you are 100% certain that the middleware layer is dropping the connections after 1 hour.

        Can you provide some idea of how frequently this occurs? E.g., x failures per hour and y connections borrowed per second. I'm trying to understand how rare the issue is.

        Some things to check:
        1. What version of commons-pool are you using and if it isn't the latest does upgrading fix the issue?
        2. numTestsPerEvictionRun=32 seems too high if maxIdle=20
        3. Is it possible that the app could obtain a connection from the pool but never use it? That would reset DBCP's idle timer but not the middleware layer's.
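
The scenario in item 3 can be illustrated with a toy model of one connection watched by two independent idle timers. This is only a sketch under stated assumptions, not DBCP or middleware code: the class `IdleClockModel` and its methods are hypothetical, time is counted in whole minutes, and the model deliberately ignores the validation queries that `testWhileIdle` sends, which would also count as traffic on the connection.

```java
/** Toy model of one pooled connection seen by two idle timers (hypothetical, not DBCP code). */
final class IdleClockModel {
    static final long POOL_IDLE_LIMIT_MIN = 30;       // minEvictableIdleTimeMillis = 1800000
    static final long MIDDLEWARE_IDLE_LIMIT_MIN = 60; // middleware cutoff from the report

    private long lastReturnedToPool; // what the pool's evictor looks at
    private long lastWireTraffic;    // what the middleware looks at

    IdleClockModel(long nowMin) {
        lastReturnedToPool = nowMin;
        lastWireTraffic = nowMin;
    }

    /** Borrow and return; the wire is only touched if a statement is actually executed. */
    void borrowAndReturn(long nowMin, boolean executesSql) {
        if (executesSql) {
            lastWireTraffic = nowMin;
        }
        lastReturnedToPool = nowMin;
    }

    boolean poolWouldEvict(long nowMin) {
        return nowMin - lastReturnedToPool >= POOL_IDLE_LIMIT_MIN;
    }

    boolean middlewareWouldDrop(long nowMin) {
        return nowMin - lastWireTraffic >= MIDDLEWARE_IDLE_LIMIT_MIN;
    }

    /**
     * Returns the first minute at which the middleware drops a connection that is
     * still in the pool, or -1 if the pool evicts it first or nothing happens
     * within the horizon.
     */
    static long firstMinuteDroppedWhilePooled(int borrowEveryMin, boolean executesSql, long horizonMin) {
        IdleClockModel c = new IdleClockModel(0);
        for (long t = 1; t <= horizonMin; t++) {
            if (c.poolWouldEvict(t)) {
                return -1; // the pool closed the connection before the middleware did
            }
            if (c.middlewareWouldDrop(t)) {
                return t;  // broken connection is still sitting in the pool
            }
            if (t % borrowEveryMin == 0) {
                c.borrowAndReturn(t, executesSql);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstMinuteDroppedWhilePooled(20, false, 240));
        System.out.println(firstMinuteDroppedWhilePooled(20, true, 240));
        System.out.println(firstMinuteDroppedWhilePooled(45, false, 240));
    }
}
```

In this model, a connection that is borrowed every 20 minutes without running any SQL never looks idle to the pool, yet the middleware drops it at minute 60. If each borrow executes a statement, or if borrows are rarer than the 30-minute pool limit, the middleware never gets to drop a pooled connection.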


          People

          • Assignee:
            Unassigned
            Reporter:
            yixin he
          • Votes:
            0
            Watchers:
            0
