Details
Bug
Status: Resolved
P2
Resolution: Duplicate
2.7.0
None
Description
We have a streaming pipeline running on Dataflow that writes data to a PostgreSQL instance hosted on Cloud SQL. The number of open connections to this database spikes repeatedly, at unpredictable times and with no apparent cause.
The latest example occurred on Friday, 18 January 2019 (see attached file).
(The spike in the middle is unrelated to this issue; it belongs to a periodic batch pipeline.)
The GCP logs show the following warning, logged at the same time as the connection increase:
2019-01-18 05:52:11.067 HNEC Can't verify serialized elements of type SessionData have well defined equals method. This may produce incorrect results on some PipelineRunner
This log line appears 13 times within a very short interval, between 05:52:11.067 and 05:52:11.126.
SessionData is a custom class that implements java.io.Serializable. Its instances are written to the PostgreSQL database using:
pipeline.apply(JdbcIO.<SessionData>write()
    .withDataSourceConfiguration(ExtractFunctions.getDataSourceConfiguration(
        options.instance, options.db_login, options.db_password))
    .withStatement("SQL_STATEMENT")
    .withPreparedStatementSetter(new InsertSessionPrepareStatementSetter()));
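For reference, the warning quoted above complains that SessionData has no well-defined equals method. A minimal sketch of a Serializable class that overrides equals and hashCode is shown below. The fields sessionId and startedAtMillis are hypothetical placeholders, not the real SessionData fields, and it is only an assumption that defining these methods would silence the warning; nothing here establishes a link between the warning and the connection increases.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical stand-in for the real SessionData class.
// The two fields below are assumptions made for illustration only.
final class SessionData implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String sessionId;      // assumed field
    private final long startedAtMillis;  // assumed field

    SessionData(String sessionId, long startedAtMillis) {
        this.sessionId = sessionId;
        this.startedAtMillis = startedAtMillis;
    }

    // Value-based equality: two instances are equal when all fields match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SessionData)) {
            return false;
        }
        SessionData that = (SessionData) other;
        return startedAtMillis == that.startedAtMillis
                && Objects.equals(sessionId, that.sessionId);
    }

    // hashCode must be consistent with equals.
    @Override
    public int hashCode() {
        return Objects.hash(sessionId, startedAtMillis);
    }
}
```

With this in place, two SessionData instances built from the same field values compare as equal and share a hash code.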
According to pg_stat_activity on the PostgreSQL instance, all available connections are in use. However, the query select * from pg_stat_activity where state = 'idle' and query = 'ROLLBACK'; returns no rows.