Beam / BEAM-12044

JdbcIO should explicitly setAutoCommit to false

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.28.0
    • Fix Version/s: 2.29.0
    • Component/s: sdk-java-core
    • Labels: None

    Description

      Hello,

      Per PostgreSQL JDBC documentation, autocommit must be explicitly disabled on the connection to allow cursor streaming.

      jkff mentioned it on the mailing list. However, even though there is:

      poolableConnectionFactory.setDefaultAutoCommit(false);
      

      in JdbcIO:1555, any read with JdbcIO currently loads the whole result set into memory (which leads to OOM), at least with PostgreSQL JDBC driver 42.2.16, since

      connection.getAutoCommit()
      

      returns true in JdbcIO#ReadFn#processElement.

      I can provide a PR; the patch is pretty simple (and solves the problem for us in 2.28.0):

      if (connection == null) {
        connection = dataSource.getConnection();
      }
      connection.setAutoCommit(false); // line added
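
      For reference, here is a minimal sketch of a read helper that meets the PostgreSQL driver's documented conditions for cursor-based fetching: autocommit off, a forward-only result set, and a non-zero fetch size. StreamingReads and prepareStreaming are hypothetical names used for illustration only; they are not part of Beam or the driver. The sketch uses only standard java.sql calls and checks the live connection instead of relying on the pool default.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper for illustration; not part of Beam or the PostgreSQL driver. */
final class StreamingReads {
  private StreamingReads() {}

  /**
   * Prepares a statement that satisfies the conditions for cursor-based
   * fetching: autocommit off, forward-only result set, positive fetch size.
   */
  static PreparedStatement prepareStreaming(Connection connection, String sql, int fetchSize)
      throws SQLException {
    if (fetchSize <= 0) {
      throw new IllegalArgumentException("fetchSize must be positive, got " + fetchSize);
    }
    // Do not trust the pool's default: inspect the connection actually handed out.
    if (connection.getAutoCommit()) {
      connection.setAutoCommit(false);
    }
    PreparedStatement statement =
        connection.prepareStatement(
            sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    try {
      // A non-zero fetch size asks the driver to fetch rows in batches.
      statement.setFetchSize(fetchSize);
    } catch (SQLException e) {
      statement.close(); // avoid leaking the statement if configuration fails
      throw e;
    }
    return statement;
  }
}
```

      A caller would then iterate over statement.executeQuery() as usual. Because autocommit is now off, the caller is also responsible for ending the transaction (commit or rollback) once the read is done.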
      

      Thanks!

          People

            Assignee: Unassigned
            Reporter: Sylvain Veyrié (sveyrie)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h