Uploaded image for project: 'Tephra'
  1. Tephra
  2. TEPHRA-241

Introduce a way to limit the size of a transaction

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.0-incubating
    • Fix Version/s: 0.13.0-incubating
    • Component/s: api, manager
    • Labels:
      None

      Description

      When clients perform a huge number of writes in a short transaction, that can result in huge change sets. For example, if a client performs 10M writes and sends that change set over, that can easily be 1GB large. The transaction manager will keep this in memory. It will also write this as an edit to the transaction log.

      Assume it runs out of memory because the change set is too large. It crashes and when it restarts, it will replay the log, load that huge change set again, and crash again.

      To prevent this kind of systemic failure, and to encourage developers to use long transactions when performing many writes, we can introduce two new properties in the configuration:

      • change set warn threshold: if a change set exceeds this size, a warning is logged.
      • change set reject threshold: if a change set exceeds this size, it is rejected (canCommit will throw an exception) and that will fail the transaction.

      Both thresholds should be Long.MAX_VALUE by default, to preserve existing behavior after upgrade.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/incubator-tephra/pull/48

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/incubator-tephra/pull/48
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user anew commented on the issue:

          https://github.com/apache/incubator-tephra/pull/48

          Same travis failure. This appears to happen a lot more frequently with Java 8. I will commit this now and try to fix the flaky test before 0.13 release.

          Show
          githubbot ASF GitHub Bot added a comment - Github user anew commented on the issue: https://github.com/apache/incubator-tephra/pull/48 Same travis failure. This appears to happen a lot more frequently with Java 8. I will commit this now and try to fix the flaky test before 0.13 release.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user anew commented on the issue:

          https://github.com/apache/incubator-tephra/pull/48

          After rebasing on latest master, another travis build is running; waiting for that to finish.

          Show
          githubbot ASF GitHub Bot added a comment - Github user anew commented on the issue: https://github.com/apache/incubator-tephra/pull/48 After rebasing on latest master, another travis build is running; waiting for that to finish.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user anew commented on the issue:

          https://github.com/apache/incubator-tephra/pull/48

          Travis build failed again in the known flaky test with Java 8. It does pass when I run it with Java 8 locally. Java 7 also passes in Travis. I am confident that this was not introduced by the changes in this PR, and will commit this now.

          Show
          githubbot ASF GitHub Bot added a comment - Github user anew commented on the issue: https://github.com/apache/incubator-tephra/pull/48 Travis build failed again in the known flaky test with Java 8. It does pass when I run it with Java 8 locally. Java 7 also passes in Travis. I am confident that this was not introduced by the changes in this PR, and will commit this now.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user anew commented on a diff in the pull request:

          https://github.com/apache/incubator-tephra/pull/48#discussion_r136458250

          — Diff: tephra-core/src/main/java/org/apache/tephra/distributed/thrift/TTransactionServer.java —
          @@ -68,6 +68,8 @@

          public TBoolean canCommitTx(TTransaction tx, Set<ByteBuffer> changes) throws TTransactionNotInProgressException, org.apache.thrift.TException;

          + public TBoolean canCommitOrThrow(TTransaction tx, Set<ByteBuffer> changes) throws TTransactionNotInProgressException, TGenericException, org.apache.thrift.TException;
          — End diff –

          please do not review this file, it is generated by Thrift. Only need to review transaction.thrift

          Show
          githubbot ASF GitHub Bot added a comment - Github user anew commented on a diff in the pull request: https://github.com/apache/incubator-tephra/pull/48#discussion_r136458250 — Diff: tephra-core/src/main/java/org/apache/tephra/distributed/thrift/TTransactionServer.java — @@ -68,6 +68,8 @@ public TBoolean canCommitTx(TTransaction tx, Set<ByteBuffer> changes) throws TTransactionNotInProgressException, org.apache.thrift.TException; + public TBoolean canCommitOrThrow(TTransaction tx, Set<ByteBuffer> changes) throws TTransactionNotInProgressException, TGenericException, org.apache.thrift.TException; — End diff – please do not review this file, it is generated by Thrift. Only need to review transaction.thrift
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user anew commented on a diff in the pull request:

          https://github.com/apache/incubator-tephra/pull/48#discussion_r136457929

          — Diff: tephra-core/src/test/java/org/apache/tephra/TransactionManagerTest.java —
          @@ -54,115 +54,35 @@ protected TransactionStateStorage getStateStorage()

          { return txStateStorage; }
          • @Before
          • public void before() {
            + @BeforeClass
            + public static void beforeClass() { + conf = getCommonConfiguration(); conf.setInt(TxConstants.Manager.CFG_TX_CLEANUP_INTERVAL, 0); // no cleanup thread - conf.setInt(TxConstants.Manager.CFG_TX_MAX_TIMEOUT, (int) TimeUnit.DAYS.toSeconds(5)); // very long limit // todo should create two sets of tests, one with LocalFileTxStateStorage and one with InMemoryTxStateStorage txStateStorage = new InMemoryTransactionStateStorage(); txManager = new TransactionManager (conf, txStateStorage, new TxMetricsCollector()); txManager.startAndWait(); }
          • @After
          • public void after() {
            + @AfterClass
            + public static void afterClass() { txManager.stopAndWait(); }
          • @Test
              • End diff –

          this test moved to TransactionSystemTest without change.

          Show
          githubbot ASF GitHub Bot added a comment - Github user anew commented on a diff in the pull request: https://github.com/apache/incubator-tephra/pull/48#discussion_r136457929 — Diff: tephra-core/src/test/java/org/apache/tephra/TransactionManagerTest.java — @@ -54,115 +54,35 @@ protected TransactionStateStorage getStateStorage() { return txStateStorage; } @Before public void before() { + @BeforeClass + public static void beforeClass() { + conf = getCommonConfiguration(); conf.setInt(TxConstants.Manager.CFG_TX_CLEANUP_INTERVAL, 0); // no cleanup thread - conf.setInt(TxConstants.Manager.CFG_TX_MAX_TIMEOUT, (int) TimeUnit.DAYS.toSeconds(5)); // very long limit // todo should create two sets of tests, one with LocalFileTxStateStorage and one with InMemoryTxStateStorage txStateStorage = new InMemoryTransactionStateStorage(); txManager = new TransactionManager (conf, txStateStorage, new TxMetricsCollector()); txManager.startAndWait(); } @After public void after() { + @AfterClass + public static void afterClass() { txManager.stopAndWait(); } @Test End diff – this test moved to TransactionSystemTest without change.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user anew opened a pull request:

          https://github.com/apache/incubator-tephra/pull/48

          TEPHRA-241 Introduce a way to limit the size of a transaction

          • first commit introduces four new configurations for change set size limits, enforces these limits in a new method canCommitOrThrow() (we keep canCommit() for compatibility), and uses the new method throughout the higher level classes and tests
          • second commit refactors the transaction system test cases: Don't start a new TxManager for each test method, have a shared common configuration across subclasses, move the checkpoint test to the base class such that it also gets tested with Thrift
          • third commit adds a new test case for size limit validation

          Thrift changes are backward-compatible (we keep the existing methods the same), and so are the TransactionSystemClient changes.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/anew/incubator-tephra tephra-241

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/incubator-tephra/pull/48.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #48


          commit 7227bcef076ccf762b78bf299003c373e7e73814
          Author: anew <anew@apache.org>
          Date: 2017-08-31T21:43:32Z

          add new config properties and a new method canCommitOrThrow() that validates the size of change sets

          commit 4ad0f924c34ad4aea35443c8cd10bd81cf24eeb8
          Author: anew <anew@apache.org>
          Date: 2017-08-31T21:43:56Z

          refactor transaction system test

          commit f3a1e786ae62f6bf05043c6fd770338361e615bc
          Author: anew <anew@apache.org>
          Date: 2017-08-31T21:44:32Z

          add new test for change set size validation


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user anew opened a pull request: https://github.com/apache/incubator-tephra/pull/48 TEPHRA-241 Introduce a way to limit the size of a transaction first commit introduces four new configurations for change set size limits, enforces these limits in a new method canCommitOrThrow() (we keep canCommit() for compatibility), and uses the new method throughout the higher level classes and tests second commit refactors the transaction system test cases: Don't start a new TxManager for each test method, have a shared common configuration across subclasses, move the checkpoint test to the base class such that it also gets tested with Thrift third commit adds a new test case for size limit validation Thrift changes are backward-compatible (we keep the existing methods the same), and so are the TransactionSystemClient changes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anew/incubator-tephra tephra-241 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-tephra/pull/48.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #48 commit 7227bcef076ccf762b78bf299003c373e7e73814 Author: anew <anew@apache.org> Date: 2017-08-31T21:43:32Z add new config properties and a new method canCommitOrThrow() that validates the size of change sets commit 4ad0f924c34ad4aea35443c8cd10bd81cf24eeb8 Author: anew <anew@apache.org> Date: 2017-08-31T21:43:56Z refactor transaction system test commit f3a1e786ae62f6bf05043c6fd770338361e615bc Author: anew <anew@apache.org> Date: 2017-08-31T21:44:32Z add new test for change set size validation
          Hide
          anew Andreas Neumann added a comment -

          Also, the client can check the size of the change set and avoid sending it over the wire if it exceeds the max allowed size.

          Show
          anew Andreas Neumann added a comment - Also, the client can check the size of the change set and avoid sending it over the wire if it exceeds the max allowed size.

            People

            • Assignee:
              anew Andreas Neumann
              Reporter:
              anew Andreas Neumann
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development