  1. Tephra
  2. TEPHRA-232

Transaction metadata sent on each put is too big


    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0-incubating, 0.12.0-incubating
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
      HBase 1.2.0-cdh5.11
      CentOS 7.3
      4x machines
      Bandwidth between machines 1Gbps

      Description

      I've been testing Tephra 0.11.0 (and more recently 0.12.0) for a project that may need transactions on top of HBase, and I find its performance very poor, for instance for a bulk load. Let's not discuss why I am doing a bulk load with transactions.

      In my use case I am generating batches of ~10000 elements and inserting them with the put(List<Put> puts) method. There are no concurrent writers or readers.
      If I do the put without transactions it takes ~0.5s. If I use the TransactionAwareHTable it takes ~12s.
      In both cases the network bandwidth is fully utilised.

      I've tracked down the performance killer to the addToOperation(OperationWithAttributes op, Transaction tx) method of TransactionAwareHTable.

      I created a TransactionAwareHTableFix with the addToOperation(txPut, tx) call commented out and used it in my code; each batch then took ~0.5s.

      Then I checked what addToOperation does and verified that the problem lies in the serialization of the Transaction object. The serialized Transaction object is 104171 bytes long. Since this happens for each put, my batch of ~10000 elements carries ~970MB of serialized transactions, which explains why a batch takes ~12s instead of ~0.5s while the network is saturated.
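      The arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope calculation using the figures reported above (104171 bytes per put, ~10000 puts, a 1Gbps link); the class and method names are hypothetical, and it ignores the Put payloads themselves as well as RPC and protocol overhead.

```java
import java.util.Locale;

// Hypothetical helper, not part of Tephra: estimates the cost of shipping
// one serialized Transaction with every Put in a batch.
public class TxOverheadEstimate {

    // Total bytes of serialized transaction metadata for one batch.
    static long totalBytes(long bytesPerPut, long puts) {
        return bytesPerPut * puts;
    }

    // Seconds needed to push that many bytes over a link,
    // ignoring protocol overhead and the Put payloads.
    static double transferSeconds(long bytes, double linkBitsPerSecond) {
        return (bytes * 8.0) / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long perPut = 104_171L;   // serialized Transaction size reported in this issue
        long puts = 10_000L;      // approximate batch size reported in this issue
        double link = 1e9;        // 1 Gbps, from the Environment section

        long total = totalBytes(perPut, puts);
        System.out.printf(Locale.ROOT, "metadata per batch: %d bytes%n", total);
        System.out.printf(Locale.ROOT, "lower bound on transfer time: %.1f s%n",
                transferSeconds(total, link));
    }
}
```

      For exactly 10000 puts this gives about 1.04 GB of metadata and a lower bound of roughly 8.3s on a 1Gbps link, which is in the same range as the ~12s observed per batch.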

      It seems that the transaction metadata, despite being sent to HBase, is not stored, so the final table size is the same with or without transactions.

      Is it expected that this metadata is encoded and sent with every put? This makes Tephra unusable, at least with only 1Gbps of bandwidth.


              People

              • Assignee:
                poorna Poorna Chandra
                Reporter:
                capitao Micael Capitão
              • Votes:
                0
                Watchers:
                2
