From:                              stack <saint.ack@gmail.com>

Sent:                               Saturday, 28 June 2014 3:03 p.m.

To:                                   Andreas Neumann

Cc:                                   Schauble, Rob (SeaQuest R&D); Ding, Hong; Mishra, Atanu; Gary Helmling; Kakarlamudi, Rao; Sharma, Anoop; Ted Yu; Jeffrey Zhong; DeRoo, John; enis@apache.org; Goyal, Narendra (SeaQuest R&D); Birdsall, Dave; James Taylor; Jain, Rohit (Trafodion); Zeller, Hans

Subject:                          Re: Proposal for a generic transaction API for HBase

 

Lads.  The back and forth that is going on here is of too high a quality to keep private. Good stuff

On Jun 27, 2014 5:18 PM, "Andreas Neumann" <andreas@continuuity.com> wrote:

[I think you accidentally dropped other folks, adding them back]

 

I don’t mind if there is a transactional table (say THTable)  that extends HTable, but rather than changing the signature of every method, the Transaction could be set by an additional method. So I could do:

 

table.beginTransaction(tx);
table.put(put);

 

That means a THTable can only be used with a single transaction at a time, so concurrent transactions would have to have their own THTables. But the advantage is that no existing code that already uses HTable will break; it can simply be applied to a THTable. Now I could do this:

 

THTable table;
table.beginTransaction();
// call a legacy code method that expects HTable and does not know about tx
legacyMethod(table, …);

try {
  table.commitTransaction();
} catch (…)
...

 

I think that is a cleaner API, decoupling the “transaction” feature from the other features of the API.
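
The pattern described above could look roughly like the following. This is only a sketch under stated assumptions: PlainTable (standing in for HTable), Transaction, abortTransaction() and the in-memory write buffer are all hypothetical and are not the real HBase client API. It shows that a legacy method written against the plain table type keeps working when it is handed a table that has a transaction bound to it.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: hypothetical stand-ins, not the real HBase client classes. */
final class ThTableSketch {

    /** Stand-in for the existing, transaction-unaware table class (HTable in the thread). */
    static class PlainTable {
        protected final Map<String, String> rows = new HashMap<>();

        void put(String row, String value) { rows.put(row, value); }

        String get(String row) { return rows.get(row); }
    }

    /** Hypothetical transaction handle; a real one would come from a transaction manager. */
    static final class Transaction {
        final long id;

        Transaction(long id) { this.id = id; }
    }

    /** Transactional table: same put/get signatures, transaction bound by an extra method. */
    static class THTable extends PlainTable {
        private Transaction current;                                   // at most one transaction at a time
        private final Map<String, String> pending = new LinkedHashMap<>();

        void beginTransaction(Transaction tx) {
            if (current != null) {
                throw new IllegalStateException("already bound to transaction " + current.id);
            }
            current = tx;
        }

        @Override
        void put(String row, String value) {
            if (current == null) {
                super.put(row, value);                                 // no transaction: plain behaviour
            } else {
                pending.put(row, value);                               // buffered until commit
            }
        }

        @Override
        String get(String row) {
            if (current != null && pending.containsKey(row)) {
                return pending.get(row);                               // a transaction sees its own writes
            }
            return super.get(row);
        }

        void commitTransaction() {
            requireTransaction();
            rows.putAll(pending);
            endTransaction();
        }

        void abortTransaction() {
            requireTransaction();
            endTransaction();
        }

        private void requireTransaction() {
            if (current == null) {
                throw new IllegalStateException("no transaction in progress");
            }
        }

        private void endTransaction() {
            pending.clear();
            current = null;
        }
    }

    /** Legacy code that only knows the plain table type. */
    static void legacyMethod(PlainTable table, String row, String value) {
        table.put(row, value);
    }

    public static void main(String[] args) {
        THTable table = new THTable();
        table.beginTransaction(new Transaction(1L));
        legacyMethod(table, "row1", "v1");                             // writes through the bound transaction
        table.commitTransaction();
        System.out.println(table.get("row1"));
    }
}
```

The single `current` field makes the limitation mentioned above explicit: concurrent transactions each need their own table instance.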

 

Again, let’s put this onto the public list soon.

 

-Andreas.

 

 

 

 

On Jun 27, 2014, at 3:57 AM, Ted Yu <yuzhihong@gmail.com> wrote:



Andreas:

If the HTable API isn't changed, how do users specify the transaction with which a read / write is associated?

 

Cheers


On Jun 26, 2014, at 6:47 PM, Andreas Neumann <andreas@continuuity.com> wrote:

Hey John. 

 

this is a good start for the discussion. I have a few high-level comments:

  • My feeling is that transactions should be a feature that is enabled or disabled for a table, similar to TTL or security, configured through table properties. That should not change the Table API - it should be transparent to a client whether its operations are on a transactional or non-transactional table. Would it be possible to do this without introducing TransactionalTable?
  • I am a little confused by “HeuristicMixedException”. Is it transactionally correct to roll back some operations and commit others, using some heuristic? This is something that a particular implementation may do, but it seems inappropriate in a generic API.
  • I am not sure I understand the concept of suspend() and resume(). Does this mean that - similar to JTA - a given transaction cannot be used by multiple clients at the same time, but there is always a current client associated with the tx, which needs to suspend, serialize and send it over to another client, which can then deserialize and resume it? That appears to be an implementation-specific limitation that should not be reflected in this API.
  • The name TransactionManager is misleading to me. In my opinion, this is more of a TransactionSystemClient. Its API is fixed whereas different systems can implement it in different ways, for example with a single TransactionManagerService, or a distributed tx manager ensemble that runs in every client and finds consensus somehow. 
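
The last point, a fixed client-side API with interchangeable backends, can be sketched as follows. All names here (TransactionSystemClient, InProcessClient, runInTransaction) are assumptions made for illustration; the proposal under discussion may define different methods. The in-process implementation stands in for whatever a real system does, such as calling a central service or a distributed ensemble.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch only: illustrative names, not an agreed API. */
final class TxClientSketch {

    /** Hypothetical transaction handle: just an id here. */
    static final class Transaction {
        final long id;

        Transaction(long id) { this.id = id; }
    }

    /** The fixed, client-facing API; implementations may differ freely behind it. */
    interface TransactionSystemClient {
        Transaction begin();

        void commit(Transaction tx);

        void abort(Transaction tx);
    }

    /** One possible implementation: all state kept in the client process. */
    static final class InProcessClient implements TransactionSystemClient {
        private final AtomicLong nextId = new AtomicLong(1);
        private final Set<Long> active = ConcurrentHashMap.newKeySet();

        @Override
        public Transaction begin() {
            long id = nextId.getAndIncrement();
            active.add(id);
            return new Transaction(id);
        }

        @Override
        public void commit(Transaction tx) { finish(tx); }

        @Override
        public void abort(Transaction tx) { finish(tx); }

        boolean isActive(Transaction tx) { return active.contains(tx.id); }

        private void finish(Transaction tx) {
            if (!active.remove(tx.id)) {
                throw new IllegalStateException("unknown or already finished transaction " + tx.id);
            }
        }
    }

    /** Application code depends only on the interface, never on a concrete backend. */
    static boolean runInTransaction(TransactionSystemClient client, Runnable work) {
        Transaction tx = client.begin();
        try {
            work.run();
        } catch (RuntimeException e) {
            client.abort(tx);                    // work failed: give up the transaction
            return false;
        }
        client.commit(tx);
        return true;
    }

    public static void main(String[] args) {
        TransactionSystemClient client = new InProcessClient();
        boolean committed = runInTransaction(client, () -> System.out.println("doing work"));
        System.out.println("committed: " + committed);
    }
}
```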

 

Oh, and I second Stack’s +1 to move this discussion to the public list. 

 

Best Regards 

-Andreas.

 

On Jun 26, 2014, at 4:16 PM, stack <saint.ack@gmail.com> wrote:



+1 on a note to dev list w/ pointer to JIRA w/ design attached.

St.Ack

 

On Thu, Jun 26, 2014 at 4:10 PM, Jeffrey Zhong <jzhong@hortonworks.com> wrote:

Hey John,

 

A very good proposal. I think you should create an HBase JIRA and attach the proposal to it so more people can provide feedback. I have several comments for your consideration:

  1. My understanding is that once a table has been included inside a transaction, reads/writes on the table have to go through transactions; otherwise data integrity isn't guaranteed. Therefore, it's better that we define the semantics of what happens when TransactionTable is used against a normal HBase table.
  2. For "public static Transaction resume(Byte[] transactionString)", it's better to rename it to constructFrom, as it's a deserialization call mirroring streamTo(). Move the function "public static void resume(final Transaction transaction);" to the Transaction interface, because we already define its counterpart "suspend()" there.
  3. In TransactionManager, I'd like to see an iterator function that lists all transactions the current TransactionManager knows of.
  4. We need a transaction manager factory, per your question, as we want people to be able to choose their preferred implementation.
  5. I'd suggest including some pseudocode snippets showing how to use the proposed interfaces.
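
Points 2 to 4 could fit together roughly as in the sketch below. It is not the proposed API: the 8-byte id encoding, the use of byte[] rather than Byte[], the Iterable-based listing, the registry-style factory and the class names are all assumptions made to keep the example self-contained.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch only: illustrative shapes for points 2 to 4, not the proposed API. */
final class TxManagerSketch {

    /** Hypothetical transaction that serializes to its 8-byte id. */
    static final class Transaction {
        final long id;

        Transaction(long id) { this.id = id; }

        /** Serialize so the transaction can be handed to another client. */
        byte[] streamTo() {
            return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
        }

        /** Deserialization counterpart of streamTo(). */
        static Transaction constructFrom(byte[] bytes) {
            if (bytes.length != Long.BYTES) {
                throw new IllegalArgumentException("expected " + Long.BYTES + " bytes");
            }
            return new Transaction(ByteBuffer.wrap(bytes).getLong());
        }
    }

    /** Iterating over the manager lists the transactions it currently knows of. */
    interface TransactionManager extends Iterable<Transaction> {
        Transaction begin();

        void commit(Transaction tx);
    }

    static final class InMemoryTransactionManager implements TransactionManager {
        private final Map<Long, Transaction> active = new LinkedHashMap<>();
        private long nextId = 1;

        @Override
        public Transaction begin() {
            Transaction tx = new Transaction(nextId++);
            active.put(tx.id, tx);
            return tx;
        }

        @Override
        public void commit(Transaction tx) { active.remove(tx.id); }

        @Override
        public Iterator<Transaction> iterator() {
            // Iterate over a copy so callers cannot disturb the manager's own state.
            return new ArrayList<>(active.values()).iterator();
        }
    }

    /** Factory so that users can choose their preferred implementation by name. */
    static final class TransactionManagerFactory {
        private final Map<String, Supplier<TransactionManager>> registry = new HashMap<>();

        void register(String name, Supplier<TransactionManager> supplier) {
            registry.put(name, supplier);
        }

        TransactionManager create(String name) {
            Supplier<TransactionManager> supplier = registry.get(name);
            if (supplier == null) {
                throw new IllegalArgumentException("no transaction manager registered as " + name);
            }
            return supplier.get();
        }
    }

    public static void main(String[] args) {
        TransactionManagerFactory factory = new TransactionManagerFactory();
        factory.register("in-memory", InMemoryTransactionManager::new);
        TransactionManager tm = factory.create("in-memory");

        Transaction first = tm.begin();
        tm.begin();
        // Round-trip through bytes, as another client would after receiving the transaction.
        Transaction copy = Transaction.constructFrom(first.streamTo());
        tm.commit(copy);

        for (Transaction tx : tm) {
            System.out.println("still active: " + tx.id);
        }
    }
}
```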

Thanks,

-Jeffrey

 

From: Ted Yu <yuzhihong@gmail.com>
Date: Thursday, June 26, 2014 3:11 AM
To: "DeRoo, John" <john.deroo@hp.com>
Cc: James Taylor <jtaylor@salesforce.com>, Andreas Neumann <andreas@continuuity.com>, Gary Helmling <gary@continuuity.com>, Jeffrey Zhong <jzhong@hortonworks.com>, "enis@apache.org" <enis@apache.org>, "Zeller, Hans" <hans.zeller@hp.com>, "Sharma, Anoop" <anoop.sharma@hp.com>, "Goyal, Narendra (SeaQuest R&D)" <Narendra.Goyal@hp.com>, "Jain, Rohit (SeaQuest)" <rohit.jain@hp.com>, "Kakarlamudi, Rao" <rao.kakarlamudi@hp.com>, "Birdsall, Dave" <dave.birdsall@hp.com>, "Mishra, Atanu" <atanu.mishra@hp.com>, "Ding, Hong" <hong.ding@hp.com>, "Schauble, Rob (SeaQuest R&D)" <rob.schauble@hp.com>
Subject: Re: Proposal for a generic transaction API for HBase

 

Specifying isolation level as parameter to TransactionManager.begin() is fine. 
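
A minimal sketch of what that could look like, assuming a hypothetical IsolationLevel enum and a manager-wide default. Neither the level names nor the no-argument overload come from the proposal; they are placeholders.

```java
import java.util.Objects;

/** Sketch only: the enum values and the default-level overload are assumptions. */
final class IsolationSketch {

    /** Placeholder levels; a real API would have to agree on which ones to expose. */
    enum IsolationLevel { READ_COMMITTED, SNAPSHOT, SERIALIZABLE }

    static final class Transaction {
        final long id;
        final IsolationLevel isolation;

        Transaction(long id, IsolationLevel isolation) {
            this.id = id;
            this.isolation = isolation;
        }
    }

    static final class TransactionManager {
        private final IsolationLevel defaultLevel;
        private long nextId = 1;

        TransactionManager(IsolationLevel defaultLevel) {
            this.defaultLevel = Objects.requireNonNull(defaultLevel, "defaultLevel");
        }

        /** Uses the manager-wide default level. */
        Transaction begin() {
            return begin(defaultLevel);
        }

        /** The isolation level is fixed per transaction at begin time. */
        Transaction begin(IsolationLevel level) {
            return new Transaction(nextId++, Objects.requireNonNull(level, "level"));
        }
    }

    public static void main(String[] args) {
        TransactionManager tm = new TransactionManager(IsolationLevel.SNAPSHOT);
        Transaction t1 = tm.begin();
        Transaction t2 = tm.begin(IsolationLevel.SERIALIZABLE);
        System.out.println(t1.isolation + " " + t2.isolation);
    }
}
```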

 

W.r.t. specifying the Transaction instance in TransactionTable, would having a setTransaction(Transaction t) method make sense?

This way, each HTable method would keep the same signature.

 

Cheers


On Jun 25, 2014, at 10:30 PM, "DeRoo, John" <john.deroo@hp.com> wrote:

Thanks Ted.

 

Good catch on batch.  I’ll add it.

 

Isolation level is a good point too.  Any suggestions here?  It could be added to the TransactionManager interface and possibly TransactionManager.begin as a parameter.  In SQL I think it’s set globally or per session and applies to transactions rather than tables.   We have thought about it from an implementation viewpoint, but hadn’t thought about how it gets externalized or set.

 

Regards, John.

 

From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Thursday, 26 June 2014 3:59 p.m.
To: DeRoo, John
Cc: James Taylor; Andreas Neumann; Gary Helmling; Jeffrey Zhong; enis@apache.org; Zeller, Hans; Sharma, Anoop; Goyal, Narendra (SeaQuest R&D); Jain, Rohit (SeaQuest); Kakarlamudi, Rao; Birdsall, Dave; Mishra, Atanu; Ding, Hong; Schauble, Rob (SeaQuest R&D)
Subject: Re: Proposal for a generic transaction API for HBase

 

I don't see isolation level being discussed in the document.

Is this intended ?

 

Should a new module, such as hbase-trx, be introduced to host the classes / exceptions proposed here ?

 

For TransactionTable, some methods were not included, e.g.:

  public void batch(final List<? extends Row> actions, final Object[] results)

Should batch() with Transaction as first parameter be added to TransactionTable ?
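
For illustration, such an overload might take the shape below. Row, Put, Transaction and TransactionTable here are simplified hypothetical stand-ins (the real HBase Row interface differs), and recording the touched row keys on the transaction is only a placeholder for whatever an implementation would actually do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: simplified stand-ins, not the real HBase Row or table classes. */
final class BatchSketch {

    /** Hypothetical minimal row action. */
    interface Row {
        String rowKey();
    }

    static final class Put implements Row {
        private final String rowKey;

        Put(String rowKey) { this.rowKey = rowKey; }

        @Override
        public String rowKey() { return rowKey; }
    }

    /** Hypothetical transaction that remembers which rows it touched. */
    static final class Transaction {
        final long id;
        final List<String> touchedRows = new ArrayList<>();

        Transaction(long id) { this.id = id; }
    }

    static final class TransactionTable {
        /** Same shape as the existing batch(), with the transaction as first parameter. */
        void batch(Transaction tx, List<? extends Row> actions, Object[] results) {
            if (results.length != actions.size()) {
                throw new IllegalArgumentException("results must have one slot per action");
            }
            for (int i = 0; i < actions.size(); i++) {
                tx.touchedRows.add(actions.get(i).rowKey());   // associate the action with tx
                results[i] = Boolean.TRUE;                     // placeholder per-action result
            }
        }
    }

    public static void main(String[] args) {
        Transaction tx = new Transaction(1L);
        List<Put> puts = Arrays.asList(new Put("r1"), new Put("r2"));
        Object[] results = new Object[puts.size()];
        new TransactionTable().batch(tx, puts, results);
        System.out.println(tx.touchedRows);
    }
}
```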

Will take deeper look tomorrow.

Cheers

 

 

On Wed, Jun 25, 2014 at 8:18 PM, DeRoo, John <john.deroo@hp.com> wrote:

Hi Folks,

 

As we discussed at the meeting a couple of weeks ago, here is a draft proposal for a common/generic transactional interface for HBase.  I’ve tried to keep it simple and implementation independent.  I’ve based it on JTA, so it also looks quite a lot like the HBase-trx client interface, which was also based on JTA.  Please let me know what you think, whether it works in your context (or doesn’t!), what you’d like changed and so on.  We want this to address transactional needs for as wide a range of HBase applications as possible and have good compatibility between Transaction Managers which support the API.  Keeping in mind that we want to present this at the meetup on July 17, it would be great if you could get comments back to me as soon as possible.

 

Many thanks, John.

 

