  1. Beam
  2. BEAM-2955

Create a Cloud Bigtable HBase connector

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: P4
    • Resolution: Abandoned
    • None
    • Not applicable
    • Component/s: io-java-gcp
    • None

    Description

      The Cloud Bigtable (CBT) team has maintained a Dataflow connector in a separate repository for a while. Recently, we reworked the Cloud Bigtable client so that it coexists better with the Beam ecosystem, and we released a Beam connector in our repository that exposes HBase idioms rather than the Protobuf idioms of BigtableIO. More information about the customer experience of the HBase connector is available at https://cloud.google.com/bigtable/docs/dataflow-hbase.

      The Beam repo is a much better place to house a Cloud Bigtable HBase connector. There are three ways we could implement this new connector:

      1. The CBT connector depends on artifacts in the io/hbase Maven project. We can extend HBaseIO for CBT. That would require adding some features to HBaseIO (dynamic rebalancing, and a way for the HBase and CBT size estimation models to coexist).
      2. The BigtableIO connector works well, and we can add an adapter layer on top of it. I have a proof of concept of it here: https://github.com/sduskis/cloud-bigtable-client/tree/add_beam/bigtable-dataflow-parent/bigtable-hbase-beam.
      3. We can build a separate CBT HBase connector.
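
To make option 2 more concrete, here is a minimal sketch of what an adapter layer might look like: it translates an HBase-style write into the Protobuf-style cell mutations that a connector such as BigtableIO works with. Every type here (`HBaseStylePut`, `PutCell`, `ProtoStyleSetCell`, `AdapterSketch`) is a hypothetical stand-in written for illustration; none of them is the real HBase, Bigtable, or Beam API, and the linked proof of concept may be structured quite differently. The sketch assumes two differences between the models: HBase expresses column families as byte arrays and timestamps in milliseconds, while Bigtable mutations use a string family name and microsecond timestamps.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; these are NOT the real HBase or Bigtable classes.
final class AdapterSketch {

    /** One cell of an HBase-style write: byte[] family, millisecond timestamp. */
    static final class PutCell {
        final byte[] family;
        final byte[] qualifier;
        final long timestampMillis;
        final byte[] value;

        PutCell(byte[] family, byte[] qualifier, long timestampMillis, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Stand-in for an HBase-style Put: a row key plus the cells to write. */
    static final class HBaseStylePut {
        final byte[] row;
        final List<PutCell> cells = new ArrayList<>();

        HBaseStylePut(byte[] row) {
            this.row = row;
        }

        HBaseStylePut addColumn(byte[] family, byte[] qualifier, long timestampMillis, byte[] value) {
            cells.add(new PutCell(family, qualifier, timestampMillis, value));
            return this;
        }
    }

    /** Stand-in for a Protobuf-style cell mutation: String family, microsecond timestamp. */
    static final class ProtoStyleSetCell {
        final String familyName;
        final byte[] qualifier;
        final long timestampMicros;
        final byte[] value;

        ProtoStyleSetCell(String familyName, byte[] qualifier, long timestampMicros, byte[] value) {
            this.familyName = familyName;
            this.qualifier = qualifier;
            this.timestampMicros = timestampMicros;
            this.value = value;
        }
    }

    /** The adapter step: one Protobuf-style mutation per HBase-style cell. */
    static List<ProtoStyleSetCell> adapt(HBaseStylePut put) {
        List<ProtoStyleSetCell> mutations = new ArrayList<>();
        for (PutCell cell : put.cells) {
            mutations.add(new ProtoStyleSetCell(
                new String(cell.family, StandardCharsets.UTF_8), // byte[] family -> String name
                cell.qualifier,
                cell.timestampMillis * 1000L,                    // milliseconds -> microseconds
                cell.value));
        }
        return mutations;
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        HBaseStylePut put = new HBaseStylePut(bytes("row-1"))
            .addColumn(bytes("cf"), bytes("col"), 1500L, bytes("v"));
        for (ProtoStyleSetCell m : adapt(put)) {
            System.out.println(m.familyName + " @ " + m.timestampMicros);
        }
    }
}
```

In a real connector, a translation of this kind would presumably run inside a transform placed in front of the existing sink, so that pipeline authors only see the HBase-style types.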

      I'm happy to do the work. I would appreciate some guidance and discussion about the right approach.

          People

            Assignee: Unassigned
            Reporter: Solomon Duskis (sduskis)
            Votes: 0
            Watchers: 4
