  1. Apache Hudi
  2. HUDI-2438

[Umbrella] [RFC-34] Implement BigQuerySyncTool for BigQuery Sync

Details

    • 0
    • Hudi-BigQuery

    Description

      BigQuery is Google Cloud's fully managed, petabyte-scale, cost-effective analytics data warehouse that lets you run analytics over vast amounts of data in near real time. BigQuery does not currently support the Apache Hudi file format, but it does support the Parquet file format. The proposal is to implement a BigQuerySync, similar to HiveSync, that syncs a Hudi table as a BigQuery external Parquet table so that users can query Hudi tables from BigQuery. Uber already syncs some of its Hudi tables to a BigQuery data mart; this tool will help them write, sync, and query those tables.

       

      More details are in RFC-34: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=188745980
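
      As a rough illustration of the idea, and not the design specified in RFC-34, the sketch below builds the kind of BigQuery DDL statement that exposes Parquet files in Cloud Storage as an external table. The function name, project, dataset, table, and bucket path are all hypothetical. A plain wildcard over every Parquet file can pick up several versions of the same Hudi file group, so a real sync tool has to restrict the table to the latest snapshot; this sketch does not attempt that.

```python
def build_external_table_ddl(project: str, dataset: str, table: str, source_uri: str) -> str:
    """Return BigQuery DDL defining an external table over Parquet files.

    Hypothetical helper for illustration only; it is not part of Hudi.
    """
    # BigQuery external tables over files read from Cloud Storage URIs.
    if not source_uri.startswith("gs://"):
        raise ValueError("source_uri must be a Cloud Storage URI (gs://...)")
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{project}.{dataset}.{table}`\n"
        "OPTIONS (\n"
        "  format = 'PARQUET',\n"
        f"  uris = ['{source_uri}']\n"
        ")"
    )


if __name__ == "__main__":
    # All identifiers below are placeholders, not values from the issue.
    print(
        build_external_table_ddl(
            "my-project", "my_dataset", "trips", "gs://my-bucket/hudi/trips/*.parquet"
        )
    )
```

      The resulting statement could then be submitted through any BigQuery client; how the sync tool actually registers and refreshes the table is left to the RFC.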

            People

              vino Vinoth Govindarajan
              vino Vinoth Govindarajan
              Raymond Xu, Vinoth Chandar
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: