  1. Comdev GSOC
  2. GSOC-121

[GSoC][Doris] Supports BigQuery/Apache Kudu/Apache Cassandra/Apache Druid in Federated Queries

Details

    Description

      Apache Doris
      Apache Doris is a real-time analytical database built on an MPP (massively parallel processing) architecture. As a unified platform for multiple data processing scenarios, it delivers high performance for both low-latency and high-throughput queries, makes federated queries on data lakes straightforward, and supports a variety of data ingestion methods.
      Page: https://doris.apache.org
      GitHub: https://github.com/apache/doris

      Background

      Apache Doris accelerates queries on external data sources to meet users' needs for federated querying and analysis.
      Currently, Apache Doris supports several types of external catalog, including Hive, Iceberg, Hudi, and JDBC. Developers can connect additional data sources to Apache Doris through a unified framework.

      Objective

      Task
      Phase One:

      • Get familiar with the Multi-Catalog structure of Apache Doris, including the metadata synchronization mechanism in FE and the data reading mechanism in BE.
      • Investigate how to obtain metadata from, and read data in, the chosen data source(s), and produce the corresponding design documentation.

      Phase Two:

      • Develop connections to the chosen data source(s) and implement access to both metadata and data.
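
      To make the split between metadata access and data access concrete, here is a minimal Java sketch. It is not taken from the Doris code base: every type and method name below (ExternalMetadataReader, ExternalDataReader, InMemorySource, FederatedCatalogDemo) is hypothetical, and the in-memory source only stands in for a real system such as Kudu or Cassandra. The actual FE and BE interfaces in Doris may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All names here are hypothetical illustrations, not Doris APIs.

/** Metadata side: enumerate databases, tables, and columns of an external source. */
interface ExternalMetadataReader {
    List<String> listDatabases();
    List<String> listTables(String database);
    List<String> listColumns(String database, String table);
}

/** Data side: read rows from an external table, projected to the requested columns. */
interface ExternalDataReader {
    List<List<Object>> scan(String database, String table, List<String> columns);
}

/** In-memory stand-in for a real external source, used only for illustration. */
final class InMemorySource implements ExternalMetadataReader, ExternalDataReader {
    // database -> table -> ordered column names
    private final Map<String, Map<String, List<String>>> schemas = new LinkedHashMap<>();
    // "database.table" -> stored rows
    private final Map<String, List<List<Object>>> rows = new LinkedHashMap<>();

    void createTable(String database, String table, List<String> columns) {
        schemas.computeIfAbsent(database, k -> new LinkedHashMap<>())
               .put(table, List.copyOf(columns));
        rows.put(database + "." + table, new ArrayList<>());
    }

    void insert(String database, String table, List<Object> row) {
        if (row.size() != listColumns(database, table).size()) {
            throw new IllegalArgumentException("row width does not match schema");
        }
        rows.get(database + "." + table).add(List.copyOf(row));
    }

    @Override
    public List<String> listDatabases() {
        return new ArrayList<>(schemas.keySet());
    }

    @Override
    public List<String> listTables(String database) {
        Map<String, List<String>> tables = schemas.get(database);
        if (tables == null) {
            throw new IllegalArgumentException("unknown database: " + database);
        }
        return new ArrayList<>(tables.keySet());
    }

    @Override
    public List<String> listColumns(String database, String table) {
        Map<String, List<String>> tables = schemas.get(database);
        if (tables == null || !tables.containsKey(table)) {
            throw new IllegalArgumentException("unknown table: " + database + "." + table);
        }
        return tables.get(table);
    }

    @Override
    public List<List<Object>> scan(String database, String table, List<String> columns) {
        List<String> schema = listColumns(database, table);
        // Resolve each requested column to its position in the stored rows.
        List<Integer> positions = new ArrayList<>();
        for (String column : columns) {
            int index = schema.indexOf(column);
            if (index < 0) {
                throw new IllegalArgumentException("unknown column: " + column);
            }
            positions.add(index);
        }
        List<List<Object>> result = new ArrayList<>();
        for (List<Object> row : rows.get(database + "." + table)) {
            List<Object> projected = new ArrayList<>();
            for (int index : positions) {
                projected.add(row.get(index));
            }
            result.add(projected);
        }
        return result;
    }
}

final class FederatedCatalogDemo {
    /** Discovers a table through the metadata interface, then reads two of its columns. */
    static List<List<Object>> run() {
        InMemorySource source = new InMemorySource();
        source.createTable("sales", "orders", List.of("id", "customer", "amount"));
        source.insert("sales", "orders", List.of(1, "alice", 30.5));
        source.insert("sales", "orders", List.of(2, "bob", 12.0));

        ExternalMetadataReader metadata = source;
        ExternalDataReader data = source;
        String database = metadata.listDatabases().get(0);
        String table = metadata.listTables(database).get(0);
        return data.scan(database, table, List.of("customer", "amount"));
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

      Keeping the two interfaces separate mirrors the task description, where metadata is synchronized in FE and data is read in BE. A real connector would likely need more than this sketch shows, for example type mapping, predicate pushdown, and metadata caching.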

      Learning Material

      Page: https://doris.apache.org
      GitHub: https://github.com/apache/doris

      Mentor

          People

            Assignee: Unassigned
            Reporter: Zhijing Lu (luzhijing)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: