[GSoC][Doris]Page Cache Improvement

Details

    Description

      Apache Doris
      Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
      Page: https://doris.apache.org

      Github: https://github.com/apache/doris

      Background

      Apache Doris accelerates high-concurrency queries with a page cache, which stores decompressed data.
      The page cache currently uses a simple LRU algorithm, which has a few problems:

      • Large queries can evict hot data from the cache.
      • The page cache configuration is immutable, and the cache does not support GC.
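
      The first problem can be reproduced with a toy model. The sketch below is not Doris code: `LRUCache`, `run_demo`, the capacity of 4 and the page names are all hypothetical, chosen only to show how a single large scan can push frequently read pages out of a plain LRU cache.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache (hypothetical, not the Doris implementation)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest entry first, newest last

    def get(self, page_id):
        if page_id not in self.pages:
            return None
        # A hit makes the page the most recently used one.
        self.pages.move_to_end(page_id)
        return self.pages[page_id]

    def put(self, page_id, data):
        self.pages[page_id] = data
        self.pages.move_to_end(page_id)
        if len(self.pages) > self.capacity:
            # Evict the least recently used page.
            self.pages.popitem(last=False)


def hot_pages_cached(cache, hot_ids):
    """Count how many of the hot pages are still in the cache."""
    return sum(1 for pid in hot_ids if pid in cache.pages)


def run_demo():
    cache = LRUCache(capacity=4)
    hot_ids = ["hot-0", "hot-1", "hot-2"]

    # Load the hot pages and read them repeatedly.
    for pid in hot_ids:
        cache.put(pid, b"decompressed page")
    for _ in range(3):
        for pid in hot_ids:
            cache.get(pid)
    before = hot_pages_cached(cache, hot_ids)

    # One large query touches ten pages exactly once each.
    for i in range(10):
        cache.put(f"scan-{i}", b"decompressed page")
    after = hot_pages_cached(cache, hot_ids)
    return before, after


if __name__ == "__main__":
    print(run_demo())
```

      In this model every hot page is gone after the scan, even though the scanned pages were each read only once.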

      Task

      • Phase One: Measure the impact on queries when the decompressed data is stored in memory and on SSD, respectively, and then determine whether a full page cache is required.
      • Phase Two: Improve the cache strategy for Apache Doris based on the results from Phase One.
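
      The issue does not say which strategy Phase Two should adopt. As one illustration of what a scan-resistant alternative to simple LRU could look like, here is a minimal sketch of a segmented LRU, in which a page must be read again after insertion before it enters a protected segment. The class name, method names and segment sizes are assumptions made for this example and are not taken from Doris.

```python
from collections import OrderedDict


class SegmentedLRUCache:
    """Toy segmented LRU (hypothetical, one possible direction only)."""

    def __init__(self, probation_capacity, protected_capacity):
        self.probation_capacity = probation_capacity
        self.protected_capacity = protected_capacity
        self.probation = OrderedDict()  # pages seen once since insertion
        self.protected = OrderedDict()  # pages that were read again

    def get(self, page_id):
        if page_id in self.protected:
            self.protected.move_to_end(page_id)
            return self.protected[page_id]
        if page_id in self.probation:
            # A hit on a probationary page promotes it.
            data = self.probation.pop(page_id)
            self._promote(page_id, data)
            return data
        return None

    def put(self, page_id, data):
        if page_id in self.protected:
            self.protected[page_id] = data
            self.protected.move_to_end(page_id)
            return
        # New pages always start in the probation segment.
        self.probation[page_id] = data
        self.probation.move_to_end(page_id)
        if len(self.probation) > self.probation_capacity:
            self.probation.popitem(last=False)

    def _promote(self, page_id, data):
        self.protected[page_id] = data
        if len(self.protected) > self.protected_capacity:
            # Demote the coldest protected page back to probation.
            old_id, old_data = self.protected.popitem(last=False)
            self.probation[old_id] = old_data
            if len(self.probation) > self.probation_capacity:
                self.probation.popitem(last=False)

    def __contains__(self, page_id):
        return page_id in self.protected or page_id in self.probation


def run_slru_demo():
    cache = SegmentedLRUCache(probation_capacity=2, protected_capacity=3)
    hot_ids = ["hot-0", "hot-1", "hot-2"]
    for pid in hot_ids:
        cache.put(pid, b"decompressed page")
        cache.get(pid)  # second access promotes the page
    # One large query touches ten pages exactly once each.
    for i in range(10):
        cache.put(f"scan-{i}", b"decompressed page")
    return sum(1 for pid in hot_ids if pid in cache)


if __name__ == "__main__":
    print(run_slru_demo())
```

      In this sketch the scanned pages only cycle through the probation segment, so the hot pages survive the scan. Whether such a policy suits Doris would depend on the Phase One measurements.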

      Learning Material

      Page: https://doris.apache.org
      Github: https://github.com/apache/doris

      Mentor

          People

            Unassigned Unassigned
            luzhijing Zhijing Lu