Comdev GSOC / GSOC-270

[GSoC][HugeGraph] Support Memory Management Module

Details

    Description

      Apache HugeGraph (incubating) is a fast, highly scalable graph database. Thanks to its strong OLTP capability, it can store and query billions of vertices and edges.
       

      Description

      When the JVM performs heavy garbage collection, request latency tends to rise and response times become unpredictable. To reduce request latency and response-time jitter, the hugegraph-server graph query engine already uses off-heap memory in most OLTP algorithms.
       
      However, HugeGraph currently cannot limit memory on a per-query basis, so a single query may exhaust the memory of the entire process and cause an OOM error, or even leave the service unable to respond to other requests. To solve this problem, we can implement a memory management module that works at the granularity of a single query. Applicants will work with community developers to complete this task; the specific implementation plan, division of labor, and priorities can be adjusted as needed.
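
      The per-query limit described above can be illustrated with a small sketch. It is not HugeGraph code: the class `QueryMemoryBudget` and its methods are hypothetical names chosen for illustration. It only does the accounting, so that a query that asks for too much gets a refusal instead of taking down the whole process.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical per-query memory budget (illustrative only, not a HugeGraph API).
 * It tracks how many bytes one query has reserved and refuses reservations
 * that would push the query past its limit.
 */
final class QueryMemoryBudget {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    QueryMemoryBudget(long limitBytes) {
        if (limitBytes < 0) {
            throw new IllegalArgumentException("limitBytes must be >= 0");
        }
        this.limitBytes = limitBytes;
    }

    /** Atomically reserves bytes; returns false if the limit would be exceeded. */
    boolean tryReserve(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be >= 0");
        }
        while (true) {
            long current = usedBytes.get();
            // Compare against the remaining space to avoid long overflow.
            if (bytes > limitBytes - current) {
                return false;
            }
            if (usedBytes.compareAndSet(current, current + bytes)) {
                return true;
            }
            // Another thread changed the counter; retry with the new value.
        }
    }

    /** Returns previously reserved bytes to the budget. */
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}
```

      With a budget of 100 bytes, reserving 60 succeeds and a further 50 is refused; the caller can then abort that one query while other requests keep running. The compare-and-set loop is one possible way to keep the counter correct when several worker threads serve the same query.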
       
      Overall, the work can be divided into three modules:

      1. Memory management implementation module. Implement life-cycle management of memory objects, memory capacity limits, and related functions, and provide the Allocator interface (including allocation and release methods). This is a relatively independent module.
      2. Integrate the Allocator module into the HugeGraph context and provide a unified interface for migrating existing code to managed memory.
      3. Rework the places that occupy a large amount of memory so that they use the Allocator for object allocation and release.
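
      As a rough illustration of the first module, an allocator with allocation, release, a capacity limit, and a life cycle could look like the sketch below. All names and signatures here (`Allocator`, `LimitedDirectAllocator`, `usedBytes`) are assumptions made for this example; the real interface is left to the project design. The sketch also only does accounting: standard Java offers no public call to free a direct buffer eagerly, so the native memory is reclaimed once the buffer becomes unreachable.

```java
import java.nio.ByteBuffer;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical allocator interface; the names are illustrative, not HugeGraph's API. */
interface Allocator extends AutoCloseable {
    ByteBuffer allocate(int bytes);

    void release(ByteBuffer buffer);

    long usedBytes();

    @Override
    void close();
}

/** Minimal off-heap allocator that enforces a capacity limit (accounting only). */
final class LimitedDirectAllocator implements Allocator {
    private final long capacityBytes;
    private long usedBytes;
    // Identity-based map, because ByteBuffer.equals compares buffer contents.
    private final Map<ByteBuffer, Boolean> live = new IdentityHashMap<>();

    LimitedDirectAllocator(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    @Override
    public synchronized ByteBuffer allocate(int bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be >= 0");
        }
        if (bytes > capacityBytes - usedBytes) {
            throw new IllegalStateException("Memory limit exceeded");
        }
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes);
        live.put(buffer, Boolean.TRUE);
        usedBytes += bytes;
        return buffer;
    }

    @Override
    public synchronized void release(ByteBuffer buffer) {
        // Ignore buffers this allocator does not own or has already released.
        if (live.remove(buffer) != null) {
            usedBytes -= buffer.capacity();
        }
    }

    @Override
    public synchronized long usedBytes() {
        return usedBytes;
    }

    /** Ends the life cycle: drops every buffer that is still tracked. */
    @Override
    public synchronized void close() {
        live.clear();
        usedBytes = 0;
    }
}
```

      Making the allocator `AutoCloseable` lets a caller scope it with try-with-resources, so that everything a query allocated is dropped when the query ends, whether it succeeds or fails.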

      Recommended Skills

      1. Java/JVM Basics: Deep understanding of Java's memory model, including the management and operation of heap memory and off-heap memory.
      2. Java NIO: The Java NIO library provides the interface for working with off-heap memory, so applicants need to master it. (Familiarity with Netty or other basic memory management libraries is preferred.)
      3. Concurrent Programming: Since memory management involves concurrent operations from multiple threads, knowledge of concurrent programming and thread safety is necessary.
      4. Data Structures: Understand and apply appropriate data structures, such as queues and stacks, to manage memory blocks.
      5. Operating System: Understand the memory management mechanism of the operating system in order to better understand and optimize Java's off-heap memory management.

      Task List

      • Implement a unified memory pool that independently manages JVM off-heap memory, and adapt the memory allocation methods of the various native collections so that the memory used by the algorithms comes mainly from the unified memory pool and is returned to the pool after release.
      • Associate each request with a unified memory pool, so that the memory usage of a request can be counted and thereby controlled.
      • Complete the related unit tests (UT) and basic documentation (preferably with a performance comparison).
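
      The first two tasks can be sketched as a shared pool of reusable off-heap chunks plus a per-request view that counts what one request holds. This is only one possible shape, under stated assumptions: `ChunkPool`, `RequestMemory`, the fixed chunk size, and the chunk-count limit are all hypothetical and not taken from HugeGraph.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical shared pool of fixed-size off-heap chunks that are reused after release. */
final class ChunkPool {
    private final int chunkBytes;
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    ChunkPool(int chunkBytes) {
        this.chunkBytes = chunkBytes;
    }

    /** Hands out a free chunk if one exists, otherwise allocates a new direct buffer. */
    synchronized ByteBuffer acquire() {
        ByteBuffer chunk = free.pollFirst();
        if (chunk == null) {
            chunk = ByteBuffer.allocateDirect(chunkBytes);
        }
        chunk.clear(); // reset position and limit; old contents are not zeroed
        return chunk;
    }

    /** Returns a chunk to the pool so that later requests can reuse it. */
    synchronized void giveBack(ByteBuffer chunk) {
        free.addFirst(chunk);
    }

    synchronized int freeChunks() {
        return free.size();
    }

    int chunkBytes() {
        return chunkBytes;
    }
}

/**
 * Hypothetical per-request view of the pool. It counts the chunks one request
 * holds, caps them, and returns all of them when the request finishes.
 * Assumes a single thread per request; it is not thread-safe.
 */
final class RequestMemory implements AutoCloseable {
    private final ChunkPool pool;
    private final int maxChunks;
    private final List<ByteBuffer> held = new ArrayList<>();

    RequestMemory(ChunkPool pool, int maxChunks) {
        this.pool = pool;
        this.maxChunks = maxChunks;
    }

    ByteBuffer acquire() {
        if (held.size() >= maxChunks) {
            throw new IllegalStateException("Request memory limit exceeded");
        }
        ByteBuffer chunk = pool.acquire();
        held.add(chunk);
        return chunk;
    }

    long usedBytes() {
        return (long) held.size() * pool.chunkBytes();
    }

    @Override
    public void close() {
        for (ByteBuffer chunk : held) {
            pool.giveBack(chunk);
        }
        held.clear();
    }
}
```

      In this sketch a request opened with `new RequestMemory(pool, 2)` can hold at most two chunks; closing it puts both back on the pool's free list, where the next request picks them up instead of allocating new off-heap memory.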

       

      PS: For more technical details, refer to hugegraph/wiki/MemoryManagement

      • Difficulty: Hard
      • Project size: ~350 hours (full-time/large)

      Potential Mentors

      • Jermy Li: jermy@apache.org (Apache HugeGraph PPMC)
      • Imba Jin: jin@apache.org (Apache HugeGraph PPMC)
      • Yan Zhang: vaughn@apache.org (Apache HugeGraph PPMC)

          People

            Assignee: Unassigned
            Reporter: Imba Jin