  1. Tajo
  2. TAJO-900

Reducing memory usage during query processing

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: Physical Operator, Storage
    • Labels:
      None

      Description

      Currently, Tajo's tuple structures are implemented as Java objects that internally hold Datum objects, so every tuple lives in the JVM heap. As a result, memory usage is hard to control and garbage collection behavior is hard to predict. The problem usually becomes severe when Tajo processes very large data on a relatively small cluster, or when a query has many grouping or join keys.

      I've run various tests and built a prototype to show that this problem can be eliminated.

      The main idea is as follows:

      • Do not use the Datum class in expression evaluation. Instead, use Java primitive values.
        • This significantly reduces object creation and memory usage.
      • Redesign Tuple using direct memory allocation (DirectByteBuffer or Unsafe.allocateMemory)
        • This allows each worker to control memory usage during in-memory operations such as sort and hash aggregation/join.
        • It enables column values to be stored in adjacent memory, improving cache locality.
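
      As a rough illustration of the two points above, the following is a minimal sketch, not the actual Tajo design. It assumes a hypothetical fixed-width row of one long key and one double value, stored in a direct ByteBuffer; the class name OffHeapRowBlock and its methods are invented for this example. Values are read and written as primitives, so no per-row Java object is created, and the block's footprint is fixed at allocation time.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Hypothetical fixed-width row block (illustration only, not a Tajo class).
 * Each row is a (long key, double value) pair occupying 16 adjacent bytes
 * in memory allocated outside the JVM heap.
 */
final class OffHeapRowBlock {
    private static final int ROW_BYTES = Long.BYTES + Double.BYTES;

    private final ByteBuffer buf;
    private final int capacityRows;
    private int rows;

    OffHeapRowBlock(int capacityRows) {
        this.capacityRows = capacityRows;
        // Direct buffers are allocated off-heap, so the size is known up front.
        this.buf = ByteBuffer.allocateDirect(capacityRows * ROW_BYTES)
                             .order(ByteOrder.nativeOrder());
    }

    /** Appends a row; returns false when the block is full so the caller can react. */
    boolean append(long key, double value) {
        if (rows == capacityRows) {
            return false;
        }
        int offset = rows * ROW_BYTES;
        buf.putLong(offset, key);
        buf.putDouble(offset + Long.BYTES, value);
        rows++;
        return true;
    }

    long keyAt(int row) {
        return buf.getLong(row * ROW_BYTES);
    }

    double valueAt(int row) {
        return buf.getDouble(row * ROW_BYTES + Long.BYTES);
    }

    int size() {
        return rows;
    }

    long allocatedBytes() {
        return buf.capacity();
    }

    /** Evaluates a simple aggregate using primitives only, with no boxed values. */
    double sumValues() {
        double sum = 0.0;
        for (int i = 0; i < rows; i++) {
            sum += valueAt(i);
        }
        return sum;
    }
}
```

      A caller that receives false from append knows the block's budget is exhausted and can switch to another block or spill, instead of discovering memory pressure through garbage collection.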

      To achieve this, we should do the following:

      • Implement an alternative to the EvalNode framework (i.e., runtime bytecode generation) in order to avoid the use of Datum and other Java objects.
      • Design a new tuple data structure using direct memory allocation.
      • Refactor existing operators so that they are controlled according to current memory usage.
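
      To make the last point concrete, here is a hedged sketch of what "controlled according to current memory usage" could look like for a sort operator. It is an assumption for illustration, not Tajo code: the class BudgetedRunGenerator and its methods are hypothetical, the budget is a plain byte count, the buffer is an on-heap long[] for brevity, and the "spill" target is an in-memory list standing in for disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical memory-bounded run generator (illustration only).
 * It buffers long keys up to a fixed byte budget, then sorts the buffer
 * and hands it off as a run, as the first phase of an external sort would.
 */
final class BudgetedRunGenerator {
    private final long[] buffer;
    private int count;
    // Stand-in for spilled files on disk.
    private final List<long[]> spilledRuns = new ArrayList<>();

    BudgetedRunGenerator(long budgetBytes) {
        // The budget, not the input size, decides how many keys are held at once.
        int capacity = (int) Math.max(1, budgetBytes / Long.BYTES);
        this.buffer = new long[capacity];
    }

    void add(long key) {
        if (count == buffer.length) {
            spill();
        }
        buffer[count++] = key;
    }

    /** Sorts the buffered keys, emits them as one run, and resets the buffer. */
    private void spill() {
        long[] run = Arrays.copyOf(buffer, count);
        Arrays.sort(run);
        spilledRuns.add(run);
        count = 0;
    }

    /** Flushes any remaining keys and returns all sorted runs. */
    List<long[]> finish() {
        if (count > 0) {
            spill();
        }
        return spilledRuns;
    }
}
```

      With a 16-byte budget the generator holds at most two keys, so five input keys produce three sorted runs. The operator's peak memory depends on the configured budget rather than on the input size.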

      This is an umbrella issue. I'll create subtasks, and I've already started work on some of them. I'll use this JIRA issue to track them.

          Activity

          hyunsik Hyunsik Choi added a comment -

          The main purpose of this issue was achieved by TAJO-906 and TAJO-907. The remaining work will be addressed in TAJO-1041.


            People

            • Assignee:
              hyunsik Hyunsik Choi
              Reporter:
              hyunsik Hyunsik Choi
            • Votes:
              0
              Watchers:
              1
