Tajo (Retired) / TAJO-743

Change the default resource allocation policy of leaf tasks

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.9.0
    • Fix Version/s: 0.8.0, 0.9.0
    • Component/s: Resource Manager
    • Labels: None

    Description

      Currently, resource allocation is calculated on a memory basis. If a machine has a large amount of memory, the default settings usually result in high task concurrency and therefore heavy I/O on each disk, which is likely to be a problem.

      When I tested leaf task scans with a concurrency of 2 per SATA disk, performance was better. If you have SAS storage or SSDs, you can increase the disk concurrency. This patch changes the default resource allocation policy to use disk resources.

      The following configs have been available so far:

      • tajo.worker.resource.disks - available disk resource of each worker
      • tajo.task.disk-slot.default - how much disk resource each task consumes
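
A rough sketch of how these two settings could bound leaf task concurrency, compared with a memory-based limit. It assumes the limit is simply capacity divided by per-task demand; the function name and every number below (64 GB of worker memory, 512 MB per task, 4 disks, 0.5 disk slot per task) are illustrative assumptions, not values taken from the patch.

```python
import math


def max_concurrent_tasks(capacity: float, demand_per_task: float) -> int:
    """Number of tasks that fit when each task consumes `demand_per_task`."""
    if demand_per_task <= 0:
        raise ValueError("demand_per_task must be positive")
    # The small epsilon guards against floating-point rounding just below
    # an integer boundary.
    return math.floor(capacity / demand_per_task + 1e-9)


# Hypothetical worker: 64 GB of memory, 512 MB per task (memory-based policy).
memory_based = max_concurrent_tasks(64 * 1024, 512)

# Same worker under a disk-based policy:
# tajo.worker.resource.disks = 4, tajo.task.disk-slot.default = 0.5 (assumed).
disks = 4.0
disk_based = max_concurrent_tasks(disks, 0.5)

print(f"memory-based limit: {memory_based} tasks")
print(f"disk-based limit:   {disk_based} tasks ({disk_based / disks:g} per disk)")
```

With these assumed numbers the memory-based limit allows 128 concurrent tasks, while the disk-based limit allows 8, which corresponds to the concurrency of 2 per SATA disk mentioned above.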

      The config below is newly introduced in this patch:

      • tajo.worker.resource.dfs-dir-aware - it can be true or false. If it is true, each worker uses the number of data directories of the HDFS datanode on that worker as its disk resource, and tajo.worker.resource.disks is ignored.
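
The behaviour of tajo.worker.resource.dfs-dir-aware can be sketched as follows. This is only an illustration: it assumes the data directories are given as a comma-separated string (the format of the Hadoop property dfs.datanode.data.dir), and the function name and the fallback for an empty directory list are hypothetical, not taken from the patch.

```python
def effective_disk_resource(
    dfs_dir_aware: bool,
    configured_disks: float,
    datanode_data_dirs: str,
) -> float:
    """Return the disk resource a worker would advertise.

    dfs_dir_aware      -- value of tajo.worker.resource.dfs-dir-aware
    configured_disks   -- value of tajo.worker.resource.disks
    datanode_data_dirs -- comma-separated datanode data directories (assumed format)
    """
    if not dfs_dir_aware:
        return configured_disks

    # Count the non-empty entries in the comma-separated directory list.
    dirs = [d.strip() for d in datanode_data_dirs.split(",") if d.strip()]

    # Assumption: fall back to the configured value when no directory is found.
    if not dirs:
        return configured_disks
    return float(len(dirs))


# Example: three data directories override a configured value of 1.0.
print(effective_disk_resource(True, 1.0, "/data1/dfs, /data2/dfs,/data3/dfs"))
print(effective_disk_resource(False, 1.0, "/data1/dfs, /data2/dfs,/data3/dfs"))
```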

      Attachments

        1. TAJO-743.patch
          6 kB
          Jinho Kim
        2. TAJO-743_branch-0.8.0.patch
          6 kB
          Jinho Kim

          People

            Assignee: jhkim Jinho Kim
            Reporter: jhkim Jinho Kim