Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: QueryMaster
    • Labels:
      None

      Description

      The appropriate task scheduler depends on the task scheduling algorithm, the locality policy of the storage, and so on.
      Thus, we need to make the task scheduler interface pluggable.

      1. TAJO-314_3.patch
        76 kB
        Jihoon Son
      2. TAJO-314_2.patch
        52 kB
        Jihoon Son
      3. TAJO-314.patch
        52 kB
        Jihoon Son

        Activity

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-trunk-postcommit #566 (See https://builds.apache.org/job/Tajo-trunk-postcommit/566/)
        TAJO-314: Make TaskScheduler be pluggable. (jihoon) (jihoonson: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=92674542d03e0c3109950cd3da5f16d1c786425d)

        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TaskSchedulerImpl.java
        • tajo-core/tajo-core-backend/src/main/resources/tajo-default.xml
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TaskSchedulerFactory.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskAttemptAssignedEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskAttemptScheduleEvent.java
        • CHANGES.txt
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskSchedulerEventFactory.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskCompletionEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnitAttempt.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskScheduleEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskFatalErrorEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/DefaultTaskSchedulerEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/TajoWorkerManagerService.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskAttemptStatusUpdateEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/AbstractTaskScheduler.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskSchedulerEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TaskScheduler.java

        TAJO-314: Make TaskScheduler be pluggable. (fixed missing changes in the configuration) (jihoonson: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=67e0d94c4828b014b3b83a6713b172b482b63a6a)

        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskFatalErrorEvent.java
        • tajo-core/tajo-core-backend/src/main/resources/tajo-default.xml
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskAttemptAssignedEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskCompletionEvent.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TaskSchedulerFactory.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/TaskAttemptStatusUpdateEvent.java
        jihoonson Jihoon Son added a comment -

        I'm sorry that I missed some changes in tajo-default.xml.
        After adding the missing changes and removing some unused imports, I committed again.

        jihoonson Jihoon Son added a comment -

        Thanks.
        I committed this patch.

        hyunsik Hyunsik Choi added a comment -

        +1 for the latest patch

        I expect that this interface will be improved further when Tajo supports various storage systems, such as HBase. Nevertheless, I think that this is a good start.

        jihoonson Jihoon Son added a comment -

        I uploaded the third patch after fixing all the bugs.
        It passed 'mvn verify'.
        In this patch, I added a TajoConf variable to QueryUnit, because QueryUnitAttempt needs the configuration to create TaskSchedulerEvent using TaskSchedulerEventFactory.

        jihoonson Jihoon Son added a comment -

        This patch is missing some changes, and their absence causes a runtime error.
        I'll upload another patch after fixing that.

        hyunsik Hyunsik Choi added a comment -

        I also verified the patch with 'mvn clean install'.

        hyunsik Hyunsik Choi added a comment -

        +1
        For more elaborate custom schedulers, we also need to be able to create custom schedule events with custom fragments. That needs more work. This patch is a good start.

        jihoonson Jihoon Son added a comment -

        I attached the second patch after rebasing.
        In this patch, I added the scheduler event configuration to pom.xml as mentioned above.

        Please review this patch.

        jihoonson Jihoon Son added a comment -

        Alternatively, we can consider adding an abstract newTaskSchedulerEvent() method to the AbstractTaskScheduler class, like this.

        public abstract class AbstractTaskScheduler extends AbstractService
            implements EventHandler<TaskSchedulerEvent> {
        
          /**
           * Construct the service.
           *
           * @param name service name
           */
          public AbstractTaskScheduler(String name) {
            super(name);
          }
        
          public abstract void handleTaskRequestEvent(TaskRequestEvent event);
          public abstract TaskSchedulerEvent newTaskSchedulerEvent(EventType type, QueryUnitAttempt queryUnitAttempt);
        }
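
        A minimal, self-contained sketch of how a concrete scheduler might implement such a factory method. Every type below is a simplified stand-in written for illustration (AbstractService, EventHandler, and handleTaskRequestEvent are left out), and LocalityAwareScheduler, LocalityAwareEvent, and the enum constants are hypothetical names, not part of the patch.

```java
// Simplified stand-ins for illustration only; these are not Tajo's real classes.
class FactoryMethodSketch {
  enum EventType { T_SCHEDULE, T_SCHEDULE_CANCEL }

  static class QueryUnitAttempt {
    final String id;
    QueryUnitAttempt(String id) { this.id = id; }
  }

  // Base event: carries only what every scheduler needs.
  static class TaskSchedulerEvent {
    final EventType type;
    final QueryUnitAttempt attempt;
    TaskSchedulerEvent(EventType type, QueryUnitAttempt attempt) {
      this.type = type;
      this.attempt = attempt;
    }
  }

  // Hypothetical event that adds the extra information one algorithm needs.
  static class LocalityAwareEvent extends TaskSchedulerEvent {
    final String preferredHost;
    LocalityAwareEvent(EventType type, QueryUnitAttempt attempt, String preferredHost) {
      super(type, attempt);
      this.preferredHost = preferredHost;
    }
  }

  // The scheduler itself decides which event class matches its algorithm.
  abstract static class AbstractTaskScheduler {
    abstract TaskSchedulerEvent newTaskSchedulerEvent(EventType type, QueryUnitAttempt attempt);
  }

  static class LocalityAwareScheduler extends AbstractTaskScheduler {
    @Override
    TaskSchedulerEvent newTaskSchedulerEvent(EventType type, QueryUnitAttempt attempt) {
      // The host name is a placeholder; a real scheduler would derive it from the data location.
      return new LocalityAwareEvent(type, attempt, "host-for-" + attempt.id);
    }
  }
}
```

        With this shape, callers only depend on AbstractTaskScheduler and TaskSchedulerEvent, so swapping the scheduler also swaps the event type without a separate configuration key.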
        
        jihoonson Jihoon Son added a comment - edited

        Hyunsik, thanks for your comment.
        However, I realized that TaskSchedulerEvent may also need to change with the task scheduling algorithm, because the information an event must carry depends on the algorithm.
        So, how about the following configuration?

          <!-- Registered Scheduler Handler -->
          <property>
            <name>tajo.querymaster.task-scheduler-handler</name>
            <value>default</value>
          </property>
        
          <!-- Scheduler Configuration -->
          <property>
            <name>tajo.querymaster.task-scheduler.type</name>
            <value>default</value>
          </property>
        
          <!-- Scheduler Handler -->
          <property>
            <name>tajo.querymaster.task-scheduler.default.class</name>
            <value>org.apache.tajo.master.DefaultTaskScheduler</value>
          </property>
        
          <!-- Scheduler Event handler -->
          <property>
            <name>tajo.querymaster.task-schedule-event.default.class</name>
            <value>org.apache.tajo.master.event.DefaultSchedulerEvent</value>
          </property>
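
        A rough sketch of how an event factory could resolve these keys: read the scheduler type, build the per-type event key, and construct the event reflectively with its arguments. java.util.Properties stands in for TajoConf here, and the SchedulerEvent classes, the newEvent helper, and the two-String constructor are assumptions made for illustration, not the patch's actual code.

```java
import java.util.Properties;

// Sketch only: Properties stands in for TajoConf, and the event classes are hypothetical.
class EventFactorySketch {
  static final String TYPE_KEY = "tajo.querymaster.task-scheduler.type";

  public static class SchedulerEvent {
    public final String eventType;
    public final String attemptId;
    public SchedulerEvent(String eventType, String attemptId) {
      this.eventType = eventType;
      this.attemptId = attemptId;
    }
  }

  public static class DefaultSchedulerEvent extends SchedulerEvent {
    public DefaultSchedulerEvent(String eventType, String attemptId) {
      super(eventType, attemptId);
    }
  }

  static SchedulerEvent newEvent(Properties conf, String eventType, String attemptId) {
    // Step 1: the scheduler type selects which per-type key to read.
    String type = conf.getProperty(TYPE_KEY, "default");
    String classKey = "tajo.querymaster.task-schedule-event." + type + ".class";
    String className = conf.getProperty(classKey);
    if (className == null) {
      throw new IllegalArgumentException("No event class configured under " + classKey);
    }
    try {
      // Step 2: load the class and call its (String, String) constructor.
      return Class.forName(className)
          .asSubclass(SchedulerEvent.class)
          .getConstructor(String.class, String.class)
          .newInstance(eventType, attemptId);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot create event " + className, e);
    }
  }
}
```

        Because the event class is looked up under the same type name as the scheduler, a custom scheduler and its matching event can be switched on together by changing one value.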
        
        hyunsik Hyunsik Choi added a comment -

        I forgot to give a +1 for this attempt. I like it.

        hyunsik Hyunsik Choi added a comment -

        The main reason the storage manager uses handler names and a map to the actual class names is so that users do not have to write very long canonical class names within a SQL statement. For the scheduler, however, I think that just a class name may be enough. Also, I would like to suggest 'tajo.querymaster.task-scheduler.class' because TaskScheduler runs in the QueryMaster.

        I'll continue to review the remaining parts.
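
        A small sketch of what the single-key variant might look like in code, assuming java.util.Properties as a stand-in for TajoConf. The TaskScheduler interface, the DefaultTaskScheduler class, and the fallback to a default when the key is unset are illustrative assumptions, not behavior confirmed by the patch.

```java
import java.util.Properties;

// Sketch only: Properties stands in for TajoConf; the nested types are hypothetical.
class DirectClassLookupSketch {
  static final String KEY = "tajo.querymaster.task-scheduler.class";

  public interface TaskScheduler {}

  public static class DefaultTaskScheduler implements TaskScheduler {}

  // One key holds the canonical class name, so no handler-name map is needed.
  static Class<? extends TaskScheduler> schedulerClass(Properties conf) {
    String className = conf.getProperty(KEY, DefaultTaskScheduler.class.getName());
    try {
      // asSubclass rejects classes that are not schedulers with a ClassCastException.
      return Class.forName(className).asSubclass(TaskScheduler.class);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("Scheduler class not found: " + className, e);
    }
  }
}
```

        The trade-off is the one described above: the direct key is simpler, while a handler-name map only pays off where users would otherwise have to type the class name repeatedly.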

        jihoonson Jihoon Son added a comment -

        In this patch, I made the following changes.

        • TaskScheduler
          • Change to an abstract class that extends AbstractService
          • Rename to AbstractTaskScheduler
        • TaskSchedulerImpl
          • Rename to DefaultTaskScheduler
        • Add TaskSchedulerFactory that creates an instance of TaskScheduler according to the following configurations
        • Add the following configurations to tajo-default.xml
          <!-- Registered Scheduler Handler -->
          <property>
            <name>tajo.master.scheduler-handler</name>
            <value>default</value>
          </property>
          
          <!-- Scheduler Configuration -->
          <property>
            <name>tajo.master.scheduler.type</name>
            <value>default</value>
          </property>
          
          <!-- Scheduler Handler -->
          <property>
            <name>tajo.master.scheduler-handler.default.class</name>
            <value>org.apache.tajo.master.DefaultTaskScheduler</value>
          </property>
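
        The factory described above could be sketched roughly as follows: look up the configured scheduler type, map it to a class name through the per-handler key, and instantiate that class reflectively. This is an illustration under stated assumptions, not the code from the patch: java.util.Properties stands in for TajoConf, TaskScheduler is reduced to a hypothetical one-method interface, and a public no-argument constructor is assumed.

```java
import java.util.Properties;

// Sketch only: Properties stands in for TajoConf, and TaskScheduler is a
// hypothetical one-method interface rather than Tajo's real class.
class TaskSchedulerFactorySketch {
  public interface TaskScheduler {
    String name();
  }

  public static class DefaultTaskScheduler implements TaskScheduler {
    public DefaultTaskScheduler() {}
    @Override public String name() { return "default"; }
  }

  static TaskScheduler create(Properties conf) {
    // The type value ("default") selects the handler key that holds the class name.
    String type = conf.getProperty("tajo.master.scheduler.type", "default");
    String classKey = "tajo.master.scheduler-handler." + type + ".class";
    String className = conf.getProperty(classKey);
    if (className == null) {
      throw new IllegalArgumentException("No scheduler class configured under " + classKey);
    }
    try {
      return Class.forName(className)
          .asSubclass(TaskScheduler.class)
          .getConstructor()
          .newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate scheduler " + className, e);
    }
  }
}
```

        A new scheduler would then be plugged in by adding one handler key for its class and changing the type value, without touching the code that calls the factory.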
          

          People

          • Assignee:
            jihoonson Jihoon Son
            Reporter:
            jihoonson Jihoon Son
          • Votes:
            0
            Watchers:
            2
