Tajo / TAJO-1586

TajoMaster HA startup failure on YARN.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11.0, 0.10.1
    • Component/s: TajoMaster
    • Labels: None

      Description

      I tried to deploy Tajo on YARN with Slider, but the deployment failed because of a TajoMaster HA failure. TajoWorker could not load the TajoMaster address and stopped with the following errors:

      2015-04-28 04:52:22,266 INFO org.apache.hadoop.service.AbstractService: Service org.apache.tajo.worker.TajoWorker failed in state STARTED; cause: org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
      org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
      	at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:441)
      	at org.apache.tajo.ha.HdfsServiceTracker.getUmbilicalAddress(HdfsServiceTracker.java:348)
      	at org.apache.tajo.worker.TajoWorker.serviceStart(TajoWorker.java:318)
      	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
      	at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
      	at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)
      Caused by: org.apache.tajo.service.ServiceTrackerException: No active master entry
      	at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:413)
      	... 5 more
      2015-04-28 04:52:22,307 INFO org.apache.hadoop.service.AbstractService: Service WorkerHeartbeatService failed in state STOPPED; cause: java.lang.NullPointerException
      java.lang.NullPointerException
      	at org.apache.tajo.worker.WorkerHeartbeatService$WorkerHeartbeatThread.access$000(WorkerHeartbeatService.java:101)
      	at org.apache.tajo.worker.WorkerHeartbeatService.serviceStop(WorkerHeartbeatService.java:90)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
      	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
      	at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
      	at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
      	at org.apache.tajo.worker.TajoWorker.serviceStop(TajoWorker.java:375)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
      	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
      	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:203)
      	at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
      	at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)
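
The second trace suggests that WorkerHeartbeatService.serviceStop ran even though the service had never been started: TajoWorker.serviceStart failed first, and AbstractService.start then called stopQuietly. A field dereferenced during stop was presumably still null. A minimal sketch of a null-safe stop, assuming a service that owns one background thread (HeartbeatServiceSketch and its members are hypothetical names, not Tajo's actual code):

```java
/** Hypothetical stand-in for a service that owns a background thread; not Tajo's actual class. */
final class HeartbeatServiceSketch {
  // Stays null until start() has completed successfully.
  private volatile Thread worker;
  private volatile boolean stopped;

  void start() {
    stopped = false;
    Thread t = new Thread(() -> {
      // Placeholder loop standing in for periodic heartbeat work.
      while (!stopped) {
        try {
          Thread.sleep(10);
        } catch (InterruptedException e) {
          return; // interrupted by stop()
        }
      }
    }, "heartbeat-sketch");
    t.setDaemon(true);
    t.start();
    worker = t;
  }

  void stop() {
    stopped = true;
    Thread t = worker;
    // Guard: stop() may be invoked during a failed startup, before start() ever ran.
    if (t != null) {
      t.interrupt();
    }
  }

  boolean isStarted() {
    return worker != null;
  }
}
```

With this guard, calling stop() on an instance that was never started is a no-op instead of a NullPointerException, so the original startup error stays the only failure in the log.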

      I think the cause of this failure is a time difference between TajoMaster and TajoWorker.
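
If the time difference means that TajoWorker looks up the active master before TajoMaster has registered itself (an assumption; the report does not say so explicitly), one possible mitigation is to poll for the entry for a bounded time instead of failing on the first miss. The following is only a sketch of that idea. ActiveMasterWaiter, its await method, and the lookup parameter are hypothetical and not part of Tajo's API.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Tajo: polls a lookup until it yields a value or a deadline passes. */
final class ActiveMasterWaiter {
  private ActiveMasterWaiter() {}

  /**
   * Calls lookup repeatedly, sleeping intervalMillis between attempts,
   * and returns the first value found. Throws IllegalStateException
   * if nothing is found within timeoutMillis or the thread is interrupted.
   */
  static <T> T await(Supplier<Optional<T>> lookup, long timeoutMillis, long intervalMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      Optional<T> found = lookup.get();
      if (found.isPresent()) {
        return found.get();
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException("No active master entry after " + timeoutMillis + " ms");
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
        throw new IllegalStateException("Interrupted while waiting for active master", e);
      }
    }
  }
}
```

A caller would wrap whatever lookup it already has in the supplier and return Optional.empty() while no active master is registered. The timeout keeps a worker from waiting forever when the master never comes up.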

        Attachments

        1. TAJO-1586_3.patch
          58 kB
          Jaehwa Jung
        2. TAJO-1586_2.patch
          58 kB
          Jaehwa Jung
        3. TAJO-1586.patch
          15 kB
          Jaehwa Jung

            People

            • Assignee: Jaehwa Jung (blrunner)
            • Reporter: Jaehwa Jung (blrunner)
            • Votes: 0
            • Watchers: 3
