Tajo (Retired) / TAJO-1586

TajoMaster HA startup failure on YARN.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11.0, 0.10.1
    • Component/s: TajoMaster
    • Labels: None

    Description

      I tried to deploy Tajo on YARN with Slider, but the deployment failed because of a TajoMaster HA failure. TajoWorker failed to load the TajoMaster address, as follows:

      2015-04-28 04:52:22,266 INFO org.apache.hadoop.service.AbstractService: Service org.apache.tajo.worker.TajoWorker failed in state STARTED; cause: org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
      org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
      	at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:441)
      	at org.apache.tajo.ha.HdfsServiceTracker.getUmbilicalAddress(HdfsServiceTracker.java:348)
      	at org.apache.tajo.worker.TajoWorker.serviceStart(TajoWorker.java:318)
      	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
      	at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
      	at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)
      Caused by: org.apache.tajo.service.ServiceTrackerException: No active master entry
      	at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:413)
      	... 5 more
      2015-04-28 04:52:22,307 INFO org.apache.hadoop.service.AbstractService: Service WorkerHeartbeatService failed in state STOPPED; cause: java.lang.NullPointerException
      java.lang.NullPointerException
      	at org.apache.tajo.worker.WorkerHeartbeatService$WorkerHeartbeatThread.access$000(WorkerHeartbeatService.java:101)
      	at org.apache.tajo.worker.WorkerHeartbeatService.serviceStop(WorkerHeartbeatService.java:90)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
      	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
      	at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
      	at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
      	at org.apache.tajo.worker.TajoWorker.serviceStop(TajoWorker.java:375)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
      	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
      	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:203)
      	at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
      	at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)

      I think the cause of this failure is a time difference between TajoMaster and TajoWorker.
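
      If the time difference means that TajoWorker asks for the master address before TajoMaster has registered itself as active (an assumption, not confirmed above), one possible mitigation is to retry the lookup a few times instead of failing on the first "No active master entry". The following is only a minimal sketch of that idea and is not the attached patch. The class `ActiveMasterWait`, the method `waitForActiveMaster`, the `lookup` supplier, and the address `master-host:26001` are all hypothetical names for illustration; none of them is Tajo API.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, not part of Tajo: retries an address lookup
// until it succeeds or the attempt budget is used up.
public class ActiveMasterWait {

    /**
     * Calls lookup up to maxAttempts times, sleeping sleepMillis between
     * failed attempts. Returns the first address found, or empty if none
     * appeared (or if the waiting thread was interrupted).
     */
    public static Optional<String> waitForActiveMaster(
            Supplier<Optional<String>> lookup, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> address = lookup.get();
            if (address.isPresent()) {
                return address;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for the caller and give up.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulated tracker: the active master entry only shows up on the
        // third lookup, as if the master had started slightly later.
        int[] calls = {0};
        Supplier<Optional<String>> lookup = () ->
                ++calls[0] < 3
                        ? Optional.<String>empty()
                        : Optional.of("master-host:26001"); // placeholder address
        Optional<String> result = waitForActiveMaster(lookup, 5, 10L);
        System.out.println(result.orElse("no active master entry"));
    }
}
```

      In a real worker the supplier would wrap whatever call currently throws ServiceTrackerException, and the attempt count and sleep interval would need to be configurable.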

      Attachments

        1. TAJO-1586.patch
          15 kB
          JaeHwa Jung
        2. TAJO-1586_3.patch
          58 kB
          JaeHwa Jung
        3. TAJO-1586_2.patch
          58 kB
          JaeHwa Jung


          People

            Assignee: JaeHwa Jung (blrunner)
            Reporter: JaeHwa Jung (blrunner)