Flink / FLINK-36645

Gracefully handle null execution plan from autoscaler

Details

    • Type: Improvement
    • Status: Open
    • Priority: Not a Priority
    • Resolution: Unresolved

    Description

      Occasionally, we see error logs like the one below from the autoscaler module. This happens because the execution plan returned by the JobManager REST API is null while a scaling action is already in progress.

       

      
      Error while scaling job
      java.lang.NullPointerException: Cannot invoke "org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ArrayNode.iterator()" because "nodes" is null
          at o.a.f.a.topology.JobTopology.fromJsonPlan(JobTopology.java:155)
          at o.a.f.a.ScalingMetricCollector.getJobTopology(ScalingMetricCollector.java:248)
          at o.a.f.a.ScalingMetricCollector.getJobTopology(ScalingMetricCollector.java:203)
          at o.a.f.a.ScalingMetricCollector.updateMetrics(ScalingMetricCollector.java:121)
          at o.a.f.a.JobAutoScalerImpl.runScalingLogic(JobAutoScalerImpl.java:178)
          at o.a.f.a.JobAutoScalerImpl.scale(JobAutoScalerImpl.java:103)
          at c.u.a.c.p.s.YarnAutoScalerListener.stateHistoryMatch(YarnAutoScalerListener.java:72)
          at c.u.a.w.c.c.WatchdogContextImpl.run(WatchdogContextImpl.java:165)
          at j.u.c.Executors$RunnableAdapter.call(Executors.java:539)
          at j.util.concurrent.FutureTask.run(FutureTask.java:264)
          at j.u.c.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
          at...

      We need to handle this case more gracefully by checking the job status first, rather than letting the collector fail and log an NPE.
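
      A minimal sketch of the kind of guard this could mean, using only the JDK so it stands alone. Every name here (PlanGuard, JobState, vertexIds, collect) is hypothetical and not taken from the autoscaler code base, and the plan is modelled as a plain Map rather than the shaded Jackson types seen in the stack trace. The idea is to check the job state first and to treat a missing plan or a missing "nodes" entry as "not available yet" instead of dereferencing it.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; these names are illustrative and not part of the Flink autoscaler. */
final class PlanGuard {

    /** Hypothetical job states, reduced to what the sketch needs. */
    enum JobState { RUNNING, RESTARTING, OTHER }

    /**
     * Returns the vertex ids listed under "nodes", or an empty Optional when
     * the plan itself or its "nodes" entry is missing.
     */
    static Optional<List<String>> vertexIds(Map<String, List<String>> plan) {
        if (plan == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(plan.get("nodes"));
    }

    /** Decides what one metric-collection cycle should do, without throwing on a missing plan. */
    static String collect(JobState state, Map<String, List<String>> plan) {
        // Check the job status before touching the plan at all.
        if (state != JobState.RUNNING) {
            return "SKIPPED: job is " + state;
        }
        // A job can report RUNNING while the plan is still absent, so guard that too.
        return vertexIds(plan)
                .map(ids -> "COLLECTED: " + ids.size() + " vertices")
                .orElse("SKIPPED: execution plan not available yet");
    }
}
```

      In this sketch a caller would log the "SKIPPED" outcome at a low severity and retry on the next cycle. Whether the real fix belongs in JobTopology.fromJsonPlan or in the collector is left open here.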

          People

            Assignee: Sai Sharath Dandi (dsaisharath)
            Reporter: Sai Sharath Dandi (dsaisharath)
            Votes: 0
            Watchers: 2
