FLINK-10988

Improve debugging / visibility of job state

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Runtime / Task
    • Labels: None

      Description

      When a running Flink job encounters an unexpected exception, either while processing an unexpected message or while processing a well-formed message that the current job state turns into an exception, it can be difficult to diagnose the cause. For example, I would get an NPE in one of the operators:

      2018-11-13 10:10:26,332 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process-Broadcast-Keyed -> Map -> Map -> Sink: Unnamed (1/1) (9a8f3b970570742b7b174a01a9bb1405) switched from RUNNING to FAILED.
      java.lang.NullPointerException
      at com.celertech.analytics.flink.topology.marketimpact.PriceUtils.findPriceForEntryType(PriceUtils.java:28)
      at com.celertech.analytics.flink.topology.marketimpact.PriceUtils.getPriceForMarketDataEntryType(PriceUtils.java:18)
      at com.celertech.analytics.flink.function.midrate.MidRateBroadcaster.processBroadcastElement(MidRateBroadcaster.java:77)
      at com.celertech.analytics.flink.function.midrate.MidRateTagKeyedBroadcastProcessFunction.processBroadcastElement(MidRateTagKeyedBroadcastProcessFunction.java:36)
      at com.celertech.analytics.flink.function.midrate.MidRateTagKeyedBroadcastProcessFunction.processBroadcastElement(MidRateTagKeyedBroadcastProcessFunction.java:12)
      at org.apache.flink.streaming.api.operators.co.CoBroadcastWithKeyedOperator.processElement2(CoBroadcastWithKeyedOperator.java:121)

       

      An improvement would be to allow printing the incoming message when the exception occurs, so the developer can check whether that message was correct. Printing the state of the job would be useful as well, in case incorrect state led to the exception.
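Until something like this exists in the runtime, the same effect can be approximated in user code. The following is only a sketch of the idea, not a Flink API: `DiagnosticContext`, `ElementProcessingException` and `process` are hypothetical names, and the code is plain Java with no Flink dependency. It wraps a handler so that any runtime exception is rethrown with the incoming element and a view of the state attached to the message.

```java
import java.util.function.BiConsumer;

/** Hypothetical helper (not part of Flink) that attaches context to failures. */
final class DiagnosticContext {

    /** Hypothetical exception type carrying the element and state in its message. */
    public static final class ElementProcessingException extends RuntimeException {
        public ElementProcessingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private DiagnosticContext() {}

    /**
     * Runs the handler; if it throws a RuntimeException, rethrows it wrapped
     * with the string form of the element and the state. The original
     * exception is preserved as the cause, so its stack trace is not lost.
     */
    public static <E, S> void process(E element, S state, BiConsumer<E, S> handler) {
        try {
            handler.accept(element, state);
        } catch (RuntimeException e) {
            throw new ElementProcessingException(
                "Failed while processing element=" + element + ", state=" + state, e);
        }
    }
}
```

In a job, the body of a method such as `processBroadcastElement` could be routed through a wrapper like this, so the failure in the log shows which message and which state produced the NPE. This relies on the element and state types having a meaningful `toString()`, and messages that contain sensitive data may need to be masked before they are written to logs.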

       


            People

            • Assignee: Unassigned
            • Reporter: Scott Sue (scottsue)