Details
- Type: Sub-task
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Flink jobs can recover from failures through failover, but users cannot see any information about those failures. No information about historical (prior) execution attempts is exposed.
Proposed Changes
Add SubtaskAllExecutionAttemptsDetailsHandler to expose failed (prior) attempts
- Return all attempts of a subtask together with their states
- Add a method to AccessExecutionVertex that returns the prior executions
- Fetch the prior attempts via AccessExecutionVertex.getPriorExecutionAttempts
- Add SubtaskAllExecutionAttemptsDetailsHandler to serve the prior attempts
- URL: /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskIndex/attempts
- Response:
{
  "type" : "object",
  "id" : "urn:jsonschema:org:apache:flink:runtime:rest:messages:job:SubtaskAllExecutionAttemptsDetailsInfo",
  "properties" : {
    "attempts" : {
      "type" : "array",
      "items" : {
        "type" : "object",
        "id" : "urn:jsonschema:org:apache:flink:runtime:rest:messages:job:SubtaskExecutionAttemptDetailsInfo",
        "properties" : {
          "subtask" : { "type" : "integer" },
          "status" : {
            "type" : "string",
            "enum" : [ "CREATED", "SCHEDULED", "DEPLOYING", "RUNNING", "FINISHED", "CANCELING", "CANCELED", "FAILED", "RECONCILING" ]
          },
          "attempt" : { "type" : "integer" },
          "host" : { "type" : "string" },
          "start-time" : { "type" : "integer" },
          "end-time" : { "type" : "integer" },
          "duration" : { "type" : "integer" },
          "metrics" : {
            "type" : "object",
            "id" : "urn:jsonschema:org:apache:flink:runtime:rest:messages:job:metrics:IOMetricsInfo",
            "properties" : {
              "read-bytes" : { "type" : "integer" },
              "read-bytes-complete" : { "type" : "boolean" },
              "write-bytes" : { "type" : "integer" },
              "write-bytes-complete" : { "type" : "boolean" },
              "read-records" : { "type" : "integer" },
              "read-records-complete" : { "type" : "boolean" },
              "write-records" : { "type" : "integer" },
              "write-records-complete" : { "type" : "boolean" }
            }
          },
          "taskmanager-id" : { "type" : "string" },
          "start_time" : { "type" : "integer" }
        }
      }
    }
  }
}
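
The proposal above can be illustrated with a minimal, self-contained Java sketch. It does not use Flink's classes: Attempt, attemptsPath and allAttempts are hypothetical names introduced here for illustration only, and the choice to list the prior attempts first and the current attempt last is an assumption, as is the convention of reporting -1 for an unknown duration. It only shows the general shape of what a handler such as SubtaskAllExecutionAttemptsDetailsHandler might do: resolve the URL for a subtask and merge the prior attempts with the current one into a single "attempts" list.

```java
import java.util.ArrayList;
import java.util.List;

public class SubtaskAttemptsSketch {

    /** Hypothetical stand-in for one execution attempt; NOT a Flink class. */
    public static final class Attempt {
        public final int subtask;
        public final int attempt;
        public final String status;
        public final String host;
        public final long startTime;
        public final long endTime;

        public Attempt(int subtask, int attempt, String status, String host,
                       long startTime, long endTime) {
            this.subtask = subtask;
            this.attempt = attempt;
            this.status = status;
            this.host = host;
            this.startTime = startTime;
            this.endTime = endTime;
        }

        /** Duration in milliseconds; -1 when start or end is unknown (assumed convention). */
        public long duration() {
            return (startTime > 0 && endTime >= startTime) ? endTime - startTime : -1L;
        }
    }

    /** Fills in the URL pattern from the proposal for a concrete subtask. */
    public static String attemptsPath(String jobId, String vertexId, int subtaskIndex) {
        return "/jobs/" + jobId + "/vertices/" + vertexId
                + "/subtasks/" + subtaskIndex + "/attempts";
    }

    /** Merges prior attempts with the current attempt (prior first; an assumed ordering). */
    public static List<Attempt> allAttempts(List<Attempt> priorAttempts, Attempt currentAttempt) {
        List<Attempt> all = new ArrayList<>(priorAttempts);
        all.add(currentAttempt);
        return all;
    }

    public static void main(String[] args) {
        // Illustrative data only: one failed attempt followed by a running retry.
        Attempt failed = new Attempt(0, 0, "FAILED", "host-a", 1_000L, 4_000L);
        Attempt running = new Attempt(0, 1, "RUNNING", "host-b", 5_000L, -1L);

        System.out.println(attemptsPath("job-1", "vertex-1", 0));
        for (Attempt a : allAttempts(List.of(failed), running)) {
            System.out.println(a.attempt + " " + a.status + " " + a.duration());
        }
    }
}
```

In the actual change, the prior attempts would come from AccessExecutionVertex.getPriorExecutionAttempts rather than from a hand-built list, and each entry would be serialized according to the response schema above.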