  1. Beam
  2. BEAM-10166

Improve execution time errors

Details

    Description

      The Go SDK uses errors returned by DoFns to signal failures to process bundles and to terminate bundle processing. However, if the preceding DoFn uses emitters rather than error returns, the code has no choice but to panic, so that user code cannot handle or ignore the cross-DoFn error (which could cause data loss or other correctness problems).

      All bundle executions are wrapped in `callNoPanic` to prevent worker termination on such panics and to terminate just the affected bundle in an orderly way instead. `callNoPanic` uses Go's built-in recover mechanism to get the error and provide a stack trace.
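
      As a rough illustration of the behavior described above, here is a minimal sketch of a recover-based wrapper. It is not copied from the SDK: the signature, the message format, and the `failingBundle` and `runFailingBundle` helpers are assumptions made for this example.

```go
package main

import (
	"context"
	"fmt"
	"runtime/debug"
)

// callNoPanic runs fn and turns any panic into an ordinary error, so that
// only the current bundle fails instead of the whole worker.
// Sketch only: the real function in the exec package may differ.
func callNoPanic(ctx context.Context, fn func(context.Context) error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// r is an interface{}; here it is only formatted, together
			// with the stack trace of the panicking goroutine.
			err = fmt.Errorf("panic: %v\n%s", r, debug.Stack())
		}
	}()
	return fn(ctx)
}

// failingBundle is a hypothetical stand-in for bundle processing that panics.
func failingBundle(context.Context) error {
	panic("emit failed")
}

// runFailingBundle shows the call site: the panic comes back as an error.
func runFailingBundle() error {
	return callNoPanic(context.Background(), failingBundle)
}
```

      With this shape, the caller only ever sees a generic "panic: ..." message followed by a stack trace, which is the part this issue proposes to improve.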

      We can do better.

      The value returned by recover is just an interface{}, which means we can detect its specific type. In particular, the exec package could define an error type that we can detect. If the recovered value is that error, we could use it to provide a clearer error message than a panic stack trace.
      Such an error wrapper would contain: the error in question, the user DoFn that caused it, and the debug id of the DoFn node (to be able to relate it back to the plan).
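
      One possible shape for such a wrapper is sketched below. The type name `doFnError`, its field names, the message format, and the sample values are all hypothetical; the issue only specifies which three pieces of information the wrapper should carry.

```go
package main

import (
	"errors"
	"fmt"
)

// doFnError is a hypothetical unexported wrapper carrying the three items
// listed in the description.
type doFnError struct {
	doFn string // name of the user DoFn that caused the error
	uid  int    // debug id of the DoFn node, to relate it back to the plan
	err  error  // the error in question
}

// Error formats the wrapper as a single readable line.
func (e *doFnError) Error() string {
	return fmt.Sprintf("DoFn[UID:%d] %s failed: %v", e.uid, e.doFn, e.err)
}

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *doFnError) Unwrap() error { return e.err }

// errBadInput is a sample underlying error used only in this sketch.
var errBadInput = errors.New("bad input")

// exampleError builds a wrapper with made-up values.
func exampleError() error {
	return &doFnError{doFn: "main.extractWords", uid: 3, err: errBadInput}
}

// causeIsPreserved reports whether the original error can still be found
// through the wrapper.
func causeIsPreserved() bool {
	return errors.Is(exampleError(), errBadInput)
}
```

      Implementing `Unwrap` is optional, but it keeps the original error reachable for callers that inspect the error chain.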

      See https://gobyexample.com/errors and other articles on creating custom errors in Go. It doesn't need to be complicated.

      Then in `callNoPanic` we could detect this error wrapper and produce a clearer error message based on the existing plan. If the recovered value is anything else, we can maintain the current behavior. This latter part is necessary to handle panics originating in user code.
      To keep user code from misusing the wrapper and breaching this protocol, we're best off keeping the wrapper unexported from the exec package.
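
      The detection step could look roughly like the following sketch, which uses a type assertion on the recovered value. The `doFnError` type, the `callNoPanic` signature, and the helper functions are assumptions for illustration, not the SDK's actual code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"runtime/debug"
)

// doFnError is a hypothetical unexported wrapper; see the description for
// the information it is meant to carry.
type doFnError struct {
	doFn string
	uid  int
	err  error
}

func (e *doFnError) Error() string {
	return fmt.Sprintf("DoFn[UID:%d] %s failed: %v", e.uid, e.doFn, e.err)
}

// callNoPanic recovers from panics and chooses the message by checking the
// dynamic type of the recovered value.
func callNoPanic(ctx context.Context, fn func(context.Context) error) (err error) {
	defer func() {
		r := recover()
		if r == nil {
			return
		}
		// Recognized wrapper: report it directly, without a stack trace.
		if e, ok := r.(*doFnError); ok {
			err = e
			return
		}
		// Anything else, such as a panic in user code: keep the current
		// behavior and include the stack trace.
		err = fmt.Errorf("panic: %v\n%s", r, debug.Stack())
	}()
	return fn(ctx)
}

// emitFails simulates an emitter path that must panic with the wrapper.
func emitFails(context.Context) error {
	panic(&doFnError{doFn: "main.extractWords", uid: 3, err: errors.New("emit failed")})
}

// userPanics simulates a panic originating in user code.
func userPanics(context.Context) error {
	panic("index out of range")
}

func runEmitFails() error  { return callNoPanic(context.Background(), emitFails) }
func runUserPanics() error { return callNoPanic(context.Background(), userPanics) }
```

      Because `doFnError` is unexported, user code outside the package cannot construct it, so a panic from user code always falls through to the stack-trace branch.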

          People

            riteshghorse Ritesh Ghorse
            lostluck Robert Burke

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 10m