  1. Beam
  2. BEAM-6612

Performance Regression in QueueingBeamFnDataClient

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.7.1
    • Component/s: java-fn-execution
    • Labels: None

    Description

      Remove QueueingBeamFnDataClient, which made process() calls all run on the same thread.

      lcwik and I came up with this design thinking that it was required to process the bundle in parallel anyway, and that we would have good performance. However, after speaking to Ken, there is no requirement for a bundle or key to be processed in parallel. Elements are either iterables or single elements, which defines the need to process a group of elements on the same thread.
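      For illustration, the same-thread behaviour described above can be sketched as a queue that receiver threads only append to, while a single thread drains it and invokes the consumers. This is a minimal sketch under stated assumptions, not Beam's QueueingBeamFnDataClient: the names QueueingSketch, Work, enqueue and drainAndProcess are hypothetical, and plain threads stand in for the gRPC threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch of the queueing pattern; none of these names are Beam APIs.
class QueueingSketch {
  /** Pairs an element with the consumer that should process it. */
  static final class Work<T> {
    final T element;
    final Consumer<T> consumer;

    Work(T element, Consumer<T> consumer) {
      this.element = element;
      this.consumer = consumer;
    }

    void run() {
      consumer.accept(element);
    }
  }

  // Sentinel that tells the draining thread no more work will arrive.
  private static final Work<Object> POISON = new Work<>(null, e -> {});
  private final BlockingQueue<Work<?>> queue = new LinkedBlockingQueue<>();

  /** Called from receiver threads: only enqueues, never processes. */
  <T> void enqueue(T element, Consumer<T> consumer) {
    queue.add(new Work<>(element, consumer));
  }

  void close() {
    queue.add(POISON);
  }

  /** Called on one thread: every consumer runs here, one element at a time. */
  void drainAndProcess() throws InterruptedException {
    while (true) {
      Work<?> work = queue.take();
      if (work == POISON) {
        return;
      }
      work.run();
    }
  }

  /** Enqueues from four receiver threads and returns the thread name each consumer ran on. */
  static List<String> demo() {
    QueueingSketch client = new QueueingSketch();
    List<String> processedOn = new ArrayList<>(); // only touched by the draining thread
    Thread[] receivers = new Thread[4];
    try {
      for (int i = 0; i < receivers.length; i++) {
        final int id = i;
        receivers[i] = new Thread(
            () -> client.enqueue(id, e -> processedOn.add(Thread.currentThread().getName())),
            "receiver-" + i);
        receivers[i].start();
      }
      for (Thread t : receivers) {
        t.join();
      }
      client.close();
      client.drainAndProcess();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return processedOn;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

      In this sketch every consumer runs on the thread that calls drainAndProcess(), whichever receiver thread enqueued the element. That keeps the consumers free of concurrency concerns, at the cost of serializing all processing through one thread.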

      Simply performing this change will lead to the following issues:

      (1) MetricsContainerImpl and MetricsContainer are not thread safe, so when the process() functions enter the metric container context, they will be accessing a thread-unsafe collection in parallel.

      (2) An ExecutionStateTracker will be needed in every thread, so we will need to create an instance and activate it in every gRPC thread that receives a new element. (Will this get sampled properly, given that the trackers will be short-lived?)

      (3) Will the SimpleExecutionStates being used need to be thread safe as well? I don't think so, because I don't think that the ExecutionStateSampler invokes them in parallel.
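
      As a rough illustration of issues (1) and (2), the sketch below shows one possible way to make a counter container safe for concurrent process() calls and to give each thread its own tracker. It is an assumption-laden sketch, not Beam's MetricsContainerImpl or ExecutionStateTracker: CounterContainer, StateTracker and TRACKER are hypothetical names, and a fixed thread pool stands in for the gRPC threads. It does not address whether short-lived trackers would be sampled properly.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; these classes are stand-ins, not Beam's metrics or tracker types.
class ThreadSafeMetricsSketch {
  /** Counter container that tolerates concurrent increments (issue 1). */
  static final class CounterContainer {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    void inc(String name) {
      // computeIfAbsent is atomic, so each name maps to exactly one LongAdder.
      counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    long get(String name) {
      LongAdder adder = counters.get(name);
      return adder == null ? 0L : adder.sum();
    }
  }

  /** Minimal per-thread state holder standing in for a tracker (issue 2). */
  static final class StateTracker {
    String currentState = "idle";
  }

  // Each thread lazily gets its own tracker instance on first access.
  static final ThreadLocal<StateTracker> TRACKER = ThreadLocal.withInitial(StateTracker::new);

  /** Runs process-like work on several threads and returns the total element count. */
  static long demo(int threads, int elementsPerThread) {
    CounterContainer container = new CounterContainer();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        StateTracker tracker = TRACKER.get(); // private to this thread
        tracker.currentState = "process";
        for (int i = 0; i < elementsPerThread; i++) {
          container.inc("elements");
        }
        tracker.currentState = "idle";
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return container.get("elements");
  }

  public static void main(String[] args) {
    System.out.println(demo(8, 10_000));
  }
}
```

      With a plain HashMap and long fields in place of ConcurrentHashMap and LongAdder, the same workload could lose increments or corrupt the map, which is the hazard issue (1) describes.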

       

          People

            Assignee: Alex Amato (ajamato@google.com)
            Reporter: Alex Amato (ajamato@google.com)
              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h 40m