  1. Beam
  2. BEAM-6117

Dataflow Slowness

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Information Provided
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: sdk-go
    • Labels: None

    Description

      This is a pretty open-ended ticket, but we've been struggling with this for quite some time and are hoping to get help resolving our issue.

       

      We wrote and contributed the datastore reader earlier this year and have been using it successfully in our project in a couple of scenarios. The problem we are facing is that our dataflows take a long time: we have datastore kinds with 100M+ entities, and they take 2-3 days to read through. We've tried fiddling with all of the knobs available to us (datastore splits, CPUs, turning off autoscaling, scope changes, updating libraries, etc.) and can't seem to make it go faster.
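
      For scale, the figures above can be turned into an average read rate. The following is a rough sketch only: it takes "100M+" as a lower bound of 100 million, and the helper name entitiesPerSecond is made up for illustration.

```go
package main

import "fmt"

// secondsPerDay is the number of seconds in one day.
const secondsPerDay = 24 * 60 * 60

// entitiesPerSecond returns the average read rate implied by reading
// `entities` entities over `days` days. Hypothetical helper, not part of Beam.
func entitiesPerSecond(entities, days float64) float64 {
	return entities / (days * secondsPerDay)
}

func main() {
	// Assumption: "100M+" is treated as exactly 100 million (a lower bound).
	const entities = 100e6
	for _, days := range []float64{2, 3} {
		fmt.Printf("%.0f days: ~%.0f entities/s\n", days, entitiesPerSecond(entities, days))
	}
	// Prints:
	// 2 days: ~579 entities/s
	// 3 days: ~386 entities/s
}
```

      Under these assumptions the whole job averages only a few hundred entities per second across all workers.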

      My only hunch comes from the status of the datastore reader step in the Dataflow UI, where we see:

      Output collections
      DailyListingScore/main.queryFn.out0
      Elements added

      Estimated size

      I am assuming that these numbers indicate to Dataflow how much progress the step is making, and that it scales up or down depending on them. Is this right, or do these numbers have no bearing? We've tried starting the dataflow with 32+ workers, and it always scales down to 1-2 nodes after a couple of minutes. It seems as though Dataflow isn't scaling up when it should. Any direction or assistance in getting this issue solved would be great!
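
      One way to reason about the scale-down described above, offered as an illustration rather than a diagnosis: if each datastore split is read by a single worker at a time, the number of workers with read work can never exceed the number of splits. The sketch below is a toy model in plain Go, not Beam or Dataflow code; the round-robin assignment and the names assignShards and busyWorkers are assumptions made for the example.

```go
package main

import "fmt"

// assignShards distributes shard indices round-robin over workers and
// returns the list of shards each worker would read. Toy model only.
func assignShards(shards, workers int) [][]int {
	if workers <= 0 {
		return nil
	}
	out := make([][]int, workers)
	for s := 0; s < shards; s++ {
		w := s % workers
		out[w] = append(out[w], s)
	}
	return out
}

// busyWorkers counts the workers that received at least one shard.
func busyWorkers(assignment [][]int) int {
	n := 0
	for _, shards := range assignment {
		if len(shards) > 0 {
			n++
		}
	}
	return n
}

func main() {
	const workers = 32 // matches the "32+ workers" starting point in the ticket
	for _, shards := range []int{2, 32, 256} {
		busy := busyWorkers(assignShards(shards, workers))
		fmt.Printf("shards=%d workers=%d busy=%d\n", shards, workers, busy)
	}
}
```

      In this model, 2 shards keep only 2 of the 32 workers busy, while 32 or more shards can occupy all of them. Whether this matches what Dataflow actually does for this pipeline is not established by the ticket.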

       

      Thanks

      Attachments

        1. Screen Shot 2018-11-22 at 7.08.08 PM.png
          151 kB
          Braden Bassingthwaite
        2. Screen Shot 2018-11-22 at 7.08.32 PM.png
          212 kB
          Braden Bassingthwaite
        3. Screen Shot 2018-11-22 at 7.11.33 PM.png
          37 kB
          Braden Bassingthwaite

          People

            Assignee: lostluck Robert Burke
            Reporter: bbassingthwaite Braden Bassingthwaite
            Votes: 0
            Watchers: 2
