Beam / BEAM-10659

ParDo Python streaming load test times out on the 200-iteration case

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: testing
    • Labels: None

      Description

      The Python Dataflow load test, when run in streaming mode, times out on Jenkins in case 2:

      2GB 100 byte records 200 times
      

      This case applies the same ParDo step 200 times in sequence.

      The Jenkins job has a 2-hour timeout, and the second case is usually cancelled after 1 h 47 min. The most suspicious metric is throughput, which is not steady compared with other jobs: sometimes there is a single spike after an hour of inactivity, and sometimes there are several spikes (up to 30,000 elements/sec).

      The Python batch scenario takes ~56 minutes, with a steady throughput of ~7,000 elements/sec for almost the whole job run.

      In comparison, the same test case in Java takes ~6 minutes. There, throughput rises to ~100,000 elements/sec and then drops once all elements have been processed.
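
As a rough sanity check on the figures above, the reported durations can be compared with what the quoted throughputs imply. This is only a back-of-the-envelope sketch: it assumes that "2GB" means 2 * 10**9 bytes and that the throughput metric counts each input element once rather than once per ParDo step. Neither assumption is stated in the report.

```python
# Back-of-the-envelope check of the figures quoted in the description.
# Assumption: "2GB" means 2 * 10**9 bytes (the report does not say which unit).
TOTAL_BYTES = 2 * 10**9
RECORD_BYTES = 100
ITERATIONS = 200

records = TOTAL_BYTES // RECORD_BYTES   # number of input elements
dofn_calls = records * ITERATIONS       # element visits summed over all ParDo steps


def minutes_at(throughput_per_sec):
    """Minutes needed to push every input record through at a constant rate.

    Assumption: throughput counts each input element once, not once per step.
    """
    return records / throughput_per_sec / 60


python_batch_min = minutes_at(7_000)    # steady rate reported for Python batch
java_min = minutes_at(100_000)          # peak rate reported for Java

print(f"input records: {records:,}")
print(f"ParDo element visits: {dofn_calls:,}")
print(f"Python batch at 7,000 elements/sec: {python_batch_min:.1f} min")
print(f"Java at 100,000 elements/sec: {java_min:.1f} min")
```

Under these assumptions the Python batch estimate (about 48 minutes) is in the same range as the observed ~56 minutes. The Java estimate (about 3 minutes) is a lower bound, because 100,000 elements/sec is the peak rather than a sustained rate, which is compatible with the observed ~6 minutes. The streaming run has no comparable steady rate to plug in, which is what makes its throughput the suspicious metric.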

       

       

       

              People

              • Assignee: Unassigned
              • Reporter: Kasia Kucharczyk (kasiak)
