[KAFKA-13004] Trogdor performance decreases sharply with large numbers of tasks - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: tools
Labels: None
Environment: We run our Trogdor clusters within Kubernetes.

Description

As part of my performance tests, I am running 3000 workloads within Trogdor. The clients handle this without problems, but when I reset and run the same test again, Trogdor becomes sluggish.

Here are the steps to reproduce this:

1. Run 3000 workloads in Trogdor, a combination of Produce/Consume workloads.
2. Wait for the workloads to complete.
3. Run the DELETE API calls to destroy all 3000 workloads to reset for the next run.
4. Confirm via the API that there are no workloads defined in the system.
5. Run an additional 3000 workloads in Trogdor similar to step 1.
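
For anyone trying to reproduce this, the reset and verification steps above (the DELETE calls and the check that no workloads remain) could be scripted roughly as follows. This is only a sketch under assumptions, not the script I used: the Coordinator address, the `/coordinator/tasks?taskId=...` DELETE path, and the `{"tasks": {...}}` response shape should all be checked against the Trogdor version in use, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the Trogdor Coordinator REST server; adjust for your cluster.
COORDINATOR = "http://localhost:8889"


def destroy_task_request(task_id, base_url=COORDINATOR):
    """Build the DELETE request for one task (endpoint path is an assumption)."""
    query = urllib.parse.urlencode({"taskId": task_id})
    return urllib.request.Request(
        f"{base_url}/coordinator/tasks?{query}", method="DELETE"
    )


def destroy_all(task_ids, send=urllib.request.urlopen, base_url=COORDINATOR):
    """Send one DELETE per task id and return the number of requests sent."""
    sent = 0
    for task_id in task_ids:
        with send(destroy_task_request(task_id, base_url)) as response:
            response.read()  # drain the body so the connection is released
        sent += 1
    return sent


def remaining_task_count(tasks_json):
    """Count tasks in a task-listing response body (assumed shape: {"tasks": {...}})."""
    return len(json.loads(tasks_json).get("tasks", {}))
```

With hypothetical task ids, the reset would be `destroy_all(f"task-{i}" for i in range(3000))`, followed by fetching the task list and checking that `remaining_task_count(body)` is 0 before starting the second batch.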

The Coordinator takes a long time to start the second batch of 3000 workloads. There seems to be a performance issue in the framework that will take a while to debug. At this point I don't know whether it affects only the Coordinator or the Agents as well. I do not currently have the time to look into this, so I am creating this issue to track it.
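
To put a number on the slowdown, one option is to time each batch and compare the two runs. The sketch below is only an illustration: `submit` is a hypothetical callable that does whatever "starting a workload" means in a given test harness (for example, creating the task and waiting until it reports as running), and nothing here is part of Trogdor itself.

```python
import time


def time_batch(submit, task_ids, clock=time.monotonic):
    """Call submit(task_id) for every id and return the elapsed seconds."""
    start = clock()
    for task_id in task_ids:
        submit(task_id)
    return clock() - start


def slowdown(first_seconds, second_seconds):
    """Ratio of second-run to first-run duration; above 1 means the rerun was slower."""
    return second_seconds / first_seconds
```

Recording `time_batch(...)` for the first and second batches and reporting `slowdown(first, second)` would give a concrete figure to attach to this issue, and running the same measurement against the Agents could help show whether they are affected too.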

My current workaround is to destroy and recreate the Trogdor cluster between test runs.

People

Assignee: Scott Hendricks

Reporter: Scott Hendricks

Votes: 0

Watchers: 1

Dates

Created: 28/Jun/21 21:21

Updated: 28/Jun/21 21:21