Kafka / KAFKA-13004

Trogdor performance decreases sharply with large numbers of tasks


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: tools
    • Labels: None
    • Environment: We run our Trogdor clusters within Kubernetes.

    Description

      As part of my performance tests, I am running 3000 workloads within Trogdor. The clients handle this fine, but when I reset and run the same test again, Trogdor is noticeably slower.

      Here are the steps to reproduce this:

      1. Run 3000 workloads in Trogdor, a mix of Produce and Consume workloads.
      2. Wait for the workloads to complete.
      3. Run the DELETE API calls to destroy all 3000 workloads to reset for the next run.
      4. Confirm via the API that there are no workloads defined in the system.
      5. Run an additional 3000 workloads in Trogdor similar to step 1.
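
The reset in steps 3 and 4 can be scripted. The following is a minimal sketch, not taken from this report: it assumes the Coordinator is reachable at `http://localhost:8889` and exposes `POST /coordinator/task/create`, `DELETE /coordinator/tasks?taskId=...`, and `GET /coordinator/tasks`. The helper names (`build_request`, `send`, `reset`) are hypothetical, so check the paths against the Trogdor version in use.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Coordinator address; adjust for your cluster.
COORDINATOR = "http://localhost:8889"


def build_request(method, path, body=None, query=None):
    """Build (but do not send) an HTTP request for the Coordinator."""
    url = COORDINATOR + path
    if query:
        url += "?" + urllib.parse.urlencode(query)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send a request and decode the JSON response, if there is one."""
    with urllib.request.urlopen(request) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def create_task(task_id, spec):
    # Steps 1 and 5: submit one workload. The spec contents are up to the caller.
    return send(build_request("POST", "/coordinator/task/create",
                              body={"id": task_id, "spec": spec}))


def destroy_task(task_id):
    # Step 3: remove one workload from the Coordinator.
    return send(build_request("DELETE", "/coordinator/tasks",
                              query={"taskId": task_id}))


def reset(task_ids):
    """Destroy the given tasks, then return how many tasks are still listed."""
    for task_id in task_ids:
        destroy_task(task_id)
    # Step 4: confirm that nothing is left (assumes a top-level "tasks" map).
    listing = send(build_request("GET", "/coordinator/tasks")) or {}
    return len(listing.get("tasks", {}))
```

Calling `reset(ids)` after the first batch completes should return `0` before the second batch is submitted with `create_task`.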

      The Coordinator takes a long time to start the second batch of 3000 workloads. This points to a performance issue in the framework that will take a while to debug. At this point I don't know whether it affects only the Coordinator or the Agents as well. I do not currently have time to look into this, so I am creating this issue to track it.
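
One way to put a number on the slowdown is to time each submission in both batches and compare. The sketch below is an illustration under assumptions, not a measurement from this report: `submit` is any caller-supplied function that creates one workload, and request latency is only a proxy, since the delay may instead show up as time until tasks reach a running state.

```python
import statistics
import time


def time_batch(submit, task_ids):
    """Call submit(task_id) for each id; return per-call latencies in seconds."""
    latencies = []
    for task_id in task_ids:
        start = time.perf_counter()
        submit(task_id)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Summarize one batch: count, total, median, and worst-case latency."""
    ordered = sorted(latencies)
    return {
        "count": len(ordered),
        "total_s": sum(ordered),
        "median_s": statistics.median(ordered),
        "max_s": ordered[-1],
    }


def slowdown(first, second):
    """Ratio of second-batch total time to first-batch total time."""
    return sum(second) / sum(first)
```

Running `time_batch` once per batch and passing both results to `slowdown` gives a single ratio that can be attached to this issue. A value well above 1 after a reset would match the behavior described here.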

      My current workaround is to destroy and recreate the Trogdor cluster between test runs.

          People

            scott.hendricks Scott Hendricks
            Votes: 0
            Watchers: 1
