Apache MADlib / MADLIB-974

Path - performance testing


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v1.9.1
    • Component/s: Module: Utilities
    • Labels: None

    Description

      Story

      As a developer, I want to do performance testing on the Path algorithm so that I can understand and communicate scale effects to users.

      The proposed matrix for the 1st set of tests is:

      1) overall data size, i.e., number of rows in data sets = 1M, 10M, 100M
      2) number of partitions = 1k, 10k, 100k
      3) number of matches per partition = 1k, 10k, 100k

      The proposed matrix for the 2nd set of tests is:

      4) match "thickness", i.e., number of rows in match = 1, 1k, 10k
      5) number of symbols = 5, 15, 25

      Acceptance

      1) Please plot performance curves. To keep the size of the test matrix reasonable, it is not necessary to run all permutations.
      E.g., when plotting the effect of the number of partitions (#2 above), the data size can be fixed at 10M (say) and the number of matches per partition at 1k (say).
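
The one-factor-at-a-time approach described above could be enumerated roughly as follows. This is only a sketch: the factor levels come from the two proposed matrices, and the baselines for data size (10M) and matches per partition (1k) follow the example in the acceptance criteria. The remaining baselines (`partitions`, `rows_per_match`, `symbols`) and all function and variable names are assumptions made for illustration.

```python
# Factor levels copied from the two proposed test matrices.
FIRST_SET = {
    "rows": [1_000_000, 10_000_000, 100_000_000],
    "partitions": [1_000, 10_000, 100_000],
    "matches_per_partition": [1_000, 10_000, 100_000],
}
SECOND_SET = {
    "rows_per_match": [1, 1_000, 10_000],
    "symbols": [5, 15, 25],
}

# Values held fixed while another factor is swept.
BASELINE = {
    "rows": 10_000_000,              # from the example in the issue
    "matches_per_partition": 1_000,  # from the example in the issue
    "partitions": 10_000,            # assumption: not stated in the issue
    "rows_per_match": 1,             # assumption: not stated in the issue
    "symbols": 5,                    # assumption: not stated in the issue
}


def full_matrix_size(factors):
    """Number of runs needed if every permutation were tested."""
    size = 1
    for levels in factors.values():
        size *= len(levels)
    return size


def one_factor_sweeps(factors, baseline):
    """Yield (swept_factor, config) pairs, varying one factor at a time."""
    for name, levels in factors.items():
        for level in levels:
            config = {key: baseline[key] for key in factors}
            config[name] = level
            yield name, config


if __name__ == "__main__":
    for name, config in one_factor_sweeps(FIRST_SET, BASELINE):
        print(name, config)
```

Under these assumptions the first set needs 9 sweep runs instead of 27 full permutations, and the second set needs 6 instead of 9. The baseline configuration is emitted once per factor, so duplicate runs can be skipped when executing the plan.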

      Other

      1) The attached data set can be used as a baseline for duplication/fabrication.

      2) Another useful data set is at
      http://csr.lanl.gov/data/auth/
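
One way to scale a baseline data set by duplication is to replicate it with fresh partition keys, so that each copy forms its own partitions and the ordering within each partition is preserved. The sketch below assumes a hypothetical row layout of `(partition_key, order_key, payload)`. The actual columns of the attached data set are not given in this issue, and the function name is illustrative only.

```python
def replicate_partitions(baseline_rows, copies):
    """Yield the baseline rows `copies` times, with new partition keys per copy.

    Each row is a (partition_key, order_key, payload) tuple. This layout is a
    hypothetical stand-in for the real columns of the baseline data set.
    """
    rows = list(baseline_rows)
    for copy_idx in range(copies):
        for partition_key, order_key, payload in rows:
            # Suffix the key so every copy becomes a separate partition while
            # the row order inside the partition stays unchanged.
            yield (f"{partition_key}_{copy_idx}", order_key, payload)


# Tiny made-up baseline, for illustration only.
baseline = [
    ("user_a", 1, "LOGIN"),
    ("user_a", 2, "VIEW"),
    ("user_b", 1, "LOGIN"),
]
scaled = list(replicate_partitions(baseline, copies=3))
```

With this scheme the row count and the partition count grow together by the factor `copies`. Varying them independently, as the test matrix requires, would need a different fabrication step, such as concatenating copies inside the same partition with shifted order keys.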


          People

            Assignee: Xiaocheng Tang (xctang)
            Reporter: Frank McQuillan (fmcquillan)