Apache Arrow / ARROW-4313

Define general benchmark database schema

Details

    Description

      Possible attributes for the benchmark database to track, so that results from heterogeneous hardware and programming languages can be stored together:

      • Timestamp of benchmark run
      • Git commit hash of codebase
      • Machine unique name (sort of the "user id")
      • CPU identification for machine, and clock frequency (in case of overclocking)
      • CPU cache sizes (L1/L2/L3)
      • Whether or not CPU throttling is enabled (if it can be easily determined)
      • RAM size
      • GPU identification (if any)
      • Benchmark unique name
      • Programming language(s) associated with benchmark (e.g. a benchmark
        may involve both C++ and Python)
      • Benchmark time, plus mean and standard deviation if available, else NULL

      See the discussion on the mailing list: https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
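
One way the attributes listed in the description could map onto tables is sketched below, using SQLite through Python's standard `sqlite3` module. This is only an illustration: the issue names attributes, not a schema, so every table name, column name, and unit here (`machine`, `benchmark_run`, `time_seconds`, and so on) is an assumption, as is the choice of SQLite. It is not the data model in the attached `benchmark-data-model.erdplus`.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# derived only from the attribute list in the issue description.
SCHEMA = """
CREATE TABLE machine (
    machine_id       INTEGER PRIMARY KEY,
    machine_name     TEXT NOT NULL UNIQUE,  -- unique name, the "user id"
    cpu_model        TEXT NOT NULL,         -- CPU identification
    cpu_frequency_hz INTEGER,               -- actual clock, in case of overclocking
    l1_cache_bytes   INTEGER,
    l2_cache_bytes   INTEGER,
    l3_cache_bytes   INTEGER,
    cpu_throttling   INTEGER,               -- 1 or 0; NULL if it cannot be determined
    ram_bytes        INTEGER,
    gpu_model        TEXT                   -- NULL if the machine has no GPU
);

CREATE TABLE benchmark (
    benchmark_id   INTEGER PRIMARY KEY,
    benchmark_name TEXT NOT NULL UNIQUE
);

-- A benchmark may involve several languages (e.g. both C++ and Python).
CREATE TABLE benchmark_language (
    benchmark_id INTEGER NOT NULL REFERENCES benchmark (benchmark_id),
    language     TEXT NOT NULL,
    PRIMARY KEY (benchmark_id, language)
);

CREATE TABLE benchmark_run (
    run_id         INTEGER PRIMARY KEY,
    benchmark_id   INTEGER NOT NULL REFERENCES benchmark (benchmark_id),
    machine_id     INTEGER NOT NULL REFERENCES machine (machine_id),
    run_timestamp  TEXT NOT NULL,           -- ISO 8601 string
    git_commit     TEXT NOT NULL,           -- commit hash of the codebase
    time_seconds   REAL NOT NULL,           -- benchmark time
    mean_seconds   REAL,                    -- NULL if not available
    stddev_seconds REAL                     -- NULL if not available
);
"""


def create_database(path=":memory:"):
    """Create the sketch schema and return an open connection."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Keeping machine attributes in their own table means each run only needs to reference a machine by id, and the separate `benchmark_language` table lets one benchmark be associated with more than one language.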

      Attachments

        1. benchmark-data-model.erdplus
          15 kB
          Tanya Schlusser
        2. benchmark-data-model.png
          146 kB
          Tanya Schlusser

            People

              tanya Tanya Schlusser
              wesm Wes McKinney

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 19h 50m