Avro / AVRO-752

Java: Enhanced Performance Test Suite

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels: None

      Description

      The Perf.java performance test can be improved. Notably, it is only useful for testing Decoders at the moment. Upgrading it to run tests more consistently, support both read (Decoder) and write (Encoder) tests, and be more flexible in general will help us performance tune and spot regressions more easily.

      1. AVRO-752.patch
        47 kB
        Scott Carey
      2. AVRO-752.patch-2
        3 kB
        Scott Carey

        Activity

        Scott Carey added a comment (edited):

        Major refactoring of the Perf test.
        Includes write (Encoder) performance tests, more read tests, and more consistency across all tests.
        Names are simplified and many more command-line parameters are available.
        If you do not specify a valid command line, the output is:

        Usage: Perf { -nowrite | -noread | -basic | -i | -ls | -l | -f | -d | -b | -by | -s | -record | -R | -Rv | -Rr | -Rd | -Ro | -Rp | -generic | -G | -Gn | -Gd | -Go | -Gp | -generic-onetime | -Gotd | -Gotr | -Got }
        
         -nowrite   (do not execute write tests)
         -noread   (do not execute read tests)
         -basic   (executes all basic tests):
              -i  (IntTest)
              -ls  (SmallLongTest)
              -l  (LongTest)
              -f  (FloatTest)
              -d  (DoubleTest)
              -b  (BoolTest)
              -by  (BytesTest)
              -s  (StringTest)
         -record   (executes all record tests):
              -R  (RecordTest)
              -Rv  (ValidatingRecord)
              -Rr  (ResolvingRecord)
              -Rd  (RecordWithDefault)
              -Ro  (RecordWithOutOfOrder)
              -Rp  (RecordWithPromotion)
         -generic   (executes all generic tests):
              -G  (GenericTest)
              -Gn  (GenericNested)
              -Gd  (GenericWithDefault)
              -Go  (GenericWithOutOfOrder)
              -Gp  (GenericWithPromotion)
         -generic-onetime   (executes all generic-onetime tests):
              -Gotd  (GenericOneTimeDecoderUse)
              -Gotr  (GenericOneTimeReaderUse)
              -Got  (GenericOneTimeUse)
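
As a rough illustration of how the group flags relate to the individual flags in the usage text above, here is a minimal sketch. It is not the code in Perf.java: the class name `PerfFlagSketch`, the `TABLE` layout, and the `expand` method are all hypothetical, and only the flag and test names are taken from the usage output. The sketch also assumes that unknown flags and the -nowrite / -noread switches simply select no tests, whereas the real tool prints the usage text for an invalid command line.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only; this is not the code in Perf.java.
final class PerfFlagSketch {

  // Each row: {group flag, individual flag, test name}, copied from the usage text.
  static final String[][] TABLE = {
    {"-basic", "-i", "IntTest"},
    {"-basic", "-ls", "SmallLongTest"},
    {"-basic", "-l", "LongTest"},
    {"-basic", "-f", "FloatTest"},
    {"-basic", "-d", "DoubleTest"},
    {"-basic", "-b", "BoolTest"},
    {"-basic", "-by", "BytesTest"},
    {"-basic", "-s", "StringTest"},
    {"-record", "-R", "RecordTest"},
    {"-record", "-Rv", "ValidatingRecord"},
    {"-record", "-Rr", "ResolvingRecord"},
    {"-record", "-Rd", "RecordWithDefault"},
    {"-record", "-Ro", "RecordWithOutOfOrder"},
    {"-record", "-Rp", "RecordWithPromotion"},
    {"-generic", "-G", "GenericTest"},
    {"-generic", "-Gn", "GenericNested"},
    {"-generic", "-Gd", "GenericWithDefault"},
    {"-generic", "-Go", "GenericWithOutOfOrder"},
    {"-generic", "-Gp", "GenericWithPromotion"},
    {"-generic-onetime", "-Gotd", "GenericOneTimeDecoderUse"},
    {"-generic-onetime", "-Gotr", "GenericOneTimeReaderUse"},
    {"-generic-onetime", "-Got", "GenericOneTimeUse"},
  };

  // Expands group and individual flags into test names, in argument order,
  // without duplicates. Flags that match no row contribute nothing.
  static List<String> expand(String... args) {
    Set<String> selected = new LinkedHashSet<>();
    for (String arg : args) {
      for (String[] row : TABLE) {
        if (arg.equals(row[0]) || arg.equals(row[1])) {
          selected.add(row[2]);
        }
      }
    }
    return new ArrayList<>(selected);
  }

  public static void main(String[] args) {
    // Example: one group flag plus one individual flag.
    System.out.println(expand("-generic-onetime", "-i"));
  }
}
```

Keeping the flags in one table means a group flag and its member flags cannot drift apart, which is one plausible way to keep the usage text and the dispatch logic consistent.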
        
        Scott Carey added a comment:

        To run Perf.java via Maven, use the Maven exec plugin.

        For information on the plugin:

        mvn exec:help
        mvn exec:help -Dgoal=java

        See also http://mojo.codehaus.org/exec-maven-plugin/usage.html

        To run only the write tests:

        cd lang/java/avro
        mvn exec:java -Dexec.mainClass="org.apache.avro.io.Perf" -Dexec.classpathScope=test -Dexec.args="-noread"
        

        We can simplify this later with a profile in the pom that sets up the class and classpath, or wire it up another way.

        Doug Cutting added a comment:

        +1 This looks like a good improvement and works for me.

        Ideally we might run performance benchmarks regularly on trunk and graph the results. Hudson could do this for us, I think.

        Scott Carey added a comment:

        Great. I'm going to commit this and leave the ticket open for a few days in case anyone finds minor glitches or has enhancements to add.

        Scott Carey added a comment:

        This patch follows on from the previous one.
        The changes are:

        • Tests run longer by default, which gives more consistent results.
        • Output formatting is improved, and an additional column reports the size of the encoded data for each test.

        Output now looks like:

        Executing tests: 
        [IntTest, SmallLongTest, LongTest, FloatTest, DoubleTest, BoolTest, BytesTest, StringTest, RecordTest, ValidatingRecord, ResolvingRecord, RecordWithDefault, RecordWithOutOfOrder, RecordWithPromotion, GenericTest, GenericNested, GenericWithDefault, GenericWithOutOfOrder, GenericWithPromotion, GenericOneTimeDecoderUse, GenericOneTimeReaderUse, GenericOneTimeUse]
         readTests:false
         writeTests:true
         cycles=800
                            test name     time    M entries/sec   M bytes/sec  bytes/cycle
                             IntWrite:   5372 ms      37.229        93.717        629325
                       SmallLongWrite:   5502 ms      36.349        91.501        629325
                            LongWrite:   9874 ms      20.254        88.494       1092275
                           FloatWrite:   7337 ms      27.259       109.035       1000000
                          DoubleWrite:  14798 ms      13.515       108.117       2000000
                         BooleanWrite:   2111 ms      94.707        94.707        250000
                           BytesWrite:   2843 ms      14.066       499.879       1776937
                          StringWrite:  12230 ms       3.270       116.487       1780910
                          RecordWrite:  12538 ms       2.659       103.178       1617069
                ValidatingRecordWrite:  14139 ms       2.358        91.495       1617069
                         GenericWrite:  10777 ms       1.546        60.013        808498
                  GenericNested_Write:  12837 ms       1.298        50.384        808498
        
        

        If there are no objections by tomorrow, I will commit this change and close the ticket.
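
The two throughput columns in the table above appear to follow from the time, cycles, and bytes/cycle values, with "M" meaning 10^6. The sketch below checks that reading against the IntWrite row. It is an illustration, not the reporting code in Perf.java: the class `ThroughputSketch` and the method `millionsPerSecond` are hypothetical names. The figure of 250,000 entries per cycle is not printed in the output. It is inferred from the BooleanWrite row, where bytes/cycle is 250000 and the two rates are equal, on the assumption that each boolean is encoded as one byte and that IntWrite uses the same entry count.

```java
// Hypothetical sketch only; this is not the reporting code in Perf.java.
final class ThroughputSketch {

  // Returns (perCycle * cycles) per second, in units of 10^6.
  static double millionsPerSecond(long perCycle, int cycles, double elapsedMillis) {
    double total = (double) perCycle * cycles;
    double seconds = elapsedMillis / 1000.0;
    return total / seconds / 1_000_000.0;
  }

  public static void main(String[] args) {
    int cycles = 800;                 // "cycles=800" from the sample output
    double elapsedMillis = 5372.0;    // IntWrite time column
    long bytesPerCycle = 629_325L;    // IntWrite bytes/cycle column
    long entriesPerCycle = 250_000L;  // assumption inferred from the BooleanWrite row

    double mBytes = millionsPerSecond(bytesPerCycle, cycles, elapsedMillis);
    double mEntries = millionsPerSecond(entriesPerCycle, cycles, elapsedMillis);

    // Compare with the reported 93.717 M bytes/sec and 37.229 M entries/sec.
    // Small differences are expected because the printed time is rounded to ms.
    System.out.printf("M bytes/sec   = %.3f%n", mBytes);
    System.out.printf("M entries/sec = %.3f%n", mEntries);
  }
}
```

Under these assumptions, the computed values agree with the reported IntWrite figures to within rounding of the printed time.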

        Scott Carey added a comment:

        The second change was committed in revision 1068726.


          People

          • Assignee: Scott Carey
          • Reporter: Scott Carey