Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: 1.0
    • Component/s: Parser
    • Labels:
      None

      Description

      Add isDelimiter(c) and isEncapsulator(c) to CSVLexer

      1. CSV-71.patch
        3 kB
        Benedikt Ritter
      2. Emmanuels_Performance_Test.patch
        3 kB
        Benedikt Ritter

        Activity

        Hide
        Benedikt Ritter added a comment -

        I've added a patch for this issue.

        Show
        Benedikt Ritter added a comment - I've added a patch for this issue.
        Hide
        Emmanuel Bourg added a comment -

        How is the performance with this patch?

        Show
        Emmanuel Bourg added a comment - How is the performance with this patch?
        Hide
        Benedikt Ritter added a comment - - edited

        On my machine the performance tests takes around 5 secs. I get the following results:

          min max avg
        before 5103 5260 5170
        after 5091 5238 5142

        At least on my machine there is no negative impact. I ran the test with VM param -server. I'm using Java version: 1.7.0_01, vendor: Oracle Corporation

        Show
        Benedikt Ritter added a comment - - edited On my machine the performance tests takes around 5 secs. I get the following results:   min max avg before 5103 5260 5170 after 5091 5238 5142 At least on my machine there is no negative impact. I ran the test with VM param -server. I'm using Java version: 1.7.0_01, vendor: Oracle Corporation
        Hide
        Sebb added a comment -

        What performance test is that? On my system the tests take about 3 times as long!
        But that is using Java 1.6 on an old XP laptop.

        Can you attach the code; it would help to ensure we use the same tests.

        Show
        Sebb added a comment - What performance test is that? On my system the tests take about 3 times as long! But that is using Java 1.6 on an old XP laptop. Can you attach the code; it would help to ensure we use the same tests.
        Hide
        Benedikt Ritter added a comment -

        The one that Emmanuel posted a while ago on the ML. I'm using Windows 7 on a laptop with Core i3 M350 @ 2.27GHz and 4GB RAM. If the patch is that much slower on your system, it is likely that it will be slow on others. So we shouldn't apply it.

        I've added Emmanuel's test to the ticket.

        Show
        Benedikt Ritter added a comment - The one that Emmanuel posted a while ago on the ML. I'm using Windows 7 on a laptop with Core i3 M350 @ 2.27GHz and 4GB RAM. If the patch is that much slower on your system, it is likely that it will be slow on others. So we shouldn't apply it. I've added Emmanuel's test to the ticket.
        Hide
        Sebb added a comment -

        I meant that all the tests take about 3 times longer (including the current code).

        I've not tried the patch yet so I don't know if it is faster or slower or about the same on my system.

        Show
        Sebb added a comment - I meant that all the tests take about 3 times longer (including the current code). I've not tried the patch yet so I don't know if it is faster or slower or about the same on my system.
        Hide
        Benedikt Ritter added a comment -

        Ah okay, I got that wrong.
        Then I guess it has something to do with your system. The tests take about the same time, they took a few days ago, on my machine.

        Show
        Benedikt Ritter added a comment - Ah okay, I got that wrong. Then I guess it has something to do with your system. The tests take about the same time, they took a few days ago, on my machine.
        Hide
        Sebb added a comment -

        Just tried on my system, and I see similar performance both with and without the patch.

        Likewise if the patch is extended to check whether the character is enabled or not, and using class fields, for example:

        private boolean isEscape(int c)

        { return escaping && c == escape; }
        Show
        Sebb added a comment - Just tried on my system, and I see similar performance both with and without the patch. Likewise if the patch is extended to check whether the character is enabled or not, and using class fields, for example: private boolean isEscape(int c) { return escaping && c == escape; }
        Hide
        Benedikt Ritter added a comment -

        Feel free to add that, if you want to apply the patch. Or do you want me to modify the patch and re-submit it?

        Show
        Benedikt Ritter added a comment - Feel free to add that, if you want to apply the patch. Or do you want me to modify the patch and re-submit it?
        Hide
        Sebb added a comment -

        Fixed. In my testing, the new CSVLexer is about 4% faster than the old one (kept temporarily as CSVServer1 in the test tree).

        To test:

        set classpath=target/classes;target/test-classes
        java -server org.apache.commons.csv.PerformanceTest 10 CSVServer CSVServer1
        
        Show
        Sebb added a comment - Fixed. In my testing, the new CSVLexer is about 4% faster than the old one (kept temporarily as CSVServer1 in the test tree). To test: set classpath=target/classes;target/test-classes java -server org.apache.commons.csv.PerformanceTest 10 CSVServer CSVServer1

          People

          • Assignee:
            Unassigned
            Reporter:
            Benedikt Ritter
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development