Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels:
      None

      Description

      When solr is writing the response of large cached documents, the bottleneck is string encoding.
      a buffered writer implementation that doesn't do any synchronization could offer some good speedups.

      1. fastwriter.patch
        15 kB
        Yonik Seeley
      2. SOLR-377-phpresponsewriter.patch
        1 kB
        Pieter Berkel

        Activity

        Hide
        Yonik Seeley added a comment -

        attaching patch... adds an optimized unsynchronized buffered writer, changes some ResponseWriters use of strings to characters, removes buffering of string in JSON, etc.

        Speed differences with very large documents:
        json: 24% faster
        ruby: 500% faster (ruby didn't buffer in a StringBuilder like JSON did)
        python: 0% (bottleneck for these huge fields is buffering in the StringBuilder to see if we should prepend a 'u'... always prepending a 'u' and not buffering resulted in a ~20% improvement)
        xml: 8% faster

        With smaller documents, the speedups are likely to be greater because small writes like value separators would matter more.

        If there are no objections, I'll commit in a few days.

        Show
        Yonik Seeley added a comment - attaching patch... adds an optimized unsynchronized buffered writer, changes some ResponseWriters use of strings to characters, removes buffering of string in JSON, etc. Speed differences with very large documents: json: 24% faster ruby: 500% faster (ruby didn't buffer in a StringBuilder like JSON did) python: 0% (bottleneck for these huge fields is buffering in the StringBuilder to see if we should prepend a 'u'... always prepending a 'u' and not buffering resulted in a ~20% improvement) xml: 8% faster With smaller documents, the speedups are likely to be greater because small writes like value separators would matter more. If there are no objections, I'll commit in a few days.
        Hide
        Yonik Seeley added a comment -

        committed.

        Show
        Yonik Seeley added a comment - committed.
        Hide
        Pieter Berkel added a comment -

        Sorry I've been a bit slow catching up with this issue. Please find attached a trival patch to PHPResponseWriter.java that takes advantage of the new FastWriter code, it should provide speed improvements similar to the JSON writer (perhaps slightly less).

        No fastwriter optimisation is necessary for PHPSerializedResponseWriter as there is no need to escape strings before they are written.

        Show
        Pieter Berkel added a comment - Sorry I've been a bit slow catching up with this issue. Please find attached a trival patch to PHPResponseWriter.java that takes advantage of the new FastWriter code, it should provide speed improvements similar to the JSON writer (perhaps slightly less). No fastwriter optimisation is necessary for PHPSerializedResponseWriter as there is no need to escape strings before they are written.
        Hide
        Yonik Seeley added a comment -

        Thanks Pieter, I just committed the PHP changes.

        Show
        Yonik Seeley added a comment - Thanks Pieter, I just committed the PHP changes.
        Hide
        Dave Lewis added a comment -

        After this patch, using PHPSerializedResponseWriter returns output that is unreadable by my PHP application. I know that doesn't make any sense, but I'm looking into it now.

        Show
        Dave Lewis added a comment - After this patch, using PHPSerializedResponseWriter returns output that is unreadable by my PHP application. I know that doesn't make any sense, but I'm looking into it now.
        Hide
        Yonik Seeley added a comment -

        What container are you using?
        Jetty used to have a bug where the Writer they return to the servlet had issues with chars > 127 if you used writer.write(string,off,len)

        Show
        Yonik Seeley added a comment - What container are you using? Jetty used to have a bug where the Writer they return to the servlet had issues with chars > 127 if you used writer.write(string,off,len)
        Hide
        Yonik Seeley added a comment -

        FYI, I haven't been able to reproduce any problems along these lines using the Jetty version that's bundled (and I set the FastWriter buffer size artificially low to exercise the boundary handling).

        Show
        Yonik Seeley added a comment - FYI, I haven't been able to reproduce any problems along these lines using the Jetty version that's bundled (and I set the FastWriter buffer size artificially low to exercise the boundary handling).
        Hide
        Yonik Seeley added a comment -

        OK, I think it was a lack of flushing the buffer in the FastWriter.
        I've checked in a patch... can you try with the trunk version?

        Show
        Yonik Seeley added a comment - OK, I think it was a lack of flushing the buffer in the FastWriter. I've checked in a patch... can you try with the trunk version?
        Hide
        Dave Lewis added a comment -

        That appears to have been it, trunk works great! Thanks!

        Show
        Dave Lewis added a comment - That appears to have been it, trunk works great! Thanks!
        Hide
        Hoss Man added a comment -

        This bug was modified as part of a bulk update using the criteria...

        • Marked "Resolved" and "Fixed"
        • Had no "Fix Version" versions
        • Was listed in the CHANGES.txt for 1.3 as of today 2008-03-15

        The Fix Version for all 29 issues found was set to 1.3, email notification was suppressed to prevent excessive email.

        For a list of all the issues modified, search jira comments for this (hopefully) unique string: batch20070315hossman1

        Show
        Hoss Man added a comment - This bug was modified as part of a bulk update using the criteria... Marked "Resolved" and "Fixed" Had no "Fix Version" versions Was listed in the CHANGES.txt for 1.3 as of today 2008-03-15 The Fix Version for all 29 issues found was set to 1.3, email notification was suppressed to prevent excessive email. For a list of all the issues modified, search jira comments for this (hopefully) unique string: batch20070315hossman1

          People

          • Assignee:
            Unassigned
            Reporter:
            Yonik Seeley
          • Votes:
            1 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development