Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: master (7.0), 6.2
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

Description

There are improvements we can make on top of LUCENE-7396 to get even better flush performance.

Attachments

1. LUCENE-7399.patch (43 kB), Adrien Grand
2. LUCENE-7399.patch (40 kB), Adrien Grand

Activity

        jpountz Adrien Grand added a comment -

        Here is a patch; it contains two main improvements:

        • MutablePointsWriter gets packed values from the ByteBlockPool through a BytesRef rather than copying; this saves a lot of memcpy and seems to improve flush time by about 14% on the IndexTaxis benchmark (see the sketch below).
        • Radix sort/select gets the ability to detect prefixes that are longer than one byte in a single pass, which seems to help by another 11%, probably by being more cache-efficient.

        Overall the flush time of IndexTaxis with a 1GB buffer went down from 19.1 secs to 14.6 secs on my machine (-24%).
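
For readers less familiar with the internals, here is a minimal sketch of the first idea with simplified, hypothetical names (not the actual MutablePointsWriter/ByteBlockPool code); it assumes fixed-length packed values that never straddle a block boundary. BytesRef's public bytes/offset/length fields let the reader hand out a view into the pool instead of copying each value:

{code:java}
import org.apache.lucene.util.BytesRef;

// Simplified stand-in for a pool of byte blocks holding packed point values.
final class PackedValueBlocks {
  private final byte[][] blocks;       // backing buffers
  private final int blockSize;         // size of each buffer
  private final int packedBytesLength; // fixed length of one packed value

  PackedValueBlocks(byte[][] blocks, int blockSize, int packedBytesLength) {
    this.blocks = blocks;
    this.blockSize = blockSize;
    this.packedBytesLength = packedBytesLength;
  }

  // Before: copy the value out of the pool, one memcpy per read.
  void readValue(long start, byte[] scratch) {
    System.arraycopy(blocks[(int) (start / blockSize)], (int) (start % blockSize),
        scratch, 0, packedBytesLength);
  }

  // After: repoint a reusable BytesRef at the pool's buffer, no copy at all.
  void getValue(long start, BytesRef result) {
    result.bytes = blocks[(int) (start / blockSize)];
    result.offset = (int) (start % blockSize);
    result.length = packedBytesLength;
  }
}
{code}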

        mikemccand Michael McCandless added a comment -

        I did a before/after test indexing 1.2 B NYC taxi rides with luceneserver (https://github.com/mikemccand/luceneserver) and this patch improved total indexing throughput from 301.9 K docs/sec to 327.6 K docs/sec!

        I'll review the patch ... the changes sound great.

        mikemccand Michael McCandless added a comment -

        Maybe the visitor should also take BytesRef? Codec impls could
        read a whole byte[] values block in at once; maybe that's a savings.
        We can explore that separately in a v3 issue! We could also fix
        BKDWriter.writeCommonPrefixes to save the copy there, though
        that's just once per leaf block.

        Shouldn't assertHistogram be called inside an assert? It
        seems to be called directly now.
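
For context, this is the standard idiom being suggested, shown as a hypothetical sketch (the method name matches the discussion, the class and body do not): make the checker return true so the whole call can live inside an assert statement and costs nothing when assertions are disabled.

{code:java}
final class HistogramSketch {
  private final int[] histogram = new int[256];

  // Always returns true so it is only ever invoked as: assert assertHistogram(...)
  private boolean assertHistogram(int from, int to) {
    int total = 0;
    for (int count : histogram) {
      total += count;
    }
    assert total == to - from : "histogram covers " + total + " values, expected " + (to - from);
    return true;
  }

  void buildHistogram(int from, int to) {
    // ... fill the histogram for the slice [from, to) ...
    assert assertHistogram(from, to); // entirely skipped when -ea is not set
  }
}
{code}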

        It looks like you also removed Sorter.insertionSort in favor of
        binarySort. Maybe add a javadoc to Sorter.binarySort saying
        it's O(N^2) and is only used once we have recursed down to <= 20 items
        to sort?

        Have you tweaked 20 to see if that's a good value? Sorting BKD points
        is rather costly since when we swap, we swap whole values (docID,
        maybe ord, then the byte[] value for this field).
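
A minimal sketch of the pattern being discussed, with hypothetical names (only the cutoff constant corresponds to the 20 mentioned above): the recursive sort falls back to an O(N^2) binary-insertion sort for small slices, and the right cutoff depends on how expensive compare and swap are.

{code:java}
// Hypothetical sketch, not Lucene's Sorter: shows the small-slice cutoff.
abstract class SketchSorter {
  // Below this many items, fall back to binary-insertion sort (the "20" above).
  static final int BINARY_SORT_THRESHOLD = 20;

  // For BKD points, compare looks at the byte[] value and swap moves the
  // docID, optional ord and the whole byte[] value, so swaps are costly.
  protected abstract int compare(int i, int j);
  protected abstract void swap(int i, int j);

  void sort(int from, int to) {
    if (to - from <= BINARY_SORT_THRESHOLD) {
      binarySort(from, to); // O(N^2) swaps, but tiny constant factors
      return;
    }
    int pivot = partition(from, to);
    sort(from, pivot);
    sort(pivot + 1, to);
  }

  // Binary search for the insertion point, then shift down with adjacent swaps.
  void binarySort(int from, int to) {
    for (int i = from + 1; i < to; i++) {
      int lo = from, hi = i;
      while (lo < hi) {
        int mid = (lo + hi) >>> 1;
        if (compare(i, mid) < 0) {
          hi = mid;
        } else {
          lo = mid + 1;
        }
      }
      for (int j = i; j > lo; j--) {
        swap(j - 1, j);
      }
    }
  }

  // Simple Lomuto partition using the last element as the pivot.
  private int partition(int from, int to) {
    int pivot = to - 1;
    int i = from;
    for (int j = from; j < pivot; j++) {
      if (compare(j, pivot) < 0) {
        swap(i, j);
        i++;
      }
    }
    swap(i, pivot);
    return i;
  }
}
{code}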

        jpountz Adrien Grand added a comment -

        Here is a new patch: I fixed assertHistogram to be called inside an assertion and added the suggested docs.

        Maybe the visitor should also take BytesRef? Codec impls could read a whole byte[] values block in at once

        I am not sure codecs could leverage this. I think a serious codec impl would do prefix compression to save space, so it could not read a large byte[] anyway: at every iteration it would need to concatenate the shared prefix and the suffix that is specific to that value.
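
To make that concern concrete, a hypothetical sketch of a prefix-compressed leaf block (not an actual codec): each value has to be re-assembled from the shared prefix and its own suffix, so there is no contiguous byte[] of whole values that a BytesRef could simply point into.

{code:java}
import org.apache.lucene.util.BytesRef;

// Hypothetical prefix-compressed block: one shared prefix per block plus
// per-value suffixes. Every read has to materialize prefix + suffix.
final class PrefixCompressedBlock {
  private final byte[] sharedPrefix;
  private final byte[][] suffixes;
  private final byte[] scratch; // reused buffer holding the assembled value

  PrefixCompressedBlock(byte[] sharedPrefix, byte[][] suffixes, int packedBytesLength) {
    this.sharedPrefix = sharedPrefix;
    this.suffixes = suffixes;
    this.scratch = new byte[packedBytesLength];
    System.arraycopy(sharedPrefix, 0, scratch, 0, sharedPrefix.length);
  }

  // Concatenate the shared prefix and the i-th suffix at every iteration.
  BytesRef value(int i) {
    System.arraycopy(suffixes[i], 0, scratch, sharedPrefix.length, suffixes[i].length);
    return new BytesRef(scratch, 0, sharedPrefix.length + suffixes[i].length);
  }
}
{code}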

        We could also fix BKDWriter.writeCommonPrefixes to save the copy there, though that's just once per leaf block.

        I remember trying it out and it didn't help.

        Have you tweaked 20 to see if that's a good value? Sorting BKD points is rather costly since when we swap, we swap whole values (docID, maybe ord, then the byte[] value for this field).

        I remember tweaking it a long time ago when I worked on this Sorter abstraction; values in [20,50] looked fine when sorting a simple int[] (so both comparisons and swaps were cheap), so I picked 20 to err on the safe side. It's true it might be different with points that have costly swaps.

        mikemccand Michael McCandless added a comment -

        +1

        Took me a bit to understand the histograms!

        jira-bot ASF subversion and git services added a comment -

        Commit 33197ec98afbd3be8f9bf48aa4556fcbfab3f37e in lucene-solr's branch refs/heads/branch_6x from Adrien Grand
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=33197ec ]

        LUCENE-7399: Speed up flush of points.

        jira-bot ASF subversion and git services added a comment -

        Commit 9fc4624853d170b5982a0d0c9d9e76a477a2b713 in lucene-solr's branch refs/heads/master from Adrien Grand
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9fc4624 ]

        LUCENE-7399: Speed up flush of points.

        jpountz Adrien Grand added a comment -

        Thanks Mike.

        mikemccand Michael McCandless added a comment -

        Bulk close resolved issues after 6.2.0 release.


People

    • Assignee: jpountz Adrien Grand
    • Reporter: jpountz Adrien Grand
    • Votes: 0
    • Watchers: 3
