Lucene - Core
LUCENE-1750

Create a MergePolicy that limits the maximum size of its segments

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 2.4.1
    • Fix Version/s: 3.2, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Basically I'm trying to create largish 2-4GB shards using
      LogByteSizeMergePolicy; however, the attached unit test shows
      segments that exceed maxMergeMB.

      The goal is for segments to be merged up to 2GB, at which point
      all merging into that segment stops and another 2GB segment is
      started. This helps when replicating in Solr, where a single
      optimized 60GB segment brings the machine to a halt through IO
      and CPU starvation.

      Attachments

      1. LUCENE-1750.patch (2 kB) - Jason Rutherglen

        Activity

        Michael McCandless added a comment -

        TieredMergePolicy does this...
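
        For readers who want the concrete setting: below is a minimal
        sketch of the TieredMergePolicy approach this comment points to,
        assuming a recent Lucene release (constructors, module layout,
        and defaults differ in the 3.2/4.0 versions listed on this
        issue). The 2 GB cap and the index path are illustrative values
        taken from the description, not recommendations.

          import java.nio.file.Paths;

          import org.apache.lucene.analysis.standard.StandardAnalyzer;
          import org.apache.lucene.index.IndexWriter;
          import org.apache.lucene.index.IndexWriterConfig;
          import org.apache.lucene.index.TieredMergePolicy;
          import org.apache.lucene.store.Directory;
          import org.apache.lucene.store.FSDirectory;

          public class CappedSegmentExample {
            public static void main(String[] args) throws Exception {
              // Cap the size of merged segments at roughly 2 GB, per the description above.
              TieredMergePolicy mergePolicy = new TieredMergePolicy();
              mergePolicy.setMaxMergedSegmentMB(2048.0);

              IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
              config.setMergePolicy(mergePolicy);

              // "/tmp/capped-index" is a placeholder path for this sketch.
              try (Directory dir = FSDirectory.open(Paths.get("/tmp/capped-index"));
                   IndexWriter writer = new IndexWriter(dir, config)) {
                // Add documents as usual; ordinary merging will not grow segments
                // far beyond the configured cap (forced merges are a separate story).
              }
            }
          }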

        Shai Erera added a comment -

        > I think we can refine this to only merge contiguous segments that are sharing doc stores

        So in this case it means that segment A will remain smaller than 4 GB and will never get merged (b/c segments B and C reached their limit)?

        Jason Rutherglen added a comment -

        > We cannot merge A w/ D, because the doc IDs need to be in
        increasing order and retain the order they were added to the
        index?

        The segments are merged in order because they may be sharing doc
        stores. I think we can refine this to only merge contiguous
        segments that are sharing doc stores, otherwise we can merge
        non-contiguous segments which continues with LUCENE-1076?

        When the shards are in their own directories (which is how Katta
        works), the building process is somewhat easier as we're dealing
        with a separate segmentInfos for each shard. I am not sure how
        Solr would handle an index sharded into multiple directories.

        Shai Erera added a comment -

        > we could add an optimize(long maxSegmentSize)

        This I think would be useful anyway, and kind of required if we introduce the proposed merge policy. Otherwise, if someone's code calls optimize (w/ or w/o num segments limit), those large segments will be optimized as well.

        > except if it accumulates too many deletes (as a percentage of docs) then it can be compacted and new segments merged into it?

        If one calls expungeDeletes and that segment goes below the max size, then it will be eligible for merging, right? But I have a question here, and it may be that I'm missing something in the merge process. Say I have the following segments, each at 4 GB (the limit), except D:
        A (docs 0-99), B (docs 100-230), C (docs 231-450) and D (docs 451-470). Then A accumulates 50 deletes. On one hand, we'd want it to be merged, but if we want that, we have to merge B and C as well, right? We cannot merge A w/ D, because the doc IDs need to be in increasing order and retain the order they were added to the index?

        So will the merge policy detect that? I think it should, and the way to work around it is to ensure that the first segment below the limit triggers the merge of all following segments (in doc ID order), regardless of their size?

        I don't know if your patch already takes care of this case, and whether my understanding is correct, so if you already handle it that way (or some other way), then that's fine.

        Jason Rutherglen added a comment -

        > Wouldn't you want them to be merged into an even larger
        segment?

        I think once the segment reaches the limit (i.e. 4GB), it's
        effectively done and nothing more happens to it, except if it
        accumulates too many deletes (as a percentage of docs) then it
        can be compacted and new segments merged into it?

        First of all, as we reach the capacity of the machine's IO
        and RAM, large segment merges thrash the machine (the IO
        cache is ruined and must be restored, IO is unavailable for
        searches, further indexing stops), and the segments become
        too large to pass between servers (e.g. Hadoop, Katta, or
        Solr's replication).

        I'm not sure how much search degrades with 10-20 larger
        segments as opposed to a single massive 60GB segment. But if
        search is unavailable on a machine due to the CPU and IO
        thrashing of massive segment merges, it seems like a fair
        tradeoff?

        I think optimize remains as is although I would never call it.
        Or we could add an optimize(long maxSegmentSize) method which is
        analogous to optimize(int maxSegments).
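
        As a point of comparison, here is a small sketch of the two
        entry points being contrasted, assuming a current Lucene
        release where optimize(int) has become forceMerge(int). The
        size-capped overload is only the hypothetical method proposed
        in this comment and does not exist in Lucene.

          import org.apache.lucene.analysis.standard.StandardAnalyzer;
          import org.apache.lucene.index.IndexWriter;
          import org.apache.lucene.index.IndexWriterConfig;
          import org.apache.lucene.store.ByteBuffersDirectory;

          public class OptimizeOverloadSketch {
            public static void main(String[] args) throws Exception {
              try (ByteBuffersDirectory dir = new ByteBuffersDirectory();
                   IndexWriter writer =
                       new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                // Existing API: merge down to at most 10 segments, with no cap on their size.
                writer.forceMerge(10);

                // Hypothetical counterpart proposed above (not a real Lucene method):
                // merge freely, but never produce a segment larger than ~4 GB.
                // writer.optimize(4L << 30);
              }
            }
          }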

        Shai Erera added a comment -

        What happens after several such large segments are created? Wouldn't you want them to be merged into an even larger segment? Or, you'll have many such segments and search performance will degrade.

        I guess I never thought this was a problem. If I have enough disk space, and my index reaches 600 GB (which is a huge index) split across 10 segments of 60GB each, I guess I'd want them merged into one larger 600GB segment. It will take eons until I accumulate another such 600 GB segment, no?

        Maybe we can have two merge factors: 1) for small segments, or up to a set size threshold, we do the merges regularly. 2) For really large segments, the merge factor is different. For example, we can say that up to 1GB the merge factor is 10, and beyond that it is 20. That will postpone the large IO merges until enough such segments accumulate.

        Also, w/ the current proposal, how will optimize work? Will it skip the very large segments, or will they be included too?
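
        A tiny illustration of the two-merge-factor idea a couple of
        paragraphs up; the helper, the 1 GB threshold, and the 10/20
        factors are all hypothetical values from this comment, not
        anything Lucene's merge policies actually expose.

          public class TwoMergeFactors {
            // 1 GB threshold and the 10/20 factors are the example numbers from this comment.
            static final long SMALL_SEGMENT_LIMIT_BYTES = 1L << 30;

            /** Merge small segments eagerly (factor 10); wait for more large ones (factor 20). */
            static int mergeFactorFor(long segmentSizeBytes) {
              return segmentSizeBytes <= SMALL_SEGMENT_LIMIT_BYTES ? 10 : 20;
            }

            public static void main(String[] args) {
              System.out.println(mergeFactorFor(200L << 20));  // 200 MB segment -> 10
              System.out.println(mergeFactorFor(5L << 30));    // 5 GB segment   -> 20
            }
          }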

        Jason Rutherglen added a comment -

        Yeah I realized that later.

        So a new merge policy that inherits from LogByteSizeMergePolicy
        and keeps a segment size limit will work. Ideally, once a
        segment gets near enough to the limit, segments will stop being
        merged into it. This was easier when the shards were in separate
        directories (i.e. fill up the directory, stop when it's at the
        limit, optimize the directory, and move on).
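
        To make the selection rule concrete, here is a self-contained
        sketch of the size-capped, contiguous grouping described in this
        thread. It deliberately avoids the real MergePolicy /
        LogByteSizeMergePolicy API (whose findMerges signature has
        changed across versions) and only shows the bookkeeping; the
        4 GB cap is the figure used in the discussion.

          import java.util.ArrayList;
          import java.util.List;

          public class SizeCappedSelection {
            static final long MAX_MERGED_BYTES = 4L << 30;  // 4 GB cap from this thread

            /** Group adjacent segment sizes into merges whose combined size stays under the cap. */
            static List<List<Long>> selectMerges(List<Long> segmentSizes) {
              List<List<Long>> merges = new ArrayList<>();
              List<Long> group = new ArrayList<>();
              long groupBytes = 0;
              for (long size : segmentSizes) {
                if (size >= MAX_MERGED_BYTES) {
                  // Segment already at the cap: it is "done" and never merged again.
                  flush(merges, group);
                  group = new ArrayList<>();
                  groupBytes = 0;
                  continue;
                }
                if (groupBytes + size > MAX_MERGED_BYTES) {
                  // Adding this segment would push the result over the cap: start a new group.
                  flush(merges, group);
                  group = new ArrayList<>();
                  groupBytes = 0;
                }
                group.add(size);
                groupBytes += size;
              }
              flush(merges, group);
              return merges;
            }

            private static void flush(List<List<Long>> merges, List<Long> group) {
              if (group.size() > 1) {  // a lone segment has nothing to merge with
                merges.add(group);
              }
            }

            public static void main(String[] args) {
              // The 4 GB segment is left alone; the smaller ones are grouped under the cap.
              System.out.println(selectMerges(
                  List.of(4L << 30, 1L << 30, 1L << 30, 1L << 30, 500L << 20)));
            }
          }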

        Michael McCandless added a comment -

        maxMergeMB refers to the max size of segments selected for merging, not the max size of the resulting merged segment.
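
        In other words, with the stock LogByteSizeMergePolicy the output
        of a merge can approach mergeFactor times maxMergeMB. Below is a
        short sketch of that arithmetic using the 2 GB / factor-10
        values discussed in this issue; setMaxMergeMB and setMergeFactor
        are the existing setters, though defaults vary by version.

          import org.apache.lucene.index.LogByteSizeMergePolicy;

          public class MaxMergeMBInputVsOutput {
            public static void main(String[] args) {
              LogByteSizeMergePolicy mergePolicy = new LogByteSizeMergePolicy();
              mergePolicy.setMaxMergeMB(2048);  // limits the segments *selected* for a merge...
              mergePolicy.setMergeFactor(10);   // ...but 10 such segments can still be merged together

              // So the *result* of a merge can approach mergeFactor * maxMergeMB,
              // which is why the attached test sees segments larger than maxMergeMB.
              double worstCaseOutputMB = 10 * 2048.0;
              System.out.println("merged segment can approach " + worstCaseOutputMB + " MB");
            }
          }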

        Jason Rutherglen added a comment -

        Unit test illustrating the issue.


          People

          • Assignee: Unassigned
          • Reporter: Jason Rutherglen
          • Votes: 1
          • Watchers: 3


          Time Tracking

          • Original Estimate: 48h
          • Remaining Estimate: 48h
          • Time Spent: Not Specified
