Solr / SOLR-6841

Visualize lucene segment info in admin

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      We find it useful to tune the merge policy not blindly, but by looking at segment size and fill ratio.

      We're working on a patch that adds a tab to the admin page with McCandless-style segment visualization.

      A draft UI is attached (currently as part of admin.extra).

      Please share your thoughts on whether it's OK to put the code in the core admin.

      More details here
      http://search-lucene.com/m/QTPa44cNJ1

      https://plus.google.com/+MichaelMcCandless/posts/MJVueTznYnD
      http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

      1. (old)Overview page.png
        66 kB
        Alexey Kozhemiakin
      2. segments_info_merge_candidates.png
        7 kB
        Michal Bienkowski
      3. segments_info.png
        46 kB
        Michal Bienkowski
      4. SOLR-6841.patch
        26 kB
        Shalin Shekhar Mangar
      5. SOLR-6841.patch
        25 kB
        Michal Bienkowski

        Activity

        Alexey Kozhemiakin added a comment - - edited

        Work-in-progress UI screenshot attached to demonstrate the idea

        Shalin Shekhar Mangar added a comment -

        Looks great! Thanks Alexey.

        "Please share your ideas if it's ok to put the code in core admin."

        It depends on the code. Let's see the patch first and then discuss.

        Hoss Man added a comment -

        very cool - would definitely welcome that as a patch to integrate directly into the solr admin.

        I suspect that auto-generating the "size" scale in such a way that the numbers don't look silly is a hard problem – particularly since it looks like you are trying to use a log scale as well. Maybe just label the "max" (i.e. the size of the biggest segment) on the horizontal axis, along with the vertical log lines, and then label each segment with its size (or use tooltips)?
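
The labelling idea above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patch: the class `SegmentBarScale`, its method names, and the pixel-width parameter are all hypothetical. It maps a segment size to a bar width on a log scale relative to the largest segment, and formats a per-segment size label suitable for a tooltip.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Solr or the patch.
final class SegmentBarScale {

    /**
     * Maps a segment size to a bar width in pixels on a log10 scale,
     * where the largest segment gets the full width.
     */
    static int barWidth(long sizeBytes, long maxSizeBytes, int maxWidthPx) {
        if (sizeBytes <= 0 || maxSizeBytes <= 0) {
            return 0; // nothing to draw for empty or unknown sizes
        }
        // The +1 keeps the logarithm positive even for a 1-byte segment.
        double ratio = Math.log10(sizeBytes + 1.0) / Math.log10(maxSizeBytes + 1.0);
        return (int) Math.round(Math.min(1.0, ratio) * maxWidthPx);
    }

    /** Formats a byte count as a short label, e.g. for a tooltip. */
    static String label(long bytes) {
        String[] units = {"B", "KB", "MB", "GB", "TB"};
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.1f %s", value, units[unit]);
    }
}
```

Labelling only the maximum on the axis and putting each segment's own size in its label avoids having to pick "nice" tick values for a log axis.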

        Shawn Heisey added a comment -

        That looks really nice. Thank you for working on this!

        Hoss' idea for tooltips sounds good to me. If all the segment detail is on a tooltip, the UI will be very clean.

        Here's a series of additional ideas: It would be really nice to be able to change the sort on the bars. The two primary choices that spring to mind are segment size and the numeric segment number. Some might even want to sort by maxdoc and/or deleted documents. If it's not super-difficult, a sort option that groups the bars by "these segments are the current candidates for the next merge" would satisfy the original requirements nicely.
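
The sort options suggested above could look something like the sketch below. The `Seg` class and its fields are hypothetical stand-ins for whatever the handler returns. Parsing the segment name as a base-36 number after the leading underscore is an assumption about how Lucene names segments (for example `_a`, `_1f`).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; Seg is not a Lucene or Solr class.
final class SegmentSortSketch {

    /** Minimal view of one segment; the fields are assumptions for this example. */
    static final class Seg {
        final String name;
        final long sizeBytes;
        final int maxDoc;
        final int delCount;

        Seg(String name, long sizeBytes, int maxDoc, int delCount) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.maxDoc = maxDoc;
            this.delCount = delCount;
        }
    }

    /** Assumes names look like "_1f": an underscore followed by a base-36 counter. */
    static long segmentNumber(String name) {
        return Long.parseLong(name.substring(1), Character.MAX_RADIX);
    }

    static final Comparator<Seg> BY_SIZE_DESC =
            Comparator.comparingLong((Seg s) -> s.sizeBytes).reversed();

    static final Comparator<Seg> BY_NUMBER =
            Comparator.comparingLong(s -> segmentNumber(s.name));

    static final Comparator<Seg> BY_DELETES_DESC =
            Comparator.comparingInt((Seg s) -> s.delCount).reversed();

    /** Returns segment names in the requested order without modifying the input. */
    static List<String> sortedNames(List<Seg> segs, Comparator<Seg> order) {
        List<Seg> copy = new ArrayList<>(segs);
        copy.sort(order);
        List<String> names = new ArrayList<>();
        for (Seg s : copy) {
            names.add(s.name);
        }
        return names;
    }
}
```

Sorting by the parsed number rather than by the raw string matters because `_1f` would otherwise sort before `_2`.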

        Michal Bienkowski added a comment - - edited

        Lucene segment info implementation as a separate menu entry in Solr Admin and separate handler registered as Luke handler in PluginRegister. See attachment.

        Michal Bienkowski added a comment -

        Segments info with tooltip - screenshot. See attachment.

        Alexey Kozhemiakin added a comment -

        Michal, thanks for uploading the patch.

        Community, please share your feedback. Most comments have been implemented (tooltips, cleaned-up scale and number formatting).

        If the current solution is OK from an architecture point of view (a separate sub-handler under the Luke handler), we will proceed with sorting options (by age, size) and highlighting (these segments are candidates for the next merge - MergePolicy.findMerges(...)).

        Shalin Shekhar Mangar added a comment -

        Thanks Michal and Alexey.

        The solution looks fine. The separate request handler is going to be an internal detail and we can always change that part. What was the reason behind moving this to a separate page by itself instead of keeping it in the "Overview" admin-extras area?

        Alexey Kozhemiakin added a comment -

        Here's what we considered:
        1) admin-extra is usually a placeholder for custom code/markup and comes empty in the default packages, so it is definitely not a good place to land our code
        2) if you meant just placing the bars where admin-extra is now rendered: the number of segments is usually 30-50, which would make the Overview page annoyingly long and noisy (attached a screenshot once again)
        3) that's why we came up with a separate page, keeping in mind that at some point it could host more low-level index details.

        Alexey Kozhemiakin added a comment - - edited

        As for "candidates for merge": AFAIK findMerges returns the segments that are ready to be merged according to the current configuration, and returns nothing until that moment. So essentially it will start returning non-empty "candidates" only when a merge is already running in the background. I will double-check tomorrow whether this is true. If so, it will be an "ongoing merge" section rather than "candidates".

        Michal Bienkowski added a comment -

        Patch with the merge candidates highlighting feature. It retrieves merge candidates from MergePolicy.findMerges and applies this information to the matching segments.
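
The matching step described here might be sketched as below. To stay self-contained, the example takes the proposed merges as plain lists of segment names instead of calling Lucene. In the real handler those names would presumably be read from whatever MergePolicy.findMerges returns, which is an assumption and is not shown. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the code from the attached patch.
final class MergeCandidateSketch {

    /**
     * Flags each segment that appears in at least one proposed merge.
     *
     * @param segmentNames   names of all segments, in display order
     * @param proposedMerges one list of segment names per proposed merge;
     *                       null is treated as "no merges proposed"
     * @return segment name mapped to true if it is a merge candidate
     */
    static Map<String, Boolean> markCandidates(List<String> segmentNames,
                                               List<List<String>> proposedMerges) {
        Set<String> candidates = new HashSet<>();
        if (proposedMerges != null) {
            for (List<String> merge : proposedMerges) {
                candidates.addAll(merge);
            }
        }
        // LinkedHashMap preserves the display order of the segments.
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : segmentNames) {
            result.put(name, candidates.contains(name));
        }
        return result;
    }
}
```

Treating a missing result as "no candidates" lets the UI render the same way whether or not the policy currently proposes a merge.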

        Varun Thacker added a comment -

        This is cool!

        While playing around I observed that when there are no segments in the index, the Segments Info page says:
        Size 0
        Deletions: NaN%

        In the future it would be awesome if we could click on a segment to get more granular information about it, such as the number of fields in the segment, the number of terms in each field, the memory consumed by the segment, etc.

        Shalin Shekhar Mangar added a comment -

        Patch to bring this in sync with trunk. Changes:

        1. Implicit handlers are now registered in ImplicitPlugins in trunk
        2. LUCENE-6307 renamed SegmentInfo.getDocCount -> .maxDoc
        3. Added check for document count == 0 before calculating deletion %
        4. IntelliJ was complaining about unterminated javascript statements so I added semi-colon characters

        I'll commit this shortly.
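
The zero-document check in item 3 addresses the "Deletions: NaN%" output reported earlier, since dividing zero by zero in floating point yields NaN. A minimal sketch of such a guard follows. The class and method names are hypothetical, and returning 0 for an empty index is an assumption about the desired display value.

```java
// Hypothetical helper for illustration; not the code from the committed patch.
final class DeletionRatio {

    /**
     * Returns the percentage of deleted documents, or 0 when there are
     * no documents at all (avoiding the 0/0 = NaN case).
     */
    static double deletionPercent(int delCount, int maxDoc) {
        if (maxDoc <= 0) {
            return 0.0; // empty index: nothing can be deleted
        }
        return 100.0 * delCount / maxDoc;
    }
}
```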

        ASF subversion and git services added a comment -

        Commit 1665105 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1665105 ]

        SOLR-6841: Visualize lucene segment information in Admin UI

        Shalin Shekhar Mangar added a comment -

        Thanks Alexey and Michal for the work and everyone else for the reviews!

        This will be released with 5.1. Let's open new issues for any further enhancements (sorting etc.) that may be required.

        ASF subversion and git services added a comment -

        Commit 1665106 from shalin@apache.org in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1665106 ]

        SOLR-6841: Visualize lucene segment information in Admin UI

        Timothy Potter added a comment -

        Bulk close after 5.1 release


          People

          • Assignee:
            Shalin Shekhar Mangar
            Reporter:
            Alexey Kozhemiakin
          • Votes:
            0
            Watchers:
            10
