Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.2, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Make DocSet implement Accountable so that we can correctly estimate memory usage.

      1. SOLR-7371.patch
        6 kB
        Yonik Seeley
      2. SOLR-7371.patch
        10 kB
        Shalin Shekhar Mangar

        Issue Links

          Activity

          Hide
          Shalin Shekhar Mangar added a comment -
          1. DocSet implements Accountable
          2. Corresponding changes in all sub-classes viz. BitDocSet, DocSlice, HashDocSet and SortedIntDocSet

          Are we worried about back-compat here? This is definitely expert level API but I can see some custom search components using it. Maybe I should make DocSetBase implement Accountable instead?

          Show
          Shalin Shekhar Mangar added a comment - DocSet implements Accountable Corresponding changes in all sub-classes viz. BitDocSet, DocSlice, HashDocSet and SortedIntDocSet Are we worried about back-compat here? This is definitely expert level API but I can see some custom search components using it. Maybe I should make DocSetBase implement Accountable instead?
          Hide
          Yonik Seeley added a comment -

          I think it's fine if DocSet implements Accountable - I doubt anyone has their own DocSet implementations.

          Show
          Yonik Seeley added a comment - I think it's fine if DocSet implements Accountable - I doubt anyone has their own DocSet implementations.
          Hide
          Shalin Shekhar Mangar added a comment -

          Okay, thanks Yonik. I'll commit this shortly.

          Show
          Shalin Shekhar Mangar added a comment - Okay, thanks Yonik. I'll commit this shortly.
          Hide
          ASF subversion and git services added a comment -

          Commit 1672391 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1672391 ]

          SOLR-7371: Make DocSet implement Accountable to estimate memory usage

          Show
          ASF subversion and git services added a comment - Commit 1672391 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1672391 ] SOLR-7371 : Make DocSet implement Accountable to estimate memory usage
          Hide
          ASF subversion and git services added a comment -

          Commit 1672392 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1672392 ]

          SOLR-7371: Make DocSet implement Accountable to estimate memory usage

          Show
          ASF subversion and git services added a comment - Commit 1672392 from shalin@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1672392 ] SOLR-7371 : Make DocSet implement Accountable to estimate memory usage
          Hide
          Mark Miller added a comment -

          I'm seeing NPE's in lots of the tests after this change.

          Show
          Mark Miller added a comment - I'm seeing NPE's in lots of the tests after this change.
          Hide
          Shalin Shekhar Mangar added a comment -

          Shit. I'll fix.

          Show
          Shalin Shekhar Mangar added a comment - Shit. I'll fix.
          Hide
          ASF subversion and git services added a comment -

          Commit 1672402 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1672402 ]

          SOLR-7371: Fix NPE when scores are null

          Show
          ASF subversion and git services added a comment - Commit 1672402 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1672402 ] SOLR-7371 : Fix NPE when scores are null
          Hide
          Yonik Seeley added a comment -

          I can fix if Shalin isn't around.
          I also want to change up some things... I don't think we should pre-calculate the ram size (thus increasing the size of smaller DocSet objects. I also think we shouldn't bother with the slower alignment checks. It doesn't pay to worry about alignment (0 to 7 bytes off?) when we're dealing with bitsets of size maxDoc. It's completely in the noise and will be much smaller than stuff we totally ignore like cache keys and the datastructures in the HashMap (for example) itself.

          Show
          Yonik Seeley added a comment - I can fix if Shalin isn't around. I also want to change up some things... I don't think we should pre-calculate the ram size (thus increasing the size of smaller DocSet objects. I also think we shouldn't bother with the slower alignment checks. It doesn't pay to worry about alignment (0 to 7 bytes off?) when we're dealing with bitsets of size maxDoc. It's completely in the noise and will be much smaller than stuff we totally ignore like cache keys and the datastructures in the HashMap (for example) itself.
          Hide
          ASF subversion and git services added a comment -

          Commit 1672403 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1672403 ]

          SOLR-7371: Fix NPE when scores are null

          Show
          ASF subversion and git services added a comment - Commit 1672403 from shalin@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1672403 ] SOLR-7371 : Fix NPE when scores are null
          Hide
          Shalin Shekhar Mangar added a comment -

          I fixed the NPE. Sorry for the noise guys.

          I don't think we should pre-calculate the ram size (thus increasing the size of smaller DocSet objects

          These are immutable objects, right? The ram usage will be checked on each 'put' of a cache (see SOLR-7372) so I think it makes sense to just pre-calculate? In fact, I am surprised that FixedBitSet doesn't pre-calculate the ram usage.

          Show
          Shalin Shekhar Mangar added a comment - I fixed the NPE. Sorry for the noise guys. I don't think we should pre-calculate the ram size (thus increasing the size of smaller DocSet objects These are immutable objects, right? The ram usage will be checked on each 'put' of a cache (see SOLR-7372 ) so I think it makes sense to just pre-calculate? In fact, I am surprised that FixedBitSet doesn't pre-calculate the ram usage.
          Hide
          Yonik Seeley added a comment -

          Proposed patch.

          Show
          Yonik Seeley added a comment - Proposed patch.
          Hide
          Yonik Seeley added a comment -

          We create a lot of these objects that do not go into any cache... probably more that don't than do.
          And if we keep things really fast (and avoid slow modulo arithmetic), we should be fine? For example the original code (memSize) did a single cycle operation... it would not have been faster to cache.

          Show
          Yonik Seeley added a comment - We create a lot of these objects that do not go into any cache... probably more that don't than do. And if we keep things really fast (and avoid slow modulo arithmetic), we should be fine? For example the original code (memSize) did a single cycle operation... it would not have been faster to cache.
          Hide
          Shalin Shekhar Mangar added a comment -

          Okay, that makes sense.

          Your patch has the following in BitDocSet, is this a copy/paste mistake?

          -  private static final long BASE_RAM_BYTES_USED = RamUsageEstimator.shallowSizeOfInstance(BitDocSet.class);
          +  private static final long BASE_RAM_BYTES_USED = RamUsageEstimator.shallowSizeOfInstance(SortedIntDocSet.class) + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER;
          
          Show
          Shalin Shekhar Mangar added a comment - Okay, that makes sense. Your patch has the following in BitDocSet, is this a copy/paste mistake? - private static final long BASE_RAM_BYTES_USED = RamUsageEstimator.shallowSizeOfInstance(BitDocSet.class); + private static final long BASE_RAM_BYTES_USED = RamUsageEstimator.shallowSizeOfInstance(SortedIntDocSet.class) + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER;
          Hide
          ASF subversion and git services added a comment -

          Commit 1672421 from Yonik Seeley in branch 'dev/trunk'
          [ https://svn.apache.org/r1672421 ]

          SOLR-7371: don't cache size for now and tolerate small alignment errors

          Show
          ASF subversion and git services added a comment - Commit 1672421 from Yonik Seeley in branch 'dev/trunk' [ https://svn.apache.org/r1672421 ] SOLR-7371 : don't cache size for now and tolerate small alignment errors
          Hide
          ASF subversion and git services added a comment -

          Commit 1672424 from Yonik Seeley in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1672424 ]

          SOLR-7371: don't cache size for now and tolerate small alignment errors

          Show
          ASF subversion and git services added a comment - Commit 1672424 from Yonik Seeley in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1672424 ] SOLR-7371 : don't cache size for now and tolerate small alignment errors
          Hide
          ASF subversion and git services added a comment -

          Commit 1672425 from Yonik Seeley in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1672425 ]

          SOLR-7371: don't cache size for now and tolerate small alignment errors

          Show
          ASF subversion and git services added a comment - Commit 1672425 from Yonik Seeley in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1672425 ] SOLR-7371 : don't cache size for now and tolerate small alignment errors
          Hide
          Shalin Shekhar Mangar added a comment -

          Thank you for fixing the base calculations, Yonik.

          One question for my own understanding – the shallowSizeOfInstance method will add an object reference size for any array instance but then we add an array header size which includes an object reference size too. So we are adding an extra object reference size, right? Though that's a small over-calculation compared to the size of objects we will be measuring so it's not a big deal.

          Show
          Shalin Shekhar Mangar added a comment - Thank you for fixing the base calculations, Yonik. One question for my own understanding – the shallowSizeOfInstance method will add an object reference size for any array instance but then we add an array header size which includes an object reference size too. So we are adding an extra object reference size, right? Though that's a small over-calculation compared to the size of objects we will be measuring so it's not a big deal.
          Hide
          Yonik Seeley added a comment -

          then we add an array header size which includes an object reference size too.

          I don't think it does?

                NUM_BYTES_ARRAY_HEADER = (int) alignObjectSize(NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT);
          

          So the array header is just a specialized object header since an array also contains a length.

          Show
          Yonik Seeley added a comment - then we add an array header size which includes an object reference size too. I don't think it does? NUM_BYTES_ARRAY_HEADER = ( int ) alignObjectSize(NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT); So the array header is just a specialized object header since an array also contains a length.
          Hide
          Shalin Shekhar Mangar added a comment -

          Sorry, you're right. I got confused between the NUM_BYTES_OBJECT_HEADER and NUM_BYTES_OBJECT_REF.

          Show
          Shalin Shekhar Mangar added a comment - Sorry, you're right. I got confused between the NUM_BYTES_OBJECT_HEADER and NUM_BYTES_OBJECT_REF.
          Hide
          Shalin Shekhar Mangar added a comment -

          This was released in 5.2

          Show
          Shalin Shekhar Mangar added a comment - This was released in 5.2

            People

            • Assignee:
              Yonik Seeley
              Reporter:
              Shalin Shekhar Mangar
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development