Lucene - Core / LUCENE-2832

On Windows 64-bit, maybe we should default to a better maxBBufSize in MMapDirectory

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, Trunk
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Currently, the default max buffer size for MMapDirectory is 256MB on 32-bit and Integer.MAX_VALUE on 64-bit:

      public static final int DEFAULT_MAX_BUFF = Constants.JRE_IS_64BIT ? Integer.MAX_VALUE : (256 * 1024 * 1024);
      

      But on 64-bit Windows, you are practically limited to 8TB of address space. This can cause problems in extreme cases, such as: http://www.lucidimagination.com/search/document/7522ee54c46f9ca4/map_failed_at_getsearcher

      Perhaps it would be good to change this default so that it is 256MB on 32-bit OR Windows, but leave it at Integer.MAX_VALUE
      on other 64-bit and "64-bit" (48-bit) systems.
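The proposed default can be sketched as a small selection function. This is a minimal illustration only, not a patch: the class `MaxBuffDefault` and the method `proposedDefaultMaxBuff` are hypothetical names, and the platform detection in `main` reads system properties directly as a rough assumption rather than reusing Lucene's `Constants`.

```java
public class MaxBuffDefault {
    // 256MB, the current default on 32-bit JREs.
    static final int SMALL_BUFF = 256 * 1024 * 1024;

    // Hypothetical helper: 256MB on 32-bit JREs OR on Windows,
    // Integer.MAX_VALUE on every other 64-bit platform.
    static int proposedDefaultMaxBuff(boolean jreIs64Bit, boolean isWindows) {
        return (jreIs64Bit && !isWindows) ? Integer.MAX_VALUE : SMALL_BUFF;
    }

    public static void main(String[] args) {
        // Rough detection from system properties (an assumption for this sketch;
        // some 64-bit architectures do not have "64" in os.arch).
        boolean isWindows = System.getProperty("os.name", "").startsWith("Windows");
        boolean is64Bit = System.getProperty("os.arch", "").contains("64");
        System.out.println(proposedDefaultMaxBuff(is64Bit, isWindows));
    }
}
```

Keeping the decision in a pure function of two booleans makes the three cases (32-bit, 64-bit Windows, other 64-bit) easy to check without running on each platform.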

        Activity

        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.

        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Robert Muir added a comment -

        I am removing 3.1 as I think it's the safest option.

        We can revisit if someone is willing to test parameters on enormous indexes (200GB, 500GB, 1TB, ...)
        otherwise we are just guessing.

        Robert Muir added a comment -

        In this case, it's very extreme: the user had 1.1 billion documents on one Windows server.

        I am not sure if this issue will even help anyone at all: will a smaller buffer really help fragmentation in these cases?
        The user never responded to my suggestion to change the buffer size.

        I think a good option here is to do nothing at all, but I'm not opposed to reducing the buffer if it will actually help,
        mainly because the MultiMMapIndexInput is sped up and it shouldn't cause as much slowdown as before.
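To make the trade-off concrete, the sketch below counts how many mapped buffers a single file of a given length would need for a given maximum buffer size. It is a simplified ceiling division under stated assumptions, not Lucene's actual chunking code: the class `MappingCount` and the method `mappingsNeeded` are hypothetical, and a real index is spread over many files rather than one. It does not answer the fragmentation question; it only shows how the number of mappings grows as the buffer shrinks.

```java
public class MappingCount {
    static final long MB = 1L << 20;
    static final long GB = 1L << 30;

    // Hypothetical helper: number of buffers of at most maxBuffSize bytes
    // needed to cover fileLength bytes (ceiling division).
    static long mappingsNeeded(long fileLength, long maxBuffSize) {
        return (fileLength + maxBuffSize - 1) / maxBuffSize;
    }

    public static void main(String[] args) {
        // Index sizes mentioned in the comment above, treated as one file each.
        long[] sizes = {200 * GB, 500 * GB, 1024 * GB};
        for (long size : sizes) {
            System.out.println((size / GB) + "GB: "
                + mappingsNeeded(size, 256 * MB) + " buffers at 256MB, "
                + mappingsNeeded(size, Integer.MAX_VALUE) + " buffers at Integer.MAX_VALUE");
        }
    }
}
```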

        Uwe Schindler added a comment -

        Sorry, my last comment was stupid, as 1/8 of 8TB is still larger than Integer.MAX_VALUE (I was thinking of Long.MAX_VALUE).

        I still have no idea why this fails, as 8 TB of address space should be enough for thousands of 2 GB blocks.
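The arithmetic behind both statements can be checked with a few lines. This is a sketch that takes the 8TB figure from this issue at face value; the class `AddressSpaceMath` and its methods are hypothetical names used only for illustration.

```java
public class AddressSpaceMath {
    static final long GB = 1L << 30;
    static final long TB = 1L << 40;

    // How many blocks of blockSize bytes fit into addressSpace bytes.
    static long blocksThatFit(long addressSpace, long blockSize) {
        return addressSpace / blockSize;
    }

    // Whether a size in bytes is too large to be expressed as an int.
    static boolean exceedsIntRange(long bytes) {
        return bytes > Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long win64AddressSpace = 8 * TB; // practical limit cited in this issue

        // 1/8 of 8TB is 1TB, far beyond what an int buffer size can hold.
        System.out.println(exceedsIntRange(win64AddressSpace / 8)); // true

        // 8TB divided into 2GB blocks.
        System.out.println(blocksThatFit(win64AddressSpace, 2 * GB)); // 4096
    }
}
```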

        Uwe Schindler added a comment -

        I would suggest using a different default for Win64, as the address space is not as small as with 32-bit. How about something like 4 GB or 16 GB?

        Also, for 32-bit we use 1/8 of the possible address space, so why not the same (1/8) for Win64?


          People

          • Assignee: Robert Muir
          • Reporter: Robert Muir
          • Votes: 0
          • Watchers: 1
