Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-8321

Allow composite readers to have more than 2B documents

Details

    • Improvement
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • None
    • None
    • New

    Description

      I would like to start discussing removing the limit of ~2B documents that we have for indices, while still enforcing it at the segment level for practical reasons.

      Postings, stored fields, and all other codec APIs would keep working on integers to represent doc ids. Only top-level doc ids and numbers of documents would need to move to a long. I say "only" because we now mostly consume indices per-segment, but there is still a number of places where we identify documents by their top-level doc ID like IndexReader#document, top-docs collectors, etc.

      Attachments

        Activity

          People

            Unassigned Unassigned
            jpountz Adrien Grand
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: