  1. Lucene - Core
  2. LUCENE-1340

Make it possible not to include TF information in index

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4
    • Component/s: core/index
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      Term frequency is typically not needed for every field. Some CPU (reading one VInt less and one X>>>1...) and IO can be saved by making pure boolean fields possible in Lucene. This topic has already been discussed and accepted as part of Flexible Indexing... This issue tries to push things forward a bit faster, as I have some concrete customer demands.
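      For illustration, a minimal sketch of where the "one VInt less and one X>>>1" saving comes from. It assumes the classic postings layout, in which each entry stores the doc delta shifted left by one bit, with the low bit flagging freq == 1 and an extra VInt for any other frequency. The class and method names (PostingsSketch, encode, decode) are hypothetical and are not taken from the patch.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class PostingsSketch {
    // Write a non-negative int as a VInt: 7 data bits per byte, high bit set when more bytes follow.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Read one VInt starting at pos[0] and advance the cursor.
    static int readVInt(byte[] data, int[] pos) {
        byte b = data[pos[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = data[pos[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Encode sorted doc IDs. With omitTf, only the raw doc delta is written.
    static byte[] encode(int[] docs, int[] freqs, boolean omitTf) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lastDoc = 0;
        for (int i = 0; i < docs.length; i++) {
            int delta = docs[i] - lastDoc;
            lastDoc = docs[i];
            if (omitTf) {
                writeVInt(out, delta);            // no flag bit, no frequency
            } else if (freqs[i] == 1) {
                writeVInt(out, (delta << 1) | 1); // low bit set means freq == 1
            } else {
                writeVInt(out, delta << 1);       // low bit clear: frequency follows
                writeVInt(out, freqs[i]);
            }
        }
        return out.toByteArray();
    }

    // Decode count postings; returns {docs, freqs}. With omitTf every freq is reported as 1.
    static int[][] decode(byte[] data, int count, boolean omitTf) {
        int[] docs = new int[count];
        int[] freqs = new int[count];
        int[] pos = {0};
        int doc = 0;
        for (int i = 0; i < count; i++) {
            int code = readVInt(data, pos);
            if (omitTf) {
                doc += code;                      // the shift and the second VInt are skipped
                freqs[i] = 1;
            } else {
                doc += code >>> 1;                // the "X>>>1" from the description
                freqs[i] = (code & 1) != 0 ? 1 : readVInt(data, pos);
            }
            docs[i] = doc;
        }
        return new int[][] {docs, freqs};
    }

    public static void main(String[] args) {
        int[] docs = {3, 7, 20};
        int[] freqs = {1, 4, 1};
        System.out.println("with TF:    " + encode(docs, freqs, false).length + " bytes");
        System.out.println("without TF: " + encode(docs, freqs, true).length + " bytes");
        System.out.println(Arrays.toString(decode(encode(docs, freqs, true), 3, true)[0]));
    }
}
```

      In this toy layout the saving grows with the number of postings whose frequency is not 1, since each of those costs at least one extra VInt when TF is kept.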

      Benefits can be expected for fields that are typical candidates for Filters, enumerations, user rights, IDs, or very short "texts" (phone numbers, zip codes, names...).

      Status: just passed the standard tests (compatibility); committed for early review. I have not tried the new feature yet, and some asserts and one or two unit tests are still missing.

      Complexity: simpler than expected

      Can be used via omitTf() (whoever has used omitNorms() will know where to find it).

      Attachments

        1. LUCENE-1340.patch
          20 kB
          Eks Dev
        2. LUCENE-1340.patch
          20 kB
          Michael McCandless
        3. LUCENE-1340.patch
          32 kB
          Eks Dev
        4. LUCENE-1340.patch
          34 kB
          Eks Dev
        5. LUCENE-1340.patch
          58 kB
          Michael McCandless
        6. LUCENE-1340.patch
          63 kB
          Michael McCandless
        7. LUCENE-1340.patch
          152 kB
          Michael McCandless

            People

              mikemccand Michael McCandless
              eksdev Eks Dev

                Time Tracking

                  Estimated: 24h
                  Remaining: 24h
                  Logged: Not Specified