Lucene - Core
  1. Lucene - Core
  2. LUCENE-5337

Add Payload support to FileDictionary (Suggest) and make it more configurable

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      It would be nice to add payload support to FileDictionary, so user can pass in associated payload with suggestion entries.
      Currently the FileDictionary has a hard-coded field-delimiter (TAB), it would be nice to let the users configure the field delimiter as well.

      1. LUCENE-5337.patch
        15 kB
        Erick Erickson
      2. LUCENE-5337.patch
        15 kB
        Areek Zillur
      3. LUCENE-5337.patch
        15 kB
        Areek Zillur

        Activity

        Hide
        Areek Zillur added a comment -

        Initial Patch:

        • added payload support for FileDictionary
        • Improved javadocs
        • made field delimiter configurable
        • added tests for FileDictionary
        Show
        Areek Zillur added a comment - Initial Patch: added payload support for FileDictionary Improved javadocs made field delimiter configurable added tests for FileDictionary
        Hide
        Erick Erickson added a comment -

        Areek:

        There are a couple of problems with this patch...

        1> It won't compile in 4x since it uses some Java 7 constructs,
        I stopped at the "diamond" bit. Unless this is intended for trunk only, could you fix these?

        2> on trunk, running "ant precommit" shows the following errors. These are pretty easy to fix, just takes specifying the UTF-8 charset as I remember. They're all in the test code, but still....

        [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.7
        [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.7
        [forbidden-apis] Reading API signatures: /Users/Erick/apache/trunk_5337/lucene/tools/forbiddenApis/base.txt
        [forbidden-apis] Loading classes to check...
        [forbidden-apis] Scanning for API signatures and dependencies...
        [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset]
        [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:76)
        [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset]
        [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:98)
        [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset]
        [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:120)
        [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset]
        [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:146)
        [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset]
        [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:173)
        [forbidden-apis] Scanned 179 (and 405 related) class file(s) for forbidden API invocations (in 0.10s), 5 error(s).

        I can take care of the secretarial stuff here and get this committed. I glanced over the code but don't know the area deeply to make any deeper comments, anyone want to chime in on that score?

        Show
        Erick Erickson added a comment - Areek: There are a couple of problems with this patch... 1> It won't compile in 4x since it uses some Java 7 constructs, I stopped at the "diamond" bit. Unless this is intended for trunk only, could you fix these? 2> on trunk, running "ant precommit" shows the following errors. These are pretty easy to fix, just takes specifying the UTF-8 charset as I remember. They're all in the test code, but still.... [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.7 [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.7 [forbidden-apis] Reading API signatures: /Users/Erick/apache/trunk_5337/lucene/tools/forbiddenApis/base.txt [forbidden-apis] Loading classes to check... [forbidden-apis] Scanning for API signatures and dependencies... [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset] [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:76) [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset] [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:98) [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset] [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:120) [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset] [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:146) [forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses default charset] [forbidden-apis] in org.apache.lucene.search.suggest.FileDictionaryTest (FileDictionaryTest.java:173) [forbidden-apis] Scanned 179 (and 405 related) class file(s) for forbidden API invocations (in 0.10s), 5 error(s). I can take care of the secretarial stuff here and get this committed. I glanced over the code but don't know the area deeply to make any deeper comments, anyone want to chime in on that score?
        Hide
        Robert Muir added a comment -

        1> It won't compile in 4x since it uses some Java 7 constructs,
        I stopped at the "diamond" bit. Unless this is intended for trunk only, could you fix these?

        Erick: FYI trunk is on java7. So java7 syntax is actually welcome there and good to use. Its the committers job to remove such syntax when/if backporting to some branch that doesnt support java7.

        Show
        Robert Muir added a comment - 1> It won't compile in 4x since it uses some Java 7 constructs, I stopped at the "diamond" bit. Unless this is intended for trunk only, could you fix these? Erick: FYI trunk is on java7. So java7 syntax is actually welcome there and good to use. Its the committers job to remove such syntax when/if backporting to some branch that doesnt support java7.
        Hide
        Erick Erickson added a comment -

        Robert:

        OK, I didn't realize that was the case, I was guessing that there'd be some
        kind of cutover point, probably where we decided to move development pretty
        much to trunk and start backporting fewer JIRAs...

        That said, things like diamonds etc. are trivial, I'm quite willing to do
        those. More complicated things and I'll be willing to do on a case-by-case
        basis, depending probably on how adventurous I'm feeling at the time and
        how complex the code rearrangement looks.

        Erick

        Show
        Erick Erickson added a comment - Robert: OK, I didn't realize that was the case, I was guessing that there'd be some kind of cutover point, probably where we decided to move development pretty much to trunk and start backporting fewer JIRAs... That said, things like diamonds etc. are trivial, I'm quite willing to do those. More complicated things and I'll be willing to do on a case-by-case basis, depending probably on how adventurous I'm feeling at the time and how complex the code rearrangement looks. Erick
        Hide
        Areek Zillur added a comment -

        Updated Patch:

        • minor changes to fix forbidden api checks and documentation lint

        Thanks Erick and Robert for the review. I updated the patch so that it will pass the validations. I still have the diamond operators in the test code, let me know if there is anything I can do about that.

        Show
        Areek Zillur added a comment - Updated Patch: minor changes to fix forbidden api checks and documentation lint Thanks Erick and Robert for the review. I updated the patch so that it will pass the validations. I still have the diamond operators in the test code, let me know if there is anything I can do about that.
        Hide
        Erick Erickson added a comment -

        Comment-only change so precommit succeeds (Javadoc).

        So this cleanly precommits and tests on my machine. Unless there are objections, I'll commit this Wednesday or so, after 4.6 is tagged, I don't see a good reason to rush this into the 4.6 release.

        Show
        Erick Erickson added a comment - Comment-only change so precommit succeeds (Javadoc). So this cleanly precommits and tests on my machine. Unless there are objections, I'll commit this Wednesday or so, after 4.6 is tagged, I don't see a good reason to rush this into the 4.6 release.
        Hide
        ASF subversion and git services added a comment -

        Commit 1541341 from Erick Erickson in branch 'dev/trunk'
        [ https://svn.apache.org/r1541341 ]

        LUCENE-5337 Add Payload support to FileDictionary (Suggest) and make it more configurable

        Show
        ASF subversion and git services added a comment - Commit 1541341 from Erick Erickson in branch 'dev/trunk' [ https://svn.apache.org/r1541341 ] LUCENE-5337 Add Payload support to FileDictionary (Suggest) and make it more configurable
        Hide
        Robert Muir added a comment -

        Does this one really go in 4.6.0? Seems like this was just a mistake and it should be 4.7.0?

        Note: I created 4.7.0 section of CHANGES already and branch_4x is set for 4.7.0. So i think the CHANGES entry just needs to be moved.

        Show
        Robert Muir added a comment - Does this one really go in 4.6.0? Seems like this was just a mistake and it should be 4.7.0? Note: I created 4.7.0 section of CHANGES already and branch_4x is set for 4.7.0. So i think the CHANGES entry just needs to be moved.
        Hide
        ASF subversion and git services added a comment -

        Commit 1541360 from Erick Erickson in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1541360 ]

        LUCENE-5337 Add Payload support to FileDictionary (Suggest) and make it more configurable

        Show
        ASF subversion and git services added a comment - Commit 1541360 from Erick Erickson in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1541360 ] LUCENE-5337 Add Payload support to FileDictionary (Suggest) and make it more configurable
        Hide
        Erick Erickson added a comment -

        Robert:

        You're right, of course. Thanks!

        I'll fix momentarily.

        Show
        Erick Erickson added a comment - Robert: You're right, of course. Thanks! I'll fix momentarily.
        Hide
        Erick Erickson added a comment -

        Thanks Areek!

        Show
        Erick Erickson added a comment - Thanks Areek!
        Hide
        Areek Zillur added a comment -

        Thanks Erick and Robert for committing this.

        Show
        Areek Zillur added a comment - Thanks Erick and Robert for committing this.
        Hide
        Erick Erickson added a comment -

        Thank you for the effort!

        Show
        Erick Erickson added a comment - Thank you for the effort!

          People

          • Assignee:
            Unassigned
            Reporter:
            Areek Zillur
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development